Snowflake Stored Procedures & UDFs — Expert Interview Questions

AdvancedLast updated: 2026-04-27 • 4 sections

Expert questions on stored procedures, JavaScript/Python/SQL UDFs, caller vs owner rights, transaction management, and security considerations.

Key Facts for Stored Procedures & UDF Interviews

  • Stored procedures: imperative programs, CALL statement, can execute DDL/DML, can modify data.
  • UDFs: declarative, called in SELECT, return one value per row, cannot modify data.
  • UDTFs: return multiple rows per input row, called with TABLE(), used for exploding or inference.
  • Owner rights (default): procedure runs with the owner role privileges. Caller rights: runs with the caller role.
  • Supported languages: SQL, JavaScript, Python, Java, Scala for procedures. SQL, JavaScript, Python, Java for UDFs.
  • Python UDFs and procedures can import third-party packages from Anaconda repository.
  • Snowpark-optimized warehouses: use for Python UDFs/procedures with heavy pandas/ML workloads.

Procedures, UDFs, and UDTFs

Q: What is the difference between a stored procedure, UDF, and UDTF in Snowflake?

Stored Procedure: executes a program with SQL statements, DDL, DML, transaction control. Called with CALL. Can modify data. Use for: ETL orchestration, admin automation, multi-step workflows. UDF (User-Defined Function): takes input, returns one value per row. Called in SELECT. Cannot modify data. Use for: row-level transformations, custom calculations. UDTF (User-Defined Table Function): returns multiple rows per input. Called with TABLE(). Use for: exploding data, ML inference per row group, custom aggregations returning sets.

Q: What is the difference between caller rights and owner rights procedures?

Owner rights (default): the procedure executes with the privileges of the role that OWNS it, regardless of who calls it. Allows controlled data access: grant CALL to a low-privilege role without granting direct table access. Caller rights (EXECUTE AS CALLER): the procedure runs with the calling role's privileges. Use for: utility procedures that should respect the caller's permission boundary. Security risk: owner-rights procedures with broad owner role privileges are a lateral-movement vector — audit all EXECUTE AS OWNER procedures that touch PII.

Q: How do you handle transactions inside a Snowflake stored procedure?

Snowflake procedures support explicit transaction control. BEGIN TRANSACTION; ... COMMIT; or ROLLBACK; wraps a block. In Snowflake Scripting (SQL procedures), use: BEGIN; ... EXCEPTION WHEN ... DO ROLLBACK; END;. In Python/JavaScript: use the execute() API with BEGIN/COMMIT/ROLLBACK strings. Important: DDL statements (CREATE, ALTER, DROP) implicitly commit any open transaction — cannot include DDL in a transactional block. Each stored procedure call runs in its own implicit transaction by default if no explicit transaction is opened.

Q: How do you pass arrays, objects, and complex types to procedures and UDFs?

Use VARIANT as the parameter type for complex inputs. Caller passes: CALL my_proc(PARSE_JSON('[1, 2, 3]')), procedure receives: VARIANT param. For Python UDFs/procedures: VARIANT maps to a Python dict or list automatically via the Snowpark type system. For returning complex types: RETURNS VARIANT and return an OBJECT_CONSTRUCT or Python dict. Typed arrays (ARRAY) and objects (OBJECT) can also be used but VARIANT is more flexible for ad-hoc shapes.

Q: How do you write a Python UDF that uses a third-party package?

In the CREATE FUNCTION statement: add PACKAGES = ('pandas==2.0.0', 'scikit-learn') and set the RUNTIME_VERSION. Packages come from the Snowflake Anaconda channel (curated, security-reviewed list). For packages not in the Anaconda channel: upload the package as a zip to a stage and reference it with IMPORTS. If heavy ML inference is needed (pandas apply, sklearn predict), use a Snowpark-optimized warehouse for significantly better performance.

Security, Performance, and Testing

Q: How do you test stored procedures and UDFs safely in Snowflake?

For procedures: use a separate dev/staging database that mirrors production schema. Wrap procedure calls in an explicit transaction and rollback after testing: BEGIN; CALL my_proc(); SELECT * FROM affected_table; ROLLBACK; For UDFs: test with SELECT my_udf(test_col) FROM test_table LIMIT 100. For Python: develop locally with Snowpark session connected to dev, then deploy to Snowflake. Use DESCRIBE PROCEDURE and SHOW PROCEDURES to inspect deployed code.

Q: What are the security risks of stored procedures and how do you mitigate them?

Risk 1: overly privileged owner-rights procedures — an attacker with CALL privilege gains the owner role capabilities. Mitigation: minimize the owner role's privileges, use a dedicated service role. Risk 2: SQL injection via string concatenation in dynamic SQL. Mitigation: use parameterized queries, never concatenate user input directly into SQL strings. Risk 3: exfiltration via external stages in JavaScript procedures. Mitigation: audit procedures for PUT/GET commands, use network policies to block unauthorized external access.

Q: When should you use a Python procedure vs a Snowpark Python DataFrame job?

Python stored procedure: runs inside Snowflake, no external orchestration needed, called via CALL or from a Task. Best for: automated data transformations triggered by a schedule, admin automation, lightweight ML inference. Snowpark Python script (external): runs in your orchestration layer (Airflow, Glue), connects to Snowflake via the Snowpark session. Best for: complex orchestration with external dependencies, large-scale ML training, workflows that also interact with non-Snowflake systems.

Stored Procedures & UDF Interview Prep Checklist

Frequently Asked Questions

Can Snowflake stored procedures replace a workflow orchestrator like Airflow?

For simple, self-contained data pipelines within Snowflake, yes — Snowflake Tasks + stored procedures can schedule, chain, and conditionally branch workflows without Airflow. For complex scenarios (cross-system dependencies, complex retry logic, external API calls, many-task DAGs with dynamic branching), Airflow is still superior. The practical guideline: if the workflow only touches Snowflake objects and has < 10 steps, Tasks + procedures is simpler. More complex, or cross-system, use Airflow or a dedicated orchestrator.

What is the performance difference between JavaScript and Python UDFs in Snowflake?

JavaScript UDFs: compiled and run natively within the Snowflake query engine, low overhead per row, good for simple row-level transformations. Python UDFs: run in a separate Python sandbox, higher startup overhead, but can leverage vectorized execution with pandas (process batches of rows at once) which is significantly faster for complex operations. For ML inference: Python UDFs are often 10–50× faster than equivalent JavaScript UDFs when using vectorized pandas apply. Always benchmark for your specific workload.

How do you handle errors and logging in Snowflake stored procedures?

Error handling: use TRY/CATCH in JavaScript, try/except in Python, or EXCEPTION WHEN in Snowflake Scripting SQL procedures. Log errors by inserting into a dedicated log table: INSERT INTO procedure_log (procedure_name, error_msg, run_time) VALUES (...). For structured observability, use Snowflake's Event Table + SYSTEM$LOG / SYSTEM$TRACE functions to emit structured log events and spans that appear in the Snowsight Monitoring UI. Event Table logging is the production standard for procedure observability.

Related Cheat Sheets

Top 30 Snowflake Interview Questions & AnswersSnowflake Semi-Structured Data — Expert Interview QuestionsSnowflake Cortex AI & ML — Expert Interview Questions
← All Cheat Sheets