AdvancedLast updated: 2026-04-27 • 4 sections
Expert questions on stored procedures, JavaScript/Python/SQL UDFs, caller vs owner rights, transaction management, and security considerations.
Q: What is the difference between a stored procedure, UDF, and UDTF in Snowflake?
Stored Procedure: executes a program with SQL statements, DDL, DML, transaction control. Called with CALL. Can modify data. Use for: ETL orchestration, admin automation, multi-step workflows. UDF (User-Defined Function): takes input, returns one value per row. Called in SELECT. Cannot modify data. Use for: row-level transformations, custom calculations. UDTF (User-Defined Table Function): returns multiple rows per input. Called with TABLE(). Use for: exploding data, ML inference per row group, custom aggregations returning sets.
Q: What is the difference between caller rights and owner rights procedures?
Owner rights (default): the procedure executes with the privileges of the role that OWNS it, regardless of who calls it. Allows controlled data access: grant CALL to a low-privilege role without granting direct table access. Caller rights (EXECUTE AS CALLER): the procedure runs with the calling role's privileges. Use for: utility procedures that should respect the caller's permission boundary. Security risk: owner-rights procedures with broad owner role privileges are a lateral-movement vector — audit all EXECUTE AS OWNER procedures that touch PII.
Q: How do you handle transactions inside a Snowflake stored procedure?
Snowflake procedures support explicit transaction control. BEGIN TRANSACTION; ... COMMIT; or ROLLBACK; wraps a block. In Snowflake Scripting (SQL procedures), use: BEGIN; ... EXCEPTION WHEN ... DO ROLLBACK; END;. In Python/JavaScript: use the execute() API with BEGIN/COMMIT/ROLLBACK strings. Important: DDL statements (CREATE, ALTER, DROP) implicitly commit any open transaction — cannot include DDL in a transactional block. Each stored procedure call runs in its own implicit transaction by default if no explicit transaction is opened.
Q: How do you pass arrays, objects, and complex types to procedures and UDFs?
Use VARIANT as the parameter type for complex inputs. Caller passes: CALL my_proc(PARSE_JSON('[1, 2, 3]')), procedure receives: VARIANT param. For Python UDFs/procedures: VARIANT maps to a Python dict or list automatically via the Snowpark type system. For returning complex types: RETURNS VARIANT and return an OBJECT_CONSTRUCT or Python dict. Typed arrays (ARRAY) and objects (OBJECT) can also be used but VARIANT is more flexible for ad-hoc shapes.
Q: How do you write a Python UDF that uses a third-party package?
In the CREATE FUNCTION statement: add PACKAGES = ('pandas==2.0.0', 'scikit-learn') and set the RUNTIME_VERSION. Packages come from the Snowflake Anaconda channel (curated, security-reviewed list). For packages not in the Anaconda channel: upload the package as a zip to a stage and reference it with IMPORTS. If heavy ML inference is needed (pandas apply, sklearn predict), use a Snowpark-optimized warehouse for significantly better performance.
Q: How do you test stored procedures and UDFs safely in Snowflake?
For procedures: use a separate dev/staging database that mirrors production schema. Wrap procedure calls in an explicit transaction and rollback after testing: BEGIN; CALL my_proc(); SELECT * FROM affected_table; ROLLBACK; For UDFs: test with SELECT my_udf(test_col) FROM test_table LIMIT 100. For Python: develop locally with Snowpark session connected to dev, then deploy to Snowflake. Use DESCRIBE PROCEDURE and SHOW PROCEDURES to inspect deployed code.
Q: What are the security risks of stored procedures and how do you mitigate them?
Risk 1: overly privileged owner-rights procedures — an attacker with CALL privilege gains the owner role capabilities. Mitigation: minimize the owner role's privileges, use a dedicated service role. Risk 2: SQL injection via string concatenation in dynamic SQL. Mitigation: use parameterized queries, never concatenate user input directly into SQL strings. Risk 3: exfiltration via external stages in JavaScript procedures. Mitigation: audit procedures for PUT/GET commands, use network policies to block unauthorized external access.
Q: When should you use a Python procedure vs a Snowpark Python DataFrame job?
Python stored procedure: runs inside Snowflake, no external orchestration needed, called via CALL or from a Task. Best for: automated data transformations triggered by a schedule, admin automation, lightweight ML inference. Snowpark Python script (external): runs in your orchestration layer (Airflow, Glue), connects to Snowflake via the Snowpark session. Best for: complex orchestration with external dependencies, large-scale ML training, workflows that also interact with non-Snowflake systems.
For simple, self-contained data pipelines within Snowflake, yes — Snowflake Tasks + stored procedures can schedule, chain, and conditionally branch workflows without Airflow. For complex scenarios (cross-system dependencies, complex retry logic, external API calls, many-task DAGs with dynamic branching), Airflow is still superior. The practical guideline: if the workflow only touches Snowflake objects and has < 10 steps, Tasks + procedures is simpler. More complex, or cross-system, use Airflow or a dedicated orchestrator.
JavaScript UDFs: compiled and run natively within the Snowflake query engine, low overhead per row, good for simple row-level transformations. Python UDFs: run in a separate Python sandbox, higher startup overhead, but can leverage vectorized execution with pandas (process batches of rows at once) which is significantly faster for complex operations. For ML inference: Python UDFs are often 10–50× faster than equivalent JavaScript UDFs when using vectorized pandas apply. Always benchmark for your specific workload.
Error handling: use TRY/CATCH in JavaScript, try/except in Python, or EXCEPTION WHEN in Snowflake Scripting SQL procedures. Log errors by inserting into a dedicated log table: INSERT INTO procedure_log (procedure_name, error_msg, run_time) VALUES (...). For structured observability, use Snowflake's Event Table + SYSTEM$LOG / SYSTEM$TRACE functions to emit structured log events and spans that appear in the Snowsight Monitoring UI. Event Table logging is the production standard for procedure observability.