Snowflake Query Tuning — Expert Interview Questions

AdvancedLast updated: 2026-04-27 • 4 sections

Expert questions on micro-partition pruning, clustering keys, search optimization, query profiling, spilling to disk, and the three caching layers.

Key Performance Facts for Interviews

  • Three cache layers: Result cache (24hr, free), Local disk cache (warehouse SSD), Remote storage (base tables).
  • Micro-partitions: immutable columnar files 50–500MB uncompressed; pruning is driven by min/max metadata per column.
  • CLUSTER BY co-locates rows with similar values; background reclustering runs as a serverless feature.
  • Use CLUSTERING_INFORMATION to check depth and decide if clustering is warranted.
  • Bytes spilled to local disk: 2–5× slowdown. Bytes spilled to remote storage: 10–50× slowdown.
  • Search Optimization Service (SOS): point lookups and equality filters on high-cardinality columns; not a replacement for clustering.
  • QUERY_HISTORY + Query Profile (plan tree) + EXPLAIN are the three diagnostic tools.

Pruning, Clustering, and Micro-Partitions

Q: Explain Snowflake micro-partition architecture and its performance implications.

Snowflake stores data in immutable columnar files (micro-partitions), 50–500MB uncompressed. Each partition stores per-column metadata: min, max, distinct count, null count. At query execution, Snowflake checks the WHERE clause against metadata and skips partitions where no rows can match — partition pruning. Tables loaded in chronological order naturally prune well on date columns. Randomly ordered tables prune poorly and benefit from CLUSTER BY.

Q: How do you choose a clustering key and decide when clustering is warranted?

Choose: the columns most frequently used in WHERE and JOIN predicates, low-to-medium cardinality first, maximum 3–4 columns. Apply when: table > 1TB, partition scan > 50% of total partitions on common queries, CLUSTERING_INFORMATION shows depth > 5–10. Skip when: table < 100GB, query patterns are too varied, table has high write churn (reclustering runs constantly). Check cost in AUTOMATIC_CLUSTERING_HISTORY — should be < savings in query credits.

Q: A query scans 500GB but returns 100 rows with 0% pruning. How do you fix it?

Diagnosis: run CLUSTERING_INFORMATION — if depth is high on the filter column, add CLUSTER BY. If filter column is high-cardinality (user_id, order_id), CLUSTER BY is ineffective — use Search Optimization Service for point lookups instead. For join-heavy queries: cluster both tables on the join key. Quick fix: materialize a pre-filtered subset as a materialized view or intermediate table. Check if a LIMIT can be added if this is exploratory.

Q: Explain spilling to local disk vs remote storage and how to fix it.

When intermediate results exceed warehouse memory: Snowflake first spills to local SSD (2–5× slower), then to remote cloud storage (10–50× slower). Identify via Query Profile — "Bytes Spilled to Local Storage" and "Bytes Spilled to Remote Storage" nodes. Fix strategies: scale up one warehouse size (doubles memory per node), reduce result set via earlier filtering, add CLUSTER BY on join keys to reduce join fanout, use LIMIT on sorts. Remote spill > 0 means severely memory-constrained — always scale up first.

Q: When does Search Optimization Service outperform CLUSTER BY?

SOS builds inverted indexes on specific columns for point lookups (WHERE id = X) and substring searches. Outperforms clustering for: high-cardinality columns (user IDs, order IDs) where clustering has low effectiveness, ad-hoc lookups on varied predicates, and equality/IN filters. Clustering wins for: range scans (date ranges), sequential reads, sort-merge joins. SOS is per-column serverless billing — analyze QUERY_HISTORY to confirm query patterns before enabling.

Caching, Profiling, and Advanced Optimization

Q: Describe Snowflake's three cache layers and how to maximize cache benefit.

Layer 1 — Result cache: identical query + unchanged underlying data = free instant result, 24hr TTL. Maximize: ensure downstream queries are exact text matches. Layer 2 — Local disk cache (warehouse SSD): recently scanned micro-partitions cached in SSD. Warm cache = faster repeated queries on same warehouse. Maximize: avoid suspending the warehouse between similar workloads (BI dashboards). Layer 3 — Remote storage: base data in S3/Azure/GCS. Always slowest. Minimize hits by combining result cache + partition pruning.

Q: How do you use Query Profile to diagnose a slow query?

Open Query Profile in Snowsight (or SYSTEM$EXPLAIN_PLAN_JSON). Key nodes to check: TableScan with "Partitions scanned >> Partitions total" → pruning issue. Sort with "Bytes Spilled" → memory pressure. HashJoin with large "Bytes passed by filter = 0" → no pruning on join key. "Compilation" node dominating time → complex plan, check for view chains or deeply nested CTEs. Always profile on the same warehouse size where production runs.

Q: How do materialized views affect query performance and what are their limitations?

Materialized views (MVs) pre-compute and store results; Snowflake auto-maintains them when the base table changes via background serverless refresh. Performance benefit: queries that match the MV definition are transparently rewritten to scan the MV instead. Limitations: MVs cannot include joins (single-table only), GROUP BY is limited, no HAVING clause, no external tables. Cost: serverless refresh credits. Best for: aggregations over large append-only tables, frequently queried summaries with predictable patterns.

Query Performance Interview Prep Checklist

Frequently Asked Questions

What is the first thing you check when a Snowflake query suddenly slows down?

Check Query Profile for the slow run vs a previous fast run. Compare: partitions scanned (did a table grow and break pruning?), bytes spilled (did data volume increase past memory capacity?), compilation time (did a view chain get more complex?). Then check QUERY_HISTORY for the prior 7 days to see if this is a gradual trend or sudden spike. Common causes: table growth breaking clustering depth, a code change adding a new join, or a warehouse size reduction.

How does Snowflake handle query concurrency and what are the performance implications?

Each warehouse has a concurrency limit based on size. When concurrency is exceeded, queries queue. Performance implications: queuing adds latency to all queries, not just the heavy ones. Fix options: (1) Multi-cluster warehouse to scale horizontally for concurrent workloads. (2) Separate warehouses for different workload types (ETL vs BI vs ad-hoc). (3) Use query tags and resource monitors to identify the workloads filling the queue. (4) Scale up the warehouse if individual queries are slow, not just queued.

When should you use a larger warehouse vs a smaller warehouse for cost efficiency?

Larger warehouse is more cost-efficient when: the query spills (extra runtime cost of spilling exceeds the size upgrade cost), the job has a fixed window (XL finishes in 30s vs M in 4min — same or lower total credits), or when query compilation is the bottleneck. Smaller warehouse is more efficient when: queries are fast and do not spill, workloads are sequential (parallelism does not help), or the warehouse is often idle between queries. The breakeven: if doubling size halves the runtime, cost is equal — go larger if spilling is involved.

Related Cheat Sheets

Top 30 Snowflake Interview Questions & AnswersSnowflake Cost Optimization — Expert Interview QuestionsSnowflake Semi-Structured Data — Expert Interview Questions
← All Cheat Sheets