Snowflake Cortex AI matured significantly between 2023 and 2026, expanding from simple LLM functions to a comprehensive AI platform with AISQL, Cortex Search, Cortex Analyst, Document AI, and Agents. As adoption accelerates, controlling costs becomes critical, not because Cortex is expensive, but because its pricing model differs fundamentally from traditional Snowflake compute.

This guide breaks down how Snowflake charges for Cortex, compares pricing models, works through illustrative cost scenarios, and shares optimization strategies, all based on rates current as of January 2026.


What is Snowflake Cortex AI? (2026 Overview)

Snowflake Cortex AI is a suite of generative AI capabilities built directly into Snowflake. Instead of exporting data to external APIs, you can invoke LLM functions, embeddings, search, and agents directly in SQL, keeping data within Snowflake’s security perimeter while reducing latency and complexity.

The key difference from traditional Snowflake compute: Cortex usage is metered by tokens processed, not by how long a warehouse runs. The bill is still denominated in Snowflake credits.


How Does Snowflake Cortex Charge You? (2026 Pricing Model)

Token-Based Pricing Fundamentals

Snowflake Cortex uses token-based billing for most services. A token represents approximately:

  • 4 characters of text
  • 0.75 words
  • Therefore: 1,000-word document ≈ 1,300-1,500 tokens
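As a rough illustration of these rules of thumb, the Python sketch below estimates token counts from character and word counts. The function name and sample text are hypothetical, and real tokenizers vary by model, so treat the result as a ballpark figure only.

```python
def estimate_tokens(text: str) -> dict:
    """Rough token estimates based on the two rules of thumb above."""
    words = len(text.split())
    chars = len(text)
    return {
        "by_chars": round(chars / 4),     # ~4 characters per token
        "by_words": round(words / 0.75),  # ~0.75 words per token
    }

# Hypothetical sample text
sample = "Snowflake Cortex bills by tokens, so estimate before you run."
print(estimate_tokens(sample))
```

The two estimates usually differ a little. Taking the larger one gives a more conservative budget.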

Pricing structure:

  • Input tokens: Charged when you send text to the model
  • Output tokens: Charged for model-generated responses
  • Rates vary by model: Small models cost less; large models cost more

Conversion to dollars:

  • Token cost converts to Snowflake credits
  • 1 credit = $3-4 depending on contract terms
  • Small model: ~0.0001-0.0005 credits per 1,000 tokens
  • Mid-tier model: ~0.0005-0.002 credits per 1,000 tokens
  • Large model: ~0.003-0.01+ credits per 1,000 tokens
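Because the dollar value of a credit depends on your contract, it can help to carry a range rather than a single number. The sketch below is a minimal illustration. The function name is hypothetical and the $3-4 bounds are simply the range quoted above.

```python
def credits_to_usd_range(credits: float,
                         low_usd_per_credit: float = 3.0,
                         high_usd_per_credit: float = 4.0) -> tuple:
    """Return the (low, high) dollar cost for a given credit amount."""
    return (credits * low_usd_per_credit, credits * high_usd_per_credit)

# Hypothetical example: a job that consumed 12.5 credits
low, high = credits_to_usd_range(12.5)
print(f"${low:.2f} to ${high:.2f}")
```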

AISQL Functions: The Core Cortex Services

AISQL functions let you call AI models directly in SQL. These are the most commonly used Cortex features.

What Are the Available AISQL Functions?

Available functions include AI_COMPLETE, AI_CLASSIFY, AI_FILTER, AI_AGG, AI_EMBED, AI_EXTRACT, AI_SENTIMENT, AI_SIMILARITY, AI_TRANSCRIBE, AI_PARSE_DOCUMENT, AI_REDACT, and AI_TRANSLATE.


AI_SENTIMENT: Analyzing Emotional Tone

How Does AI_SENTIMENT Work?

AI_SENTIMENT analyzes text and returns sentiment classification.

Real SQL example:

sql

SELECT 
  review_id,
  review_text,
  SNOWFLAKE.CORTEX.AI_SENTIMENT(review_text) as sentiment  -- returns categorical labels, not a numeric score
FROM product_reviews
WHERE review_date >= CURRENT_DATE - 30;

Cost profile:

  • Input tokens: Review text (avg 120 tokens)
  • Output tokens: Sentiment value (2-3 tokens)
  • Total per row: ~125 tokens

Cost by volume (using Llama 3.1 8B, smallest model):

Volume | Monthly Cost
10,000 reviews | ~$0.30
100,000 reviews | ~$3.00
1,000,000 reviews | ~$30.00

Why sentiment is cost-efficient: High input-to-output ratio. You send large amounts of text but receive minimal response.


AI_EXTRACT: Pulling Structured Data

What Does AI_EXTRACT Do?

Extracts specific structured information from unstructured text.

Real SQL example:

sql

SELECT 
  ticket_id,
  email_body,
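  SNOWFLAKE.CORTEX.AI_EXTRACT(
    text => email_body,
    -- responseFormat maps each output field to the question that fills it
    responseFormat => {
      'issue': 'What issue is the customer reporting?',
      'resolution_requested': 'What resolution is the customer requesting?',
      'priority': 'What is the priority level?'
    }
  ) as extracted_fields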
FROM support_tickets
WHERE status = 'unresolved';

Cost profile:

  • Input tokens: Unstructured text (avg 350 tokens)
  • Output tokens: Extracted data (50-100 tokens)
  • Total per call: ~425 tokens

Cost by volume (using Snowflake Arctic, mid-tier):

Volume | Monthly Cost
1,000 extractions | ~$0.51
10,000 extractions | ~$5.10
100,000 extractions | ~$51.00

Key insight: Extraction provides excellent token efficiency—you’re converting unstructured data into structured format without massive output expansion.


AI_COMPLETE: General Text Generation

When Do You Use AI_COMPLETE?

Generates new text based on prompts—the most expensive function due to output token generation.

Real SQL example:

sql

SELECT 
  review_id,
  SNOWFLAKE.CORTEX.AI_COMPLETE(
    'mistral-large',
    'Write a 2-sentence response to this customer feedback: ' || feedback_text
  ) as generated_response
FROM customer_feedback
WHERE rating < 3;

Cost profile:

  • Input tokens: Prompt + context (avg 180 tokens)
  • Output tokens: Generated text (varies by request, 30-150 tokens)
  • Total per call: ~210-330 tokens

Cost by output length (using Mistral Large, premium model):

Output Length | Per Call | 10,000 Calls/Month
30 tokens (2 sentences) | $0.0015 | $15.00
100 tokens (1 paragraph) | $0.0034 | $34.00
250 tokens (1 page) | $0.0081 | $81.00

Critical factor: Output length directly multiplies costs. Requesting brief, specific responses is essential.
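To see why output length matters, the sketch below prices a call with separate input and output rates and varies only the output length. The rates are placeholders chosen for illustration, not Snowflake list prices, so the printed figures will not match the table above.

```python
def call_cost_usd(input_tokens: int, output_tokens: int,
                  input_usd_per_1k: float, output_usd_per_1k: float) -> float:
    """Cost of one call when input and output tokens are priced separately."""
    return (input_tokens * input_usd_per_1k
            + output_tokens * output_usd_per_1k) / 1000

# Placeholder rates in USD per 1K tokens (assumed, for illustration only)
IN_RATE, OUT_RATE = 0.003, 0.009

# Same 180-token prompt, three different output lengths
for out_tokens in (30, 100, 250):
    per_call = call_cost_usd(180, out_tokens, IN_RATE, OUT_RATE)
    print(out_tokens, round(per_call, 5), round(per_call * 10_000, 2))
```

With the input held constant, every extra output token adds the same increment to each call, so the monthly bill grows linearly with response length.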


AI_CLASSIFY: Multi-Label Text Classification

How Does AI_CLASSIFY Work?

Categorizes text into predefined classes.

Real SQL example:

sql

SELECT 
  ticket_id,
  description,
  SNOWFLAKE.CORTEX.AI_CLASSIFY(
    description,
    ['billing', 'technical', 'account', 'refund', 'other']  -- categories are passed as an array
  ) as category
FROM support_tickets;

Cost profile:

  • Input tokens: Text content (avg 200 tokens)
  • Output tokens: Category label (1-5 tokens)
  • Total per call: ~205 tokens

Cost by volume (using Llama 3.1 8B):

Volume | Monthly Cost
10,000 classifications | ~$0.61
100,000 classifications | ~$6.10

Why it’s cheap: Classification is low-computation with minimal output.


AI_EMBED: Generating Vector Embeddings

What are Embeddings Used For?

Creates numerical vector representations for semantic similarity and retrieval-augmented generation (RAG).

Real SQL example:

sql

SELECT 
  doc_id,
  SNOWFLAKE.CORTEX.AI_EMBED(
    'snowflake-arctic-embed-m-v2',
    document_text
  ) as embedding_vector
FROM documents;

Cost profile:

  • Input tokens: Document text (charged once per document)
  • Output: Vector representation (no charge)
  • Total: Input tokens only

Cost by volume (model-dependent, ~0.05 credits/million tokens):

Volume | Document Avg | Monthly Cost
1,000 docs | 500 tokens | ~$0.08
10,000 docs | 1,000 tokens | ~$1.50
100,000 docs | 2,000 tokens | ~$30.00

Important: Embeddings are one-time cost per document. Reusing embeddings for multiple searches eliminates re-embedding charges.
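The reuse idea can be sketched as a content-keyed cache. In Snowflake you would normally persist vectors in a table, so this Python version only illustrates the principle. The class, the `fake_embed` stand-in, and the sample documents are all hypothetical.

```python
import hashlib

class EmbeddingCache:
    """Embed each distinct text once and reuse the stored vector afterwards."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.billable_calls = 0  # how many times the embed function was invoked

    def get(self, text: str):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            self._store[key] = self._embed_fn(text)
            self.billable_calls += 1
        return self._store[key]

def fake_embed(text: str):
    # Stand-in for a real embedding call; returns a dummy one-element vector
    return [float(len(text))]

cache = EmbeddingCache(fake_embed)
for doc in ["alpha", "beta", "alpha"]:
    cache.get(doc)
print(cache.billable_calls)
```

Keying on a hash of the content rather than the document ID means an edited document is re-embedded, while an unchanged one never is.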


AI_TRANSLATE: Language Translation

How Does AI_TRANSLATE Perform?

Translates text between languages while preserving meaning.

Real SQL example:

sql

SELECT 
  message_id,
  original_message,
  SNOWFLAKE.CORTEX.AI_TRANSLATE(
    original_message,
    'en',  -- source language
    'es'   -- target language (Spanish)
  ) as translated_message
FROM user_messages
WHERE language_code = 'en';

Cost profile:

  • Input tokens: Original message (avg 80 tokens)
  • Output tokens: Translated text (similar length, ~80 tokens)
  • Total per call: ~160 tokens

Cost by volume (using Llama 3.1 8B):

Volume | Monthly Cost
10,000 translations | ~$0.48
100,000 translations | ~$4.80
1,000,000 translations | ~$48.00

Why translation is efficient: Input-to-output ratio is 1:1. You’re not generating new content, just converting existing content.


How Does Cortex Search Pricing Work?

Cortex Search has a different cost structure than AISQL functions.

Cost components:

  1. Embedding/Indexing:
    • One-time cost to create search index
    • Example: 10M rows × 500 tokens × 0.05 credits/million = 250 credits (~$750)
  2. Serving Cost (Ongoing):
    • Per GB of index maintained
    • Example: 50GB index × 6.3 credits/GB/month = 315 credits (~$945/month)
  3. Storage:
    • Standard Snowflake rates (~$23/TB/month)
    • Example: 50GB = $1.15/month

Cost summary for this example:

  • Initial setup: $750 (one-time)
  • Ongoing monthly: ~$946 ($945 serving + $1.15 storage)
  • Ongoing annual: ~$11,352 (12 × $946, excluding setup)
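The arithmetic behind this example can be reproduced with a short Python sketch. The function name is hypothetical, the default rates are the ones quoted in the example, and the $3 per credit figure is the value the example's dollar amounts imply. Substitute your own contract rate.

```python
def cortex_search_cost(rows: int, tokens_per_row: int, index_gb: float,
                       embed_credits_per_million: float = 0.05,
                       serving_credits_per_gb_month: float = 6.3,
                       storage_usd_per_tb_month: float = 23.0,
                       usd_per_credit: float = 3.0) -> dict:
    """Split a Cortex Search estimate into one-time setup and monthly cost."""
    setup_credits = rows * tokens_per_row / 1_000_000 * embed_credits_per_million
    serving_credits = index_gb * serving_credits_per_gb_month
    storage_usd = index_gb / 1000 * storage_usd_per_tb_month
    return {
        "setup_credits": setup_credits,
        "setup_usd": setup_credits * usd_per_credit,
        "serving_credits_per_month": serving_credits,
        "monthly_usd": serving_credits * usd_per_credit + storage_usd,
    }

# The example above: 10M rows, 500 tokens each, 50GB index
estimate = cortex_search_cost(rows=10_000_000, tokens_per_row=500, index_gb=50)
for name, value in estimate.items():
    print(name, round(value, 2))
```

Separating setup from recurring cost makes it clear that the serving charge, not the initial embedding, dominates the bill after the first month.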

When Cortex Search makes sense: Large document collections where semantic search provides business value justifying the cost.


Cortex Analyst: Natural Language to SQL

How is Cortex Analyst Priced?

Fixed cost per natural language question.

Pricing:

  • 6.7 credits per 100 messages
  • 1 message = 1 natural language question
  • Only successful responses charged (HTTP 200)

Cost examples:

Questions | Monthly Cost
100 | $20
1,000 | $201
10,000 | $2,010

Key point: The per-message cost is fixed. Executing the generated SQL is billed separately as ordinary warehouse compute, which depends on warehouse size and query run time.
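A small Python sketch shows how the fixed message charge and the separate warehouse charge combine. The function name and the `warehouse_credits` parameter are hypothetical, and the $3 per credit default is the value the table's figures imply.

```python
def analyst_monthly_cost_usd(questions: int,
                             warehouse_credits: float = 0.0,
                             credits_per_100_messages: float = 6.7,
                             usd_per_credit: float = 3.0) -> float:
    """Cortex Analyst message charges plus any warehouse credits for the queries."""
    message_credits = questions / 100 * credits_per_100_messages
    return (message_credits + warehouse_credits) * usd_per_credit

# Message charges only, matching the table rows
for questions in (100, 1_000, 10_000):
    print(questions, round(analyst_monthly_cost_usd(questions), 2))

# Same 1,000 questions plus an assumed 5 warehouse credits of query execution
print(round(analyst_monthly_cost_usd(1_000, warehouse_credits=5.0), 2))
```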


Real-World Cost Scenarios (2026)

Scenario 1: E-Commerce Sentiment Analysis

Setup: 200,000 product reviews/month

sql

SELECT 
  review_id,
  SNOWFLAKE.CORTEX.AI_SENTIMENT(review_text) as sentiment,  -- AI_SENTIMENT takes no model argument
  SNOWFLAKE.CORTEX.AI_EXTRACT(
    text => review_text,
    responseFormat => {'issue': 'What is the main product issue?'}
  ) as issue
FROM reviews;

Cost breakdown:

  • Sentiment: 200,000 × 120 tokens × Llama rate = $6.00/month
  • Extraction (50,000-review subset): 50,000 × 300 tokens × Arctic rate (~$0.0012 per 1K tokens) = $18.00/month
  • Total: $24.00/month ($288/year)

Compared to alternatives:

  • Third-party sentiment API: $500-1,000/month
  • Internal ML infrastructure: $5,000-15,000/month
  • Cortex advantage: roughly 95%+ cost savings

Scenario 2: Support Ticket Automation

Setup: 5,000 tickets/month

sql

SELECT 
  ticket_id,
  SNOWFLAKE.CORTEX.AI_CLASSIFY(description, ['billing', 'technical', 'account', 'refund', 'other']) as category,
  SNOWFLAKE.CORTEX.AI_EXTRACT(
    text => description,
    responseFormat => {'issue': 'What is the issue?', 'resolution': 'What resolution is requested?'}
  ) as details,
  SNOWFLAKE.CORTEX.AI_COMPLETE('mistral-large', 'Draft response: ' || description) as response
FROM tickets;

Cost breakdown:

Function | Volume | Tokens/Call | Model | Cost
Classification | 5,000 | 150 | Llama | $0.23
Extraction | 5,000 | 300 | Arctic | $1.80
Response Gen | 2,500 | 200 | Mistral | $1.80
Total | | | | $3.83/month

Annual cost: $45.96


Scenario 3: Document Processing

Setup: 500 PDFs/month (avg 3,000 tokens each)

sql

SELECT 
  doc_id,
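  -- relative_path is an assumed column holding each file's path on the stage
  SNOWFLAKE.CORTEX.AI_PARSE_DOCUMENT(TO_FILE('@stage', relative_path), {'mode': 'LAYOUT'}) as parsed_content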
FROM documents;

Cost breakdown:

  • 500 docs × 3,000 tokens × Arctic rate (~$0.0012 per 1K tokens) = $1.80/month
  • Annual cost: $21.60

FAQ: Answering Common Cost Questions

What’s the difference between AISQL and Cortex Search costs?

AISQL functions charge per token processed (input + output), while Cortex Search charges for embedding tokens during creation and ongoing serving costs per GB of index maintained. AISQL is cheaper for casual use; Cortex Search makes sense for high-volume semantic search.


Which model should I choose to minimize costs?

Model choice is your biggest cost lever (10x variation possible):

Use Llama 3.1 8B for:

  • Sentiment analysis
  • Basic classification
  • Simple extraction
  • Any routine task

Cost: 80% cheaper than premium models

Quality: Excellent for classification/routine tasks

Use Arctic for:

  • Complex extractions
  • Entity recognition
  • Moderate-complexity analysis
  • Conversational responses

Cost: 60% cheaper than premium

Quality: Excellent overall performance

Use premium (GPT-4, Claude Opus) only for:

  • Complex reasoning
  • Code generation
  • Nuanced analysis requiring explanations
  • Real-time conversational systems

Example: Sentiment analysis works equally well with Llama ($3/month for 100k reviews) vs. Claude ($60/month for same work). Same business outcome, 20x cost difference.


How do I estimate costs before processing large volumes?

Step-by-step approach:

  1. Sample your data:

sql

SELECT 
  -- COUNT_TOKENS needs the model name, since tokenization differs by model
  SNOWFLAKE.CORTEX.COUNT_TOKENS('llama3.1-8b', your_column) as token_count
FROM your_table
LIMIT 1000;

  2. Calculate average tokens:

sql

SELECT 
  AVG(token_count) as avg_tokens,
  COUNT(*) as sample_size
FROM (
  SELECT SNOWFLAKE.CORTEX.COUNT_TOKENS('llama3.1-8b', your_column) as token_count
  FROM your_table
  LIMIT 1000
);

  3. Estimate total cost:

Total tokens = estimated_rows × avg_tokens_per_row
Cost = (Total tokens / 1,000,000) × credits_per_million × price_per_credit
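The same formula can be scripted once you have exported a sample of token counts. This is a minimal Python sketch. The function name, the sample counts, and the 0.1 credits per million rate are placeholders, so look up the real rate for your chosen model.

```python
from statistics import mean

def estimate_job_cost(sample_token_counts, total_rows: int,
                      credits_per_million: float, usd_per_credit: float) -> dict:
    """Extrapolate total cost from a sample of per-row token counts."""
    avg_tokens = mean(sample_token_counts)
    total_tokens = total_rows * avg_tokens
    credits = total_tokens / 1_000_000 * credits_per_million
    return {
        "avg_tokens": avg_tokens,
        "total_tokens": total_tokens,
        "credits": credits,
        "usd": credits * usd_per_credit,
    }

# Hypothetical sample of token counts and placeholder rates
result = estimate_job_cost([100, 120, 140], total_rows=1_000_000,
                           credits_per_million=0.1, usd_per_credit=3.0)
print(result)
```

A sample of around 1,000 rows is usually enough to get a stable average, provided the sample is representative of the full table.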

Can I monitor Cortex spending in real-time?

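Close to it, using official Snowflake views:

Snowflake provides the CORTEX_FUNCTIONS_USAGE_HISTORY view, which aggregates token and credit consumption by function, model, and hour. Like other ACCOUNT_USAGE views, it is populated with some delay, so treat it as a recent history rather than a live feed.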

sql

SELECT 
  DATE_TRUNC('day', START_TIME) as day,
  FUNCTION_NAME,
  MODEL_NAME,
  SUM(TOKENS) as total_tokens,
  SUM(TOKEN_CREDITS) as total_credits,
  ROUND(SUM(TOKEN_CREDITS) * 3.5, 2) as estimated_cost  -- 3.5 = assumed USD per credit; use your contract rate
FROM SNOWFLAKE.ACCOUNT_USAGE.CORTEX_FUNCTIONS_USAGE_HISTORY
WHERE START_TIME >= CURRENT_DATE - 30
GROUP BY DATE_TRUNC('day', START_TIME), FUNCTION_NAME, MODEL_NAME
ORDER BY day DESC;
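Once the query results are exported, a simple budget check can flag expensive days. The Python sketch below assumes each row is a (day, function, model, credits) tuple. The function name, the row layout, and the sample values are hypothetical.

```python
from collections import defaultdict

def daily_overages(rows, daily_budget_credits: float) -> dict:
    """Return the days whose summed credits exceed the budget."""
    totals = defaultdict(float)
    for day, _function_name, _model_name, credits in rows:
        totals[day] += credits
    return {day: total for day, total in totals.items()
            if total > daily_budget_credits}

# Hypothetical usage rows
usage_rows = [
    ("2026-01-05", "AI_COMPLETE", "mistral-large", 4.0),
    ("2026-01-05", "AI_EXTRACT", "arctic", 2.5),
    ("2026-01-06", "AI_CLASSIFY", "llama3.1-8b", 1.0),
]
print(daily_overages(usage_rows, daily_budget_credits=5.0))
```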

Is Cortex cheaper than OpenAI API?

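At the list prices below, yes. Keep in mind that the table compares models of different capability tiers, so it is not a like-for-like benchmark: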

Provider | Input Cost | Output Cost | Advantage
OpenAI GPT-4 | $0.03/1K tokens | $0.06/1K tokens | Baseline
Mistral Large (via API) | $0.003/1K tokens | $0.009/1K tokens | 10x cheaper
Snowflake Arctic | $0.0012/1K tokens | $0.0036/1K tokens | 25x cheaper
Snowflake Llama 3.1 | $0.0005/1K tokens | $0.0015/1K tokens | 40x cheaper

Plus: no separate API authentication, no data leaving Snowflake, and no third-party rate limits to manage.


When NOT to Use Cortex Functions

Avoid Cortex for String Matching

sql

-- DON'T DO THIS (costs money)
SELECT SNOWFLAKE.CORTEX.AI_CLASSIFY(
  email_body,
  ['mentions refund', 'does not mention refund']
)
FROM support_tickets;

-- DO THIS (no Cortex charges)
SELECT CASE 
  WHEN email_body ILIKE '%refund%' THEN 'Yes' 
  ELSE 'No' 
END
FROM support_tickets;

Avoid Cortex for Structured Lookups

sql

-- DON'T DO THIS (costs money)
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'mistral-large',
  'What is customer name for ID 12345?'
);

-- DO THIS (free)
SELECT name FROM customers WHERE id = 12345;

Avoid Cortex for Deterministic Operations

sql

-- DON'T DO THIS (costs money)
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'mistral-large',
  'Convert 01/15/2026 from MM/DD/YYYY to YYYY-MM-DD'
);

-- DO THIS (free)
SELECT TO_DATE('01/15/2026', 'MM/DD/YYYY');

Cost Optimization Best Practices

Optimization 1: Model Selection by Task

Choose the smallest model that works:

sql

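-- AI_SENTIMENT has no model parameter, so the model swap is shown with AI_COMPLETE.
-- BEFORE: Sentiment labeling with a premium model
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'claude-3-5-sonnet',  -- premium tier
  'Label this review as positive, negative, or neutral: ' || review_text
) as sentiment
FROM product_reviews;

-- AFTER: Same prompt with a budget model
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'llama3.1-8b',  -- smallest model; validate accuracy on a sample first
  'Label this review as positive, negative, or neutral: ' || review_text
) as sentiment
FROM product_reviews;

Result: Roughly 80% cost reduction, with comparable accuracy on simple classification tasks.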


Optimization 2: Aggressive Caching

Don’t recompute results:

sql

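-- A plain table is used here so each row is scored exactly once.
-- (A dynamic table would also need TARGET_LAG and WAREHOUSE, and may re-score rows on refresh.)
-- SENTIMENT returns a numeric score from -1 to 1, which the filter below relies on.
CREATE TABLE cached_sentiments AS
SELECT 
  review_id,
  SNOWFLAKE.CORTEX.SENTIMENT(review_text) as sentiment,
  CURRENT_TIMESTAMP as processed_at
FROM product_reviews
WHERE created_date >= CURRENT_DATE - 30;

-- Query cache instead of recomputing
SELECT * FROM cached_sentiments
WHERE sentiment < -0.5;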

Result: 95%+ cost reduction for repeated queries.


Optimization 3: Output Length Constraints

sql

-- BEFORE: Vague request (long output)
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'mistral-large',
  'Summarize this: ' || document_text
);
-- Average output: 300 tokens

-- AFTER: Specific constraint (short output)
SELECT SNOWFLAKE.CORTEX.AI_COMPLETE(
  'mistral-large',
  'Summarize in exactly 3 bullet points: ' || document_text
);
-- Average output: 50 tokens

Result: 80-85% reduction in output tokens.


Optimization 4: Batch Processing

sql

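-- Process all at once (low overhead)
CREATE OR REPLACE TASK process_batch_daily
  WAREHOUSE = compute_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  -- Persist results; a bare SELECT inside a task would discard them.
  -- sentiment_results is an assumed target table.
  INSERT INTO sentiment_results
  SELECT text, SNOWFLAKE.CORTEX.AI_SENTIMENT(text)
  FROM data_queue
  WHERE processed = false;

-- Tasks are created suspended, so resume to start the schedule
ALTER TASK process_batch_daily RESUME;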

Result: 15-20% reduction in compute overhead.


Optimization 5: Input Data Cleaning

sql

-- Clean data before processing
CREATE FUNCTION clean_text(raw_text VARCHAR)
RETURNS VARCHAR
AS
$$
  SELECT REGEXP_REPLACE(
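    -- Backslashes are doubled because they sit inside single-quoted strings
    REGEXP_REPLACE(raw_text, '\\[[^]]*\\]', ''),  -- Remove [bracketed] metadata
    '\\n\\n+', ' '  -- Collapse runs of blank lines into a space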
  )
$$;

-- Process clean data only
SELECT SNOWFLAKE.CORTEX.AI_SENTIMENT(clean_text(messy_input))
FROM raw_data;

Result: 30-50% reduction in input tokens.
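To gauge the savings before touching production data, the same cleaning rules can be tried offline. The Python sketch below mirrors the two replacements above using the standard `re` module. The sample text is hypothetical, and character count is used only as a rough proxy for tokens.

```python
import re

def clean_text(raw_text: str) -> str:
    """Strip [bracketed] metadata and collapse runs of blank lines."""
    without_metadata = re.sub(r"\[.*?\]", "", raw_text)
    return re.sub(r"\n\n+", " ", without_metadata)

# Hypothetical messy input
raw = "[ticket=42][lang=en] App crashes on login.\n\n\nPlease help."
cleaned = clean_text(raw)

# Character count as a rough stand-in for token count
reduction = 1 - len(cleaned) / len(raw)
print(repr(cleaned))
print(f"{reduction:.0%} fewer characters")
```

Running this over a sample of real rows gives a quick sense of whether cleaning is worth adding to the pipeline.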


Key Takeaways

  1. Cortex is metered per token, not per warehouse hour – Understanding token consumption is critical
  2. Model selection is the biggest cost lever – 10-40x cost variation possible
  3. AISQL functions are affordable – Most use cases cost $10-100/month
  4. Cortex Search is expensive – Only use if semantic search is core business need
  5. Monitoring is essential – Use CORTEX_FUNCTIONS_USAGE_HISTORY to track spend
  6. Optimization opportunities exist – Caching, batching, model selection dramatically reduce costs
  7. Not all tasks need Cortex – Use SQL/regex for deterministic operations
  8. Cortex is 10-40x cheaper than alternatives – Exceptional ROI compared to third-party APIs

Next Steps

For developers starting with Cortex:

  1. Run a small pilot with 1% of target data
  2. Test multiple models to find optimal cost/quality balance
  3. Establish baseline usage metrics using CORTEX_FUNCTIONS_USAGE_HISTORY
  4. Implement caching for repeated operations
  5. Set up daily cost monitoring before scaling to production

Disclaimer: Pricing current as of January 2026. Rates subject to change. Always verify with official Snowflake documentation for most current pricing.