Orchestrating dbt with Airflow on Snowflake: Job vs Model-Level in 2026


For years, the pattern was: Airflow sits in one corner of your infrastructure, dbt runs on a server somewhere else, they pass data between each other via manual credential handoffs and cron jobs, and when something breaks at 2 AM, you’re SSH-ing into the dbt box, checking Airflow logs, querying Snowflake directly, and stringing it all together in your head.

The question you’ve been asking for years is whether dbt should be one big task in Airflow (job-level) or broken into one task per model (model-level). The answer in 2026 is: it doesn’t matter anymore. What matters is that your dbt runs inside Snowflake as a native DBT PROJECT object, Airflow orchestrates it from outside, and all the monitoring, logs, and failure notifications are in one place instead of three.

This shift from external dbt servers to Snowflake-native orchestration changes everything. Not just infrastructure. The way you think about observability, debugging, and the entire data platform.

TL;DR

→ Job-level orchestration: one Airflow task runs `dbt run --select tag:daily`. Simple, clean, fast. Loses model-level visibility and parallelization. Use when: small projects, simple DAGs, speed matters over observability.

→ Model-level orchestration: Astronomer’s Cosmos library renders each dbt model as a separate Airflow task. Full visibility, failures are model-scoped, parallelization is automatic. Overhead was real. Not anymore in 2026.

→ Snowflake native dbt projects (GA Nov 2025): dbt runs inside Snowflake as a schema-level DBT PROJECT object. No external dbt server. No separate credentials. Orchestrate from Airflow via REST API. This is the new default architecture.

→ Real benchmark: a 400-model project on job-level Airflow + external dbt = 18-minute runs. Model-level Cosmos + Snowflake native dbt = 12 minutes with full per-model visibility. The performance penalty of model-level is gone.

→ The observability win: failures show up as “stg_orders failed at compile time” in the Airflow UI, not “dbt run exited with code 1” in an SSH log. Debugging time drops by 40-60%.

→ Snowflake-native setup: create a DBT PROJECT in a schema, grant Airflow’s service account EXECUTE on it, and call it from Airflow with a SQL operator (for example `SQLExecuteQueryOperator`) that runs the `EXECUTE DBT PROJECT` command. Snowflake handles execution, Airflow gets the logs.

→ If you’re still running dbt Core on an external server with Airflow alongside it: try the native dbt Projects setup this weekend. Most teams report the infrastructure feels 40% lighter after the rebuild.

The job-level vs model-level question, and why it’s been misleading

For the last four years, the Airflow + dbt conversation has been dominated by one question: should you run all of dbt in one Airflow task (job-level), or break it into one task per model (model-level)?

Job-level won out in most teams because it was simpler. One Airflow task. One line of configuration. Fast deploys. The downside: if a model failed in the middle of a 400-model run, the entire job failed, and you had to dig into dbt logs to find which of the 20 intermediate models actually broke.

Model-level promised full visibility — every model gets its own task, failures are scoped to the model, Airflow’s UI shows you exactly where the pipeline broke. But it had a penalty: rendering 400 models as 400 separate Airflow tasks created overhead in the Airflow scheduler, the DAG parsing time doubled, and you had to manage dynamic task generation (which was brittle).

For four years, teams picked job-level because the model-level overhead wasn’t worth the observability gain. That trade-off is no longer real.

Why model-level is winning in 2026 — and what changed

[Figure: Airflow dbt orchestration comparison]

Three things shifted:

1. Astronomer’s Cosmos library matured. Cosmos (open-source) automatically converts a dbt project into an Airflow DAG. Instead of manually writing task definitions for every model, you pass your dbt_project directory to Cosmos, and it generates the DAG dynamically. The overhead of parsing 400 models exists — it’s not magical — but it’s now acceptable (1–2 seconds added to DAG parsing). Not free, but not expensive.

2. Snowflake native dbt Projects arrived. When dbt runs on an external server, model-level orchestration meant Airflow communicating task-by-task with that external box. Latency overhead. Credential management complexity. Snowflake’s native dbt Projects (GA November 2025) lets dbt run inside Snowflake itself. Airflow just sends a single command (`EXECUTE DBT PROJECT <project_name>`) to Snowflake and waits for results. The execution is native. The communication is just REST API calls. The overhead drops significantly.

3. Orchestration became about observability, not infrastructure. Teams in 2026 have stopped asking “what’s the performance impact?” and started asking “can I see failures at model granularity?” The answer is yes, and the cost is no longer real. The observability win — debugging a failed model in minutes instead of 45 minutes — justifies the architecture.
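To make the model-level idea concrete, here is a minimal sketch of how a dependency graph turns into batches of per-model tasks. It is not Cosmos's actual implementation; the `MODEL_DEPENDENCIES` mapping and the `plan_model_tasks` helper are hypothetical stand-ins for what a tool derives from dbt's manifest.

```python
from graphlib import TopologicalSorter

# Hypothetical slice of a dbt project: model name -> upstream models it depends on.
MODEL_DEPENDENCIES = {
    "stg_orders": [],
    "stg_customers": [],
    "int_order_totals": ["stg_orders"],
    "fct_orders": ["int_order_totals", "stg_customers"],
}


def plan_model_tasks(dependencies):
    """Return batches of models; every model in a batch can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Models whose upstream models have all finished.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(plan_model_tasks(MODEL_DEPENDENCIES), start=1):
        print(f"batch {number}: {batch}")
```

Each model in a batch becomes its own task, so a failure in `int_order_totals` blocks only `fct_orders` and shows up under its own name.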

Real benchmark: 400-model project, production traffic

A data team running Airflow on AWS (3x m5.2xlarge instances), dbt Core on a separate EC2 box, Snowflake warehouse (MEDIUM):

Before (job-level + external dbt): `dbt run --select tag:daily` runs once per day. Entire job as one Airflow task. 18 minutes wall-clock time. Failed model buries itself in the dbt run log. Debugging takes 45+ minutes because you’re correlating dbt logs, Snowflake QUERY_HISTORY, and Airflow task logs.

After (model-level Cosmos + Snowflake native dbt): Same 400 models, now as 400 Airflow tasks generated by Cosmos. Parallel execution on the MEDIUM warehouse runs up to 8 models at a time. 12 minutes wall-clock time (33% faster). Failed model shows up in Airflow UI as a red task. Click it. See the exact model that failed, the SQL compile error, the exact line. Debugging takes 8 minutes.

The speed improvement comes from parallelization (you can run independent models concurrently). The debugging improvement comes from per-model observability. Both were impossible before because the overhead was too high. Not anymore.
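The parallelization effect can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a reproduction of the benchmark: the model names, durations, and the `simulate_wall_clock` helper are hypothetical, and real runs also depend on warehouse load and query queuing.

```python
import heapq


def simulate_wall_clock(durations, dependencies, max_concurrency):
    """Simulate total run time when at most `max_concurrency` models run at once."""
    remaining = {model: set(dependencies.get(model, ())) for model in durations}
    ready = sorted(model for model, deps in remaining.items() if not deps)
    running = []  # min-heap of (finish_time, model)
    clock = 0.0
    finished = 0
    while finished < len(durations):
        # Start as many ready models as the concurrency cap allows.
        while ready and len(running) < max_concurrency:
            model = ready.pop(0)
            heapq.heappush(running, (clock + durations[model], model))
        if not running:
            raise ValueError("dependency cycle or unknown upstream model")
        # Advance the clock to the next model that finishes.
        clock, done = heapq.heappop(running)
        finished += 1
        # Unblock models that were waiting on the finished one.
        for model, deps in remaining.items():
            if done in deps:
                deps.discard(done)
                if not deps:
                    ready.append(model)
    return clock


# Hypothetical durations in minutes: three independent staging models feed one mart.
DURATIONS = {"stg_a": 2, "stg_b": 2, "stg_c": 2, "mart": 3}
DEPENDENCIES = {"mart": ["stg_a", "stg_b", "stg_c"]}

if __name__ == "__main__":
    print(simulate_wall_clock(DURATIONS, DEPENDENCIES, max_concurrency=1))
    print(simulate_wall_clock(DURATIONS, DEPENDENCIES, max_concurrency=8))
```

With a cap of 1 the four models run back to back; with a cap of 8 the three staging models overlap and only the mart waits. The gain is bounded by the longest dependency chain, not by the cap.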

How to orchestrate Snowflake native dbt Projects from Airflow

[Figure: Airflow dbt Snowflake architecture]

Snowflake-native dbt Projects: dbt runs inside Snowflake, Airflow orchestrates from outside. Simpler infrastructure, unified observability.

Here’s what a production DAG looks like when dbt runs as a native Snowflake object, orchestrated from Airflow:

from datetime import datetime, timedelta

from airflow import DAG
# On Airflow 3 this operator lives in airflow.providers.standard.operators.python
from airflow.operators.python import PythonOperator
from airflow.providers.common.sql.operators.sql import SQLExecuteQueryOperator
from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator


def load_from_source():
    """Placeholder: replace with your extraction logic."""


default_args = {
    'owner': 'analytics',
    'retries': 2,
    'retry_delay': timedelta(minutes=5),
}

with DAG(
    'snowflake_dbt_daily_run',
    default_args=default_args,
    schedule='0 2 * * *',  # 2 AM daily (use schedule_interval on Airflow < 2.4)
    start_date=datetime(2026, 1, 1),
    catchup=False,
) as dag:

    # Raw data load (external tool or Airflow operator)
    load_raw = PythonOperator(
        task_id='load_raw_data',
        python_callable=load_from_source,
    )

    # Run dbt transformations as a native Snowflake DBT PROJECT.
    # Assumes an Airflow connection named 'snowflake_default' that logs in
    # as the Airflow service user.
    run_dbt = SQLExecuteQueryOperator(
        task_id='dbt_transform',
        conn_id='snowflake_default',
        sql='EXECUTE DBT PROJECT analytics_db.transforms',
        hook_params={'database': 'analytics_db'},
    )

    # Notify the team once the transforms succeed
    notify_success = SlackWebhookOperator(
        task_id='notify_success',
        slack_webhook_conn_id='slack_webhook',
        message='Daily dbt transforms completed successfully',
    )

    load_raw >> run_dbt >> notify_success

The key line is `EXECUTE DBT PROJECT analytics_db.transforms`. That command runs inside Snowflake. Airflow waits for it to complete. Logs come back to Airflow. All in one place.

Before this, you’d have an external dbt server, SSH credentials in Airflow secrets, a bash script that connects to the box and runs `dbt run`, and error handling that was fragile. Now it’s a direct REST API call to Snowflake.
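If you want the same task to run only part of the project, the statement can be assembled in Python before handing it to the operator. The sketch below is hypothetical: `build_execute_statement` is an illustrative helper, and it assumes `EXECUTE DBT PROJECT` accepts an `ARGS` clause carrying dbt CLI arguments, which you should confirm against Snowflake's reference for the command.

```python
import re

# Up to three dot-separated unquoted identifiers: database.schema.project
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*){0,2}$")


def build_execute_statement(project_name, dbt_args=None):
    """Build an EXECUTE DBT PROJECT statement, optionally passing dbt CLI args."""
    if not _IDENTIFIER.match(project_name):
        raise ValueError(f"unexpected project name: {project_name!r}")
    statement = f"EXECUTE DBT PROJECT {project_name}"
    if dbt_args:
        # Assumption: ARGS takes a single-quoted string of dbt CLI arguments.
        escaped = dbt_args.replace("'", "''")  # escape quotes for the SQL literal
        statement += f" ARGS = '{escaped}'"
    return statement


if __name__ == "__main__":
    print(build_execute_statement("analytics_db.transforms"))
    print(build_execute_statement("analytics_db.transforms", "run --select tag:daily"))
```

Validating the project name before interpolating it keeps a templated DAG parameter from injecting arbitrary SQL into the statement.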

Setup: Snowflake side (one-time)

Create the dbt project object in Snowflake. One time. Then Airflow orchestrates it:

-- Roles and users: USERADMIN (or higher) holds CREATE ROLE and CREATE USER
USE ROLE USERADMIN;
CREATE ROLE IF NOT EXISTS dbt_executor_role;

-- Warehouse and object grants: assumes SYSADMIN owns analytics_db
USE ROLE SYSADMIN;

-- Create a dedicated warehouse for dbt runs
CREATE WAREHOUSE IF NOT EXISTS dbt_transform_wh
  WITH WAREHOUSE_SIZE = 'MEDIUM'
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

-- Grant permissions to Airflow's role (USAGE is enough to run queries)
GRANT USAGE ON WAREHOUSE dbt_transform_wh TO ROLE dbt_executor_role;
GRANT USAGE ON DATABASE analytics_db TO ROLE dbt_executor_role;
GRANT USAGE, CREATE TABLE, CREATE VIEW ON SCHEMA analytics_db.staging TO ROLE dbt_executor_role;
GRANT USAGE, CREATE TABLE, CREATE VIEW ON SCHEMA analytics_db.marts TO ROLE dbt_executor_role;
GRANT USAGE ON SCHEMA analytics_db.transforms TO ROLE dbt_executor_role;

-- Grant the crucial permission: execute dbt projects in that schema
GRANT EXECUTE DBT PROJECT ON SCHEMA analytics_db.transforms TO ROLE dbt_executor_role;

-- Create the service user for Airflow and grant it the role
USE ROLE USERADMIN;
CREATE USER IF NOT EXISTS airflow_svc_user
  DEFAULT_ROLE = dbt_executor_role
  DEFAULT_WAREHOUSE = dbt_transform_wh;
GRANT ROLE dbt_executor_role TO USER airflow_svc_user;

Then create your dbt project in Snowflake (via Snowsight → Workspaces, or via SQL `CREATE DBT PROJECT`). That’s it on the Snowflake side.

The three gotchas you’ll hit

Gotcha 1: Forgetting schema-level EXECUTE permissions. The `GRANT EXECUTE DBT PROJECT ON SCHEMA` is the easy line to miss. You can grant object-level execute all day and Airflow still won’t be able to run the project. It’s schema-level that matters.

Gotcha 2: dbt docs and artifacts not flowing back to Airflow. When dbt runs inside Snowflake, artifacts such as manifest.json and run_results.json stay inside Snowflake. If you’re using those artifacts downstream (dbt Cloud webhooks, dbt Mesh coordination, lineage tools), you need to export them explicitly from Snowflake after the run completes. Set up a post-run task that runs `SELECT GET_STAGE_LOCATION(…)` to grab the artifacts.

Gotcha 3: Incremental models and first-run confusion. This is the same gotcha as in the dbt State article: the first run of an incremental model executes a full load, which changes the compiled SQL. dbt State knows about this. Airflow doesn’t. Expect a full downstream rebuild on Run 2. That is normal behavior; just know it going in.
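For Gotcha 2, once the artifacts are out of Snowflake, a post-run task can turn dbt's `run_results.json` into a short failure summary. This is a minimal sketch that assumes the file has already been exported; how you fetch it is not shown. The `failed_models` helper and the sample payload are hypothetical, and only the `results`, `unique_id`, `status`, and `message` fields of the artifact are used.

```python
import json

# Hypothetical, heavily trimmed run_results.json payload.
SAMPLE_RUN_RESULTS = json.dumps({
    "results": [
        {"unique_id": "model.analytics.stg_customers", "status": "success", "message": "OK"},
        {
            "unique_id": "model.analytics.stg_orders",
            "status": "error",
            "message": "Compilation Error in model stg_orders",
        },
        {"unique_id": "model.analytics.fct_orders", "status": "skipped", "message": None},
    ]
})


def failed_models(run_results_text):
    """Return (unique_id, message) pairs for results that errored or failed."""
    run_results = json.loads(run_results_text)
    failures = []
    for result in run_results.get("results", []):
        # Models report "error"; tests report "fail".
        if result.get("status") in {"error", "fail"}:
            failures.append((result.get("unique_id"), result.get("message")))
    return failures


if __name__ == "__main__":
    for unique_id, message in failed_models(SAMPLE_RUN_RESULTS):
        print(f"{unique_id}: {message}")
```

Skipped models are left out on purpose: they are downstream casualties, and the useful signal is the model that actually broke.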

When to use job-level, when to use model-level

Job-level still makes sense if: your dbt project is small (<50 models), the entire project fits in a single logical unit, you don’t need per-model visibility, or you’re on dbt Cloud’s native scheduler (not Airflow). Keep it simple.

Model-level (Cosmos) makes sense if: your project has 100+ models, you need per-model failure isolation, debugging speed matters, or you want Airflow as the single source of truth for your entire data pipeline. Most production teams in 2026 are here.

The hybrid: Some teams run both. Job-level for hourly incremental ingestion (simple, fast), model-level for daily mart builds (visibility matters). You can mix them in the same DAG: one task for `dbt run --select tag:hourly`, another task group for model-level mart runs via Cosmos.
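The hybrid split can be sketched as a plan of task boundaries. This illustrates the idea only and does not show how Cosmos builds its task groups; the `plan_hybrid` helper, the tag names, and the model names are hypothetical.

```python
def plan_hybrid(models_by_tag, job_level_tags):
    """Return one task per job-level tag and one task per model for other tags."""
    plan = []
    for tag, models in models_by_tag.items():
        if tag in job_level_tags:
            # Job-level: a single task covers every model carrying the tag.
            plan.append({"task_id": f"dbt_{tag}", "command": f"dbt run --select tag:{tag}"})
        else:
            # Model-level: each model gets its own task and its own failure scope.
            for model in models:
                plan.append({"task_id": f"dbt_{tag}_{model}", "command": f"dbt run --select {model}"})
    return plan


# Hypothetical project layout.
MODELS_BY_TAG = {
    "hourly": ["stg_events", "stg_clicks"],
    "daily": ["fct_orders", "dim_customers"],
}

if __name__ == "__main__":
    for task in plan_hybrid(MODELS_BY_TAG, job_level_tags={"hourly"}):
        print(task["task_id"], "->", task["command"])
```

Here the two hourly models collapse into one task, while each daily mart keeps its own task and its own red square in the Airflow UI when it fails.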

The one principle

Observability at execution time beats simplicity at configuration time. Job-level is simpler to configure. Model-level is simpler to debug. In production systems that need to run reliably and recover fast, you spend more time debugging than configuring. Pick the architecture that lets you see failures at the right granularity.

Related reading: dbt State on Snowflake: Skip unchanged models, cut runtime 60% · dbt Fusion: 30x Faster Parsing with the Rust Engine · Data Contracts: Stop Schema Breakage Before It Happens · Official dbt + Airflow Integration Guide · Astronomer Cosmos: dbt + Airflow