The Star Schema is the most widely used data modeling pattern in data warehousing. Named for its star-like shape, it places a central fact table (containing measurable events) surrounded by dimension tables (containing descriptive attributes).
Star Schema Structure
```
ββββββββββββ
β dim_date β
ββββββ¬ββββββ
β
ββββββββββββ β βββββββββββββββ
βdim_productββββ fact_sales ββββ€ dim_customer β
ββββββββββββ β βββββββββββββββ
β
ββββββ΄βββββββ
β dim_store β
βββββββββββββ
Key Components
Fact Tables
Store measurable, quantitative data (events that happened):
- fact_sales
: revenue, quantity, discount
- fact_orders: order_amount, shipping_cost
- fact_web_events: page_views, session_durationTypes of facts:
- Additive: Can be summed across all dimensions (revenue)
- Semi-additive: Can be summed across some dimensions (account balance)
- Non-additive: Cannot be summed (ratios, percentages)
Dimension Tables
Store descriptive, categorical data (context for facts):
- dim_customer: name, email, segment, region
- dim_product: name, category, brand, price_tier
- dim_date: date, month, quarter, year, is_holidayStar Schema Example
`sql
-- "Total revenue by product category and quarter"
SELECT
d.quarter,
p.category,
SUM(f.revenue) AS total_revenue
FROM fact_sales f
JOIN dim_date d ON f.date_key = d.date_key
JOIN dim_product p ON f.product_key = p.product_key
GROUP BY d.quarter, p.category
ORDER BY d.quarter, total_revenue DESC;
``Star Schema vs Snowflake Schema
| Aspect | Star Schema | Snowflake Schema |
|--------|------------|------------------|
| Dimension Tables | Denormalized (flat) | Normalized (split into sub-tables) |
| Query Performance | Faster (fewer joins) | Slower (more joins) |
| Storage | More (redundant data) | Less (normalized) |
| Complexity | Simple to understand | More complex relationships |
| Best For | BI/Analytics queries | Storage-constrained environments |
Best Practices
1. Use Surrogate Keys: Integer keys in facts, not natural keys
2. Denormalize Dimensions: Flatten hierarchies for query speed
3. Add a Date Dimension: Always include a dedicated date table
4. Keep Facts Narrow: Only keys and measures in fact tables
5. Use Conformed Dimensions: Share dimension tables across fact tables