Data Modeling

Data Modeling

The process of creating a visual representation of data structures and relationships, defining how data is stored, organized, and accessed in databases and warehouses.

Data modeling is the process of creating a visual representation of how data is structured, stored, and accessed in a system. It defines entities, attributes, relationships, and constraints—serving as the blueprint for databases and data warehouses.

Types of Data Models

1. Conceptual Model

High-level business view:
- Entities and relationships
- No technical details
- Understood by business stakeholders

2. Logical Model

Detailed structure without platform specifics:
- Attributes and data types
- Primary and foreign keys
- Normalization rules

3. Physical Model

Implementation-ready design:
- Table definitions with exact data types
- Indexes and partitions
- Platform-specific optimizations

Data Modeling Techniques

Dimensional Modeling (Kimball)

Optimized for analytics and BI:
- Fact Tables: Metrics and measurements
- Dimension Tables: Descriptive attributes
- Star Schema: Fact table surrounded by dimensions
- Snowflake Schema: Normalized dimensions

Data Vault

Scalable, auditable modeling:
- Hubs: Business keys
- Links: Relationships
- Satellites: Descriptive data with history

One Big Table (OBT)

Modern, denormalized approach:
- Single wide table with all data
- Optimized for cloud warehouses
- Simple, fast queries

Best Practices

1. Start with Business Questions: Model for how data will be used
2. Document Everything: Maintain a data dictionary
3. Think About Change: Design for schema evolution
4. Test with Real Queries: Validate performance early
5. Use dbt for Implementation: Transform raw to modeled data

Common Mistakes

- Over-normalization for analytical workloads
- Ignoring query performance during design
- Not involving business stakeholders
- Skipping documentation

Key Points

Blueprint for database and warehouse design
Three levels: Conceptual, Logical, Physical
Dimensional modeling (Kimball) popular for analytics
Data Vault provides scalability and auditability
Modern trend: simpler, denormalized "One Big Table"

Frequently Asked Questions

What is data modeling in simple terms?

Data modeling is designing how data is organized and stored. It defines what data exists (entities), what properties it has (attributes), and how pieces relate to each other (relationships).

What are the types of data models?

The three main types are: Conceptual (business view), Logical (detailed structure), and Physical (implementation-ready). Additionally, there are methodologies like Dimensional Modeling and Data Vault.

What is a star schema?

A star schema is a dimensional modeling design where a central fact table (containing metrics) is surrounded by dimension tables (containing descriptive attributes). It resembles a star shape and is optimized for BI queries.

What tools are used for data modeling?

Popular tools include dbt (for transformation modeling), Erwin, Lucidchart, dbdiagram.io, SqlDBM, and Draw.io. In modern stacks, dbt is often used to define and document models as code.

← Back to Glossary

Last updated: 2026-01-21

Published by

Sainath Reddy

Data Engineer at Anblicks

🎯 4+ years experience

About Me → LinkedIn