Introduction to Database Design
Good database design is the foundation of efficient, maintainable applications. Proper design ensures data integrity, optimizes performance, and simplifies application development. Poor design leads to data inconsistencies, performance problems, and difficult maintenance.
This guide covers relational database design principles, normalization, entity relationships, data modeling techniques, and when to denormalize for performance. For database fundamentals, see our Databases guide.
Data Modeling Process
1. Requirements Analysis
Understand business requirements, identify entities (things of interest), attributes (properties), and relationships. Document data flow, access patterns, and constraints.
2. Conceptual Design
Create Entity-Relationship Diagrams (ERDs) showing entities, attributes, and relationships. Focus on what data you need, not implementation details.
3. Logical Design
Transform conceptual model into relational schema. Define tables, columns, data types, primary keys, foreign keys, and constraints. Normalize the design.
4. Physical Design
Optimize for performance: add indexes, consider denormalization, partition tables if needed, and optimize storage. For indexing strategies, see our Database Indexing guide.
Normalization
Normalization reduces data redundancy and prevents anomalies. Normal forms (1NF, 2NF, 3NF, BCNF) progressively eliminate redundancy.
First Normal Form (1NF)
Each column contains atomic values. No repeating groups or arrays. Each row is unique.
Second Normal Form (2NF)
1NF plus all non-key attributes fully depend on the primary key. Eliminate partial dependencies.
Third Normal Form (3NF)
2NF plus no transitive dependencies. Non-key attributes don't depend on other non-key attributes.
Entity Relationships
Relationship Types
One-to-One (1:1)
Each entity instance relates to exactly one instance of another entity. Example: User and UserProfile.
One-to-Many (1:N)
One entity instance relates to many instances of another entity. Example: User and Orders.
-- One-to-Many relationship
CREATE TABLE users (
id PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE orders (
id PRIMARY KEY,
user_id REFERENCES users(id), -- Foreign key
total DECIMAL(10,2)
);
Many-to-Many (M:N)
Many instances relate to many instances. Requires a junction/join table. Example: Students and Courses.
-- Many-to-Many relationship
CREATE TABLE students (
id PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE courses (
id PRIMARY KEY,
title VARCHAR(100)
);
CREATE TABLE enrollments ( -- Junction table
student_id REFERENCES students(id),
course_id REFERENCES courses(id),
PRIMARY KEY (student_id, course_id)
);
Keys and Constraints
Primary Key
Uniquely identifies each row. Can be single column or composite. Should be immutable and minimal.
Foreign Key
References primary key in another table. Enforces referential integrity. Should be indexed for performance.
Unique Constraint
Ensures column values are unique. Can be on single or multiple columns (composite unique constraint).
Check Constraints
Enforce domain rules. Example: age >= 0, status IN ('active', 'inactive').
Denormalization
While normalization reduces redundancy, sometimes denormalization improves performance. Denormalize strategically when:
- Read performance is critical and writes are infrequent
- Joins are expensive and data changes rarely
- You need to optimize specific query patterns
- Data warehouse or reporting scenarios
Common denormalization techniques include storing computed values, duplicating frequently accessed columns, and creating summary tables. For microservices, see our APIs & Microservices guide on database-per-service patterns.
Design Best Practices
Naming Conventions
Use clear, consistent naming. Table names should be plural nouns (users, orders). Column names should be descriptive. Use consistent prefixes/suffixes for related tables.
Data Types
Choose appropriate data types. Use smallest sufficient type. Use DATE/TIME types, not strings. Use DECIMAL for money, not FLOAT. For database selection, see our Databases guide.
NULL Handling
Decide carefully which columns allow NULL. NULL means "unknown" not "empty". Use NOT NULL constraints where appropriate. Consider default values instead of NULL.
Audit Fields
Include created_at, updated_at, created_by, updated_by for auditing. Use database triggers or application logic to maintain these fields automatically.
Soft Deletes
Consider soft deletes (deleted_at flag) instead of hard deletes for important data. Allows recovery and audit trails. Filter deleted records in queries.
Design Patterns
Single Table Inheritance
Store related entities in one table with a type discriminator column. Good for similar entities with shared attributes.
Class Table Inheritance
Separate table for each entity type with shared base table. More normalized but requires joins.
Polymorphic Associations
One foreign key can reference multiple table types. Use sparingly - can complicate queries and constraints.