SQL Fundamentals
Database Concepts
Relational databases store data in structured tables connected through relationships. Understanding these core concepts is essential for effective database design.
- Tables β Organized collections of related data with rows and columns
- Keys β Primary and foreign keys that uniquely identify and link records
- Constraints β Rules that enforce data integrity and prevent invalid data
Good database design is the foundation of reliable applications.
What is a Database?
DfRelational Database
An organized collection of structured data stored in tables (relations) that are connected through defined relationships. Data is accessed and managed using SQL, with relationships enforced through keys and constraints.
Think of a database as a digital filing system β each table is a folder, each row is a document, and each column is a field on that document.
Database Architecture
Tables: The Foundation
DfTable
A two-dimensional data structure organized into rows (records) and columns (fields). Each column has a defined data type, and each row represents a single entity instance.
Table: employees
+----+-------------+-----------+--------+-------------+
| id | first_name | last_name | salary | department |
+----+-------------+-----------+--------+-------------+
| 1 | Alice | Johnson | 75000 | Engineering |
| 2 | Bob | Smith | 62000 | Marketing |
| 3 | Carol | Williams | 85000 | Engineering |
+----+-------------+-----------+--------+-------------+
| Term | Meaning | Example |
|---|---|---|
| Table | Collection of related data | employees |
| Row / Record | A single entry | Alice Johnson's data |
| Column / Field | A single attribute | salary |
| Cell | Intersection of row and column | 75000 |
Primary Keys
DfPrimary Key
A column or set of columns that uniquely identifies each row in a table. Every table should have a primary key to ensure each record can be uniquely referenced. Primary keys cannot contain NULL values.
-- Single column primary key (most common)
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
-- Auto-incrementing primary key
CREATE TABLE customers (
id INTEGER PRIMARY KEY AUTO_INCREMENT, -- MySQL
-- id SERIAL PRIMARY KEY, -- PostgreSQL
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
Primary Key Rules
| Rule | Description | Example |
|---|---|---|
| Uniqueness | Every value must be different | Two customers can't have id=1 |
| NOT NULL | Cannot be empty | Every row must have an id |
| Immutability | Should never change | Don't update primary keys |
| Simplicity | Use simple types | INTEGER or UUID preferred |
Foreign Keys
DfForeign Key
A column that creates a link between two tables by referencing the primary key of another table. Foreign keys enforce referential integrity, ensuring that relationships between tables remain consistent.
-- Create parent table first
CREATE TABLE customers (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
email TEXT UNIQUE NOT NULL
);
-- Create child table with foreign key
CREATE TABLE orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER,
order_date DATE DEFAULT CURRENT_DATE,
total DECIMAL(10,2),
FOREIGN KEY (customer_id) REFERENCES customers(id)
);
Foreign Key Actions
| Action | What Happens When Parent is Deleted | Use Case |
|---|---|---|
ON DELETE CASCADE | Child rows are also deleted | Order items when order deleted |
ON DELETE SET NULL | Foreign key is set to NULL | Optional relationships |
ON DELETE RESTRICT | Prevents deletion if children exist | Critical relationships |
-- Example with multiple foreign key actions
CREATE TABLE order_items (
id INTEGER PRIMARY KEY,
order_id INTEGER NOT NULL,
product_id INTEGER NOT NULL,
quantity INTEGER DEFAULT 1,
FOREIGN KEY (order_id) REFERENCES orders(id)
ON DELETE CASCADE
ON UPDATE CASCADE,
FOREIGN KEY (product_id) REFERENCES products(id)
ON DELETE RESTRICT
ON UPDATE CASCADE
);
Choose foreign key actions carefully. CASCADE can delete large amounts of data unintentionally. Always consider the business impact before setting ON DELETE CASCADE.
Entity Relationships
| Relationship | Description | Example |
|---|---|---|
| One-to-One | One record in A maps to one in B | User β Profile |
| One-to-Many | One record in A maps to many in B | Customer β Orders |
| Many-to-Many | Many records in A map to many in B | Students β Courses |
Constraints Deep Dive
DfConstraints
Rules applied to columns that enforce data integrity. Constraints ensure that data follows business rules and prevent invalid data from being inserted or modified.
CREATE TABLE products (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL,
sku VARCHAR(10) UNIQUE NOT NULL,
price DECIMAL(10,2) CHECK (price > 0),
stock INTEGER DEFAULT 0,
category TEXT NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT uq_product_category UNIQUE (name, category)
);
| Constraint | Description | Performance Impact |
|---|---|---|
NOT NULL | Column cannot have NULL value | Minimal |
UNIQUE | All values must be different | Creates index |
PRIMARY KEY | Unique identifier for each row | Creates clustered index |
FOREIGN KEY | Links to another table's primary key | Index recommended |
CHECK | Ensures values meet a condition | Evaluated on INSERT/UPDATE |
DEFAULT | Sets default value if none provided | No impact |
Indexes: Performance Optimization
DfIndex
A data structure that improves the speed of data retrieval operations. Indexes work similarly to a book's index β they allow the database to find data without scanning every row.
-- Create index on frequently queried columns
CREATE INDEX idx_customers_email ON customers(email);
-- Create composite index (for multi-column queries)
CREATE INDEX idx_orders_customer_date
ON orders(customer_id, order_date);
-- Unique index (enforces uniqueness + performance)
CREATE UNIQUE INDEX idx_customers_email_unique
ON customers(email);
Database Normalization
DfNormalization
The process of organizing data to reduce redundancy and improve data integrity. Normalization involves decomposing tables into smaller, well-structured tables linked by relationships.
Normal Forms
| Normal Form | Rule | Purpose |
|---|---|---|
| 1NF | Each column contains atomic values; no repeating groups | Eliminate redundancy |
| 2NF | Already in 1NF; no partial dependencies on composite PK | Eliminate partial redundancy |
| 3NF | Already in 2NF; no transitive dependencies | Eliminate indirect redundancy |
-- BAD: Violates 1NF (repeating group)
CREATE TABLE orders_bad (
id INTEGER PRIMARY KEY,
customer_name TEXT,
products TEXT -- "Laptop,Mouse,Keyboard" - multiple values!
);
-- GOOD: Follows 1NF (atomic values)
CREATE TABLE orders_good (
id INTEGER PRIMARY KEY,
customer_name TEXT
);
CREATE TABLE order_items (
order_id INTEGER,
product_name TEXT,
quantity INTEGER,
PRIMARY KEY (order_id, product_name)
);
Naming Conventions
Use consistent naming conventions throughout your database. This makes your schema easier to understand and maintain. Most teams use snake_case for table and column names.
-- GOOD naming conventions
CREATE TABLE customer_orders (
id INTEGER PRIMARY KEY,
customer_id INTEGER,
order_date TIMESTAMP,
total_amount DECIMAL(10,2)
);
-- BAD naming conventions
CREATE TABLE CustomerOrders ( -- CamelCase
ID INTEGER PRIMARY KEY, -- ALL CAPS
CustID INTEGER, -- Inconsistent abbreviations
OrderDate TIMESTAMP,
Total DECIMAL(10,2) -- Ambiguous name
);
Summary
Key Takeaways
- Databases store data in tables with rows and columns
- Primary keys uniquely identify each row; foreign keys link tables
- Constraints enforce data integrity and prevent invalid data
- Indexes dramatically improve query performance
- Normalization reduces redundancy and improves data quality