SQL Fundamentals
SQL Introduction
SQL (Structured Query Language) is the standard language for working with relational databases. It enables you to query, insert, update, and delete data efficiently.
- Query Data β Ask questions about your data and retrieve meaningful insights
- Manipulate Data β Insert, update, and delete records with precision
- Define Structures β Create and modify database tables, views, and indexes
SQL is the universal language of data β every major company relies on it.
What is SQL?
DfSQL (Structured Query Language)
A standardized programming language designed for managing and manipulating data held in relational database management systems (RDBMS). SQL became a standard of the American National Standards Institute (ANSI) in 1986.
Core Capabilities
| Operation | SQL Command | Description | Example |
|---|---|---|---|
| Query data | SELECT | Retrieve data from tables | SELECT * FROM users; |
| Insert data | INSERT INTO | Add new rows | INSERT INTO users (name) VALUES ('Alice'); |
| Update data | UPDATE | Modify existing rows | UPDATE users SET name = 'Bob' WHERE id = 1; |
| Delete data | DELETE | Remove rows | DELETE FROM users WHERE id = 1; |
| Create tables | CREATE TABLE | Define new tables | CREATE TABLE users (id INT PRIMARY KEY); |
| Modify tables | ALTER TABLE | Change table structure | ALTER TABLE users ADD email TEXT; |
| Remove tables | DROP TABLE | Delete entire tables | DROP TABLE users; |
-- A complete SQL query example
SELECT first_name, last_name, email
FROM customers
WHERE city = 'New York'
ORDER BY last_name;
SQL is declarative β you describe what data you want, not how to retrieve it. The database engine figures out the best way to execute your query.
How SQL Works: The Architecture
The Relational Model
DfRelational Model
A data model based on first-order predicate logic, invented by Edgar F. Codd in 1970. Data is organized into relations (tables) where each row represents a fact and each column represents an attribute of that fact.
| Concept | Description | SQL Equivalent |
|---|---|---|
| Relation | A two-dimensional table | TABLE |
| Tuple | A row in a relation | ROW / RECORD |
| Attribute | A column in a relation | COLUMN / FIELD |
| Domain | Allowed values for an attribute | DATA TYPE |
| Cardinality | Number of rows in a table | COUNT(*) |
| Degree | Number of columns in a table | Number of columns |
Set Theory Operations
SQL operations are based on mathematical set theory:
-- UNION: Combines two sets (removes duplicates)
SELECT city FROM customers
UNION
SELECT city FROM suppliers;
-- INTERSECT: Returns common elements in both sets
SELECT city FROM customers
INTERSECT
SELECT city FROM suppliers;
-- EXCEPT: Returns elements in first set but not second
SELECT city FROM customers
EXCEPT
SELECT city FROM suppliers;
Types of SQL Statements
DfSQL Statement Categories
SQL statements are grouped into four main categories based on their purpose: DDL (Data Definition Language), DML (Data Manipulation Language), DCL (Data Control Language), and TCL (Transaction Control Language).
| Category | Full Name | Commands | Purpose |
|---|---|---|---|
| DDL | Data Definition Language | CREATE, ALTER, DROP, TRUNCATE | Define database structure |
| DML | Data Manipulation Language | SELECT, INSERT, UPDATE, DELETE | Manipulate data |
| DCL | Data Control Language | GRANT, REVOKE | Control access permissions |
| TCL | Transaction Control Language | COMMIT, ROLLBACK, SAVEPOINT | Manage transactions |
DDL Example
-- CREATE: Make a new table
CREATE TABLE employees (
id INTEGER PRIMARY KEY,
name TEXT NOT NULL
);
-- ALTER: Add a column to existing table
ALTER TABLE employees ADD COLUMN email TEXT;
-- DROP: Remove a table completely
DROP TABLE employees;
DML Example
-- SELECT: Retrieve data
SELECT * FROM employees WHERE salary > 50000;
-- INSERT: Add new data
INSERT INTO employees (id, name, email)
VALUES (1, 'Alice', 'alice@company.com');
-- UPDATE: Modify existing data
UPDATE employees SET salary = 75000 WHERE id = 1;
-- DELETE: Remove data
DELETE FROM employees WHERE id = 1;
Query Execution Flow
DfQuery Execution
When you submit a SQL query, the database engine processes it through several stages: parsing (syntax check), validation (semantic check), optimization (execution plan), and finally execution (retrieving results).
SQL Dialects
Different databases use slightly different SQL dialects. The fundamentals are the same, but specific functions and syntax may vary between MySQL, PostgreSQL, SQL Server, and SQLite.
| Dialect | Key Differences | Best For |
|---|---|---|
| MySQL | AUTO_INCREMENT, LIMIT syntax | Web applications |
| PostgreSQL | JSON support, CTEs, Window functions | Complex queries |
| SQL Server | TOP syntax, IDENTITY columns | Enterprise systems |
| SQLite | File-based, minimal config | Mobile & embedded apps |
Who Uses SQL?
| Role | How They Use SQL | Common Tasks |
|---|---|---|
| Data Analysts | Query data for insights and reports | Ad-hoc queries, dashboards |
| Data Scientists | Extract data for analysis and modeling | Feature engineering, pipelines |
| Software Engineers | Build data-driven applications | CRUD operations, APIs |
| Database Administrators | Manage and optimize databases | Performance tuning, backups |
| Business Analysts | Generate business intelligence reports | KPI tracking, trend analysis |
Performance Tips
- Always use specific column names instead of
SELECT * - Add indexes on columns used in WHERE clauses
- Limit result sets with LIMIT/TOP when possible
- Avoid SELECT DISTINCT by using proper WHERE conditions
- Use EXPLAIN/EXPLAIN ANALYZE to understand query plans
Summary
Key Takeaways
- SQL is the standard language for relational database management
- SQL can query, insert, update, delete, and define data structures
- SQL is declarative β you describe what you want, not how to get it
- SQL statements are categorized into DDL, DML, DCL, and TCL
- Practice is the best way to master SQL fundamentals