
Choosing the Right Database

Database selection is one of the most consequential architectural decisions. A wrong choice can limit your system for years. Learn a systematic approach to evaluating database technologies.

  • Trade-offs — Every database optimizes for specific access patterns
  • Requirements — Your data model and query patterns drive the choice
  • Evolution — Systems often use multiple database types (polyglot persistence)

There is no "best" database—only the best database for your specific requirements.

The Database Selection Framework

Use this decision framework to systematically evaluate database options:

Definition: Database Selection

Database selection is the process of choosing a data storage technology based on data model, access patterns, consistency requirements, scale constraints, and operational characteristics. The goal is to match the database's strengths to your system's specific needs, accepting the trade-offs that come with each choice.

Step 1: Understand Your Data Model

  • Relational: Tables, rows, SQL; ACID transactions; JOINs, foreign keys
  • Document: JSON/BSON docs; Flexible schema; Denormalized
  • Key-Value: Simple key→value; O(1) lookups; High throughput
  • Graph: Nodes, edges; Relationship queries; Traversals
  • Column-Family: Wide columns; Time-series optimized; Write-heavy workloads
  • Search Engine: Inverted index; Full-text search; Relevance scoring
  • Time-Series: Timestamps + values; Compression; Aggregation queries
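
To make the differences between these data models concrete, the following minimal sketch holds one small dataset in four of the shapes listed above. Plain Python dicts, lists, and tuples stand in for real databases; the user and order records, the field names, and the helper functions are all hypothetical and chosen only for illustration.

```python
# Hypothetical sample data: one user with two orders, shaped four ways.

# Relational: flat rows in separate tables, linked by a foreign key.
users = [{"id": 1, "name": "Ada"}]
orders = [
    {"id": 10, "user_id": 1, "total": 25.0},
    {"id": 11, "user_id": 1, "total": 40.0},
]


def orders_for_user_relational(user_id):
    """Emulate a JOIN by matching orders.user_id against users.id."""
    return [
        (u["name"], o["total"])
        for u in users
        for o in orders
        if u["id"] == user_id and o["user_id"] == u["id"]
    ]


# Document: one denormalized record that embeds the related orders.
user_doc = {
    "_id": 1,
    "name": "Ada",
    "orders": [{"id": 10, "total": 25.0}, {"id": 11, "total": 40.0}],
}

# Key-value: values fetched by exact key, with no querying inside values.
kv_store = {"user:1:name": "Ada", "user:1:order_count": 2}

# Graph: nodes plus explicit edges; queries walk the edges.
edges = [("user:1", "PLACED", "order:10"), ("user:1", "PLACED", "order:11")]


def neighbors(node, relation):
    """Follow outgoing edges of one relation type from a node."""
    return [dst for src, rel, dst in edges if src == node and rel == relation]
```

The relational shape needs a join to reassemble the user with their orders, the document shape returns everything in one read, the key-value shape answers only exact-key questions, and the graph shape makes the relationship itself the thing you query.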

Step 2: Map Requirements to Database Types

Requirement Pattern | Recommended Database
Complex queries with JOINs | PostgreSQL, MySQL
Flexible document schema | MongoDB, CouchDB
Simple key-value lookups | Redis, DynamoDB
Relationship traversal | Neo4j, Amazon Neptune
Full-text search | Elasticsearch, Solr
Time-series data | InfluxDB, TimescaleDB
High write throughput | Cassandra, ScyllaDB
Strong consistency | Spanner, CockroachDB
Global distribution | DynamoDB Global Tables, Cosmos DB
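
The mapping above can also be kept as a small lookup that turns a list of requirements into a shortlist. This is only a sketch: the dictionary copies the table, while the names `CANDIDATES` and `shortlist` are hypothetical and not part of any library.

```python
# Requirement pattern -> candidate databases, copied from the Step 2 table.
CANDIDATES = {
    "complex queries with joins": ["PostgreSQL", "MySQL"],
    "flexible document schema": ["MongoDB", "CouchDB"],
    "simple key-value lookups": ["Redis", "DynamoDB"],
    "relationship traversal": ["Neo4j", "Amazon Neptune"],
    "full-text search": ["Elasticsearch", "Solr"],
    "time-series data": ["InfluxDB", "TimescaleDB"],
    "high write throughput": ["Cassandra", "ScyllaDB"],
    "strong consistency": ["Spanner", "CockroachDB"],
    "global distribution": ["DynamoDB Global Tables", "Cosmos DB"],
}


def shortlist(requirements):
    """Return the candidate databases for each requirement, in input order."""
    result = {}
    for req in requirements:
        key = req.strip().lower()
        if key not in CANDIDATES:
            raise KeyError(f"no mapping for requirement: {req!r}")
        result[key] = CANDIDATES[key]
    return result


if __name__ == "__main__":
    for requirement, options in shortlist(
        ["Full-text search", "High write throughput"]
    ).items():
        print(f"{requirement}: {', '.join(options)}")
```

A shortlist like this is a starting point for evaluation, not a decision; the remaining steps narrow it down.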

Step 3: Evaluate the CAP Trade-offs

Definition: CAP Trade-off

Every distributed database makes trade-offs between Consistency, Availability, and Partition tolerance. Since network partitions are inevitable, partition tolerance is not optional, so the real choice is how the system behaves while a partition lasts: CP systems stay consistent but may reject requests, and AP systems keep serving requests but may return stale or conflicting data.

Database CAP Classification

\[
DB_{type} = \begin{cases} CP & \text{if consistency critical} \\ AP & \text{if availability critical} \end{cases}
\]

Here,

  • \(DB_{type}\) = Database category
  • \(CP\) = Consistent and Partition-tolerant (e.g., HBase, MongoDB)
  • \(AP\) = Available and Partition-tolerant (e.g., Cassandra, DynamoDB)
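
The CP versus AP distinction is easiest to see in how a write is handled while replicas cannot reach each other. The toy model below is a sketch under strong simplifying assumptions: two replicas, a manually toggled `partitioned` flag, and no quorums, retries, or conflict resolution. The class and attribute names are hypothetical, and real databases are considerably more nuanced.

```python
class Replica:
    """Holds a single value, standing in for one database node."""

    def __init__(self):
        self.value = None


class TwoNodeStore:
    """Toy two-replica register that behaves as 'CP' or 'AP'."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica()
        self.b = Replica()
        self.partitioned = False  # True means A cannot reach B

    def write(self, value):
        """Write through replica A; return True if the write is accepted."""
        if not self.partitioned:
            self.a.value = value
            self.b.value = value  # replication succeeds
            return True
        if self.mode == "CP":
            return False  # refuse rather than let the replicas diverge
        self.a.value = value  # AP: accept locally, B is now stale
        return True

    def read_from_b(self):
        """Read through replica B, which may lag behind A in AP mode."""
        return self.b.value


if __name__ == "__main__":
    for mode in ("CP", "AP"):
        store = TwoNodeStore(mode)
        store.write("v1")
        store.partitioned = True
        accepted = store.write("v2")
        print(mode, "accepted:", accepted, "read from B:", store.read_from_b())
```

In CP mode the second write is rejected, so both replicas still agree on `v1`. In AP mode the write succeeds on A, and a read through B returns the stale `v1` until the partition heals.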

Step 4: Consider Operational Characteristics

Factor | Questions to Ask
Team expertise | Does the team know this database?
Community | Is there good documentation and support?
Managed services | Can we use a managed version (RDS, DynamoDB)?
Cost | What's the total cost of ownership (license, ops, storage)?
Migration | How hard is it to migrate away if needed?
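
For the cost question, a rough monthly estimate can be built from the three components named above (license, ops, storage). The sketch below is an assumption-laden simplification: the field names are hypothetical, the figures in the usage example are placeholders rather than real prices, and real estimates usually also cover network, backups, and support contracts.

```python
from dataclasses import dataclass


@dataclass
class CostInputs:
    """Hypothetical monthly cost inputs for one database option."""

    license_per_month: float
    ops_hours_per_month: float
    ops_hourly_rate: float
    storage_gb: float
    storage_price_per_gb_month: float


def monthly_tco(c: CostInputs) -> float:
    """Sum license, operations labor, and storage into one monthly figure."""
    ops = c.ops_hours_per_month * c.ops_hourly_rate
    storage = c.storage_gb * c.storage_price_per_gb_month
    return c.license_per_month + ops + storage


if __name__ == "__main__":
    # Placeholder numbers only: no license fee, 40 ops hours at 100 per hour,
    # 500 GB of storage at 0.25 per GB-month.
    self_hosted = CostInputs(0.0, 40.0, 100.0, 500.0, 0.25)
    print(monthly_tco(self_hosted))
```

Running the same function over a self-hosted option and a managed option makes the ops-versus-service-fee trade-off explicit.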

Polyglot persistence—using different databases for different use cases—is common in large systems. Don't feel you must choose one database for everything. The key is to justify why each database is the right choice for its specific use case.

The Decision Matrix

Use this matrix to compare database options:

Database Comparison for a Social Media App

Criteria | PostgreSQL | MongoDB | Cassandra | Redis
Complex queries | Excellent | Good | Poor | N/A
Write throughput | Good | Good | Excellent | Excellent
Schema flexibility | Low | High | High | N/A
Consistency | Strong | Configurable | Eventual | Configurable
Horizontal scaling | Hard | Easy | Excellent | Cluster mode
Relationship queries | Excellent | Limited | N/A | N/A
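
One way to turn a matrix like this into a ranking is a weighted score. The sketch below uses only the three rows whose ratings share a common scale (complex queries, write throughput, relationship queries). The numeric values assigned to each rating and the example weights are assumptions made for illustration, not part of the original comparison.

```python
# Assumed numeric scale for the qualitative ratings in the matrix.
SCORE = {"Excellent": 3, "Good": 2, "Limited": 1, "Poor": 1, "N/A": 0}

# Three rows of the decision matrix, copied as-is.
MATRIX = {
    "Complex queries": {
        "PostgreSQL": "Excellent", "MongoDB": "Good",
        "Cassandra": "Poor", "Redis": "N/A",
    },
    "Write throughput": {
        "PostgreSQL": "Good", "MongoDB": "Good",
        "Cassandra": "Excellent", "Redis": "Excellent",
    },
    "Relationship queries": {
        "PostgreSQL": "Excellent", "MongoDB": "Limited",
        "Cassandra": "N/A", "Redis": "N/A",
    },
}


def rank(weights):
    """Return (database, total) pairs sorted by weighted score, best first."""
    totals = {}
    for criterion, weight in weights.items():
        for db, rating in MATRIX[criterion].items():
            totals[db] = totals.get(db, 0) + weight * SCORE[rating]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical weights for a write-heavy activity feed.
    feed = {"Complex queries": 1, "Write throughput": 3, "Relationship queries": 0}
    # Hypothetical weights for relational user and post data.
    user_data = {"Complex queries": 3, "Write throughput": 1, "Relationship queries": 3}
    print(rank(feed))
    print(rank(user_data))
```

With these assumed weights, Cassandra ranks first for the feed workload and PostgreSQL ranks first for user data. The scores are only as good as the weights, so treat the ranking as a way to make priorities explicit rather than as a verdict.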

Recommendation:

  • User data, posts: PostgreSQL (relationships, complex queries)
  • Feed, activity logs: Cassandra (high write throughput, time-series)
  • Session data, counters: Redis (low latency, high throughput)
  • Search: Elasticsearch (full-text search, relevance)
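
In application code, a recommendation like this often shows up as a thin routing layer that sends each kind of data to its own store. The sketch below illustrates the idea with in-memory stand-ins; `InMemoryBackend`, `PolyglotStorage`, and the data-kind names are hypothetical, and a real system would wrap actual database clients behind the same interface.

```python
class InMemoryBackend:
    """Stand-in for a real database client; keeps records in a dict."""

    def __init__(self, name):
        self.name = name
        self.records = {}

    def put(self, key, value):
        self.records[key] = value

    def get(self, key):
        return self.records.get(key)


class PolyglotStorage:
    """Routes each kind of data to the store recommended for it."""

    def __init__(self):
        postgres = InMemoryBackend("PostgreSQL")
        cassandra = InMemoryBackend("Cassandra")
        redis = InMemoryBackend("Redis")
        elasticsearch = InMemoryBackend("Elasticsearch")
        self.routes = {
            "user": postgres,
            "post": postgres,
            "feed": cassandra,
            "activity_log": cassandra,
            "session": redis,
            "counter": redis,
            "search": elasticsearch,
        }

    def backend_for(self, kind):
        """Look up the backend for a data kind, failing loudly if unknown."""
        try:
            return self.routes[kind]
        except KeyError:
            raise ValueError(f"no store configured for kind: {kind!r}") from None

    def put(self, kind, key, value):
        self.backend_for(kind).put(key, value)

    def get(self, kind, key):
        return self.backend_for(kind).get(key)


if __name__ == "__main__":
    storage = PolyglotStorage()
    storage.put("session", "abc123", {"user_id": 1})
    print(storage.backend_for("session").name, storage.get("session", "abc123"))
```

Keeping the routing in one place makes it straightforward to state, per data kind, why that store was chosen, and to change the choice later without touching callers.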

When presenting database choices in an interview, explicitly state why you rejected the alternatives. This shows you considered multiple options and made an informed decision, not just a default choice.

Common Anti-Patterns

  1. One database for everything — Trying to force a single database to handle all access patterns
  2. Premature optimization — Choosing a complex database when a simple one suffices
  3. Ignoring operational complexity — Choosing a database the team can't operate
  4. Copy-paste architecture — Using what worked at a previous company without considering different requirements

Practice Exercises

  1. Database Selection: For a ride-sharing app (like Uber), choose the right database for each data type: user profiles, ride history, real-time location, payment transactions. Justify each choice.

  2. Trade-off Analysis: Compare PostgreSQL and MongoDB for an e-commerce product catalog. What are the trade-offs in terms of schema flexibility, query performance, and scalability?

  3. Polyglot Design: Design a social media system using at least 3 different databases. Explain why each database is the right choice for its specific use case.

  4. Migration Planning: Your team currently uses MySQL but needs to handle 10x more write throughput. Evaluate the options: scale vertically, add read replicas, shard, or migrate to Cassandra.

Key Takeaways:

  • Database selection starts with understanding your data model and access patterns
  • Map requirements to database types using the decision framework
  • Consider CAP trade-offs: CP for consistency, AP for availability
  • Polyglot persistence is common—use different databases for different use cases
  • Always justify why you rejected alternatives

What to Learn Next

  • SQL Deep Dive: PostgreSQL, MySQL, indexing strategies, and query optimization.
  • NoSQL Deep Dive: Document, key-value, column-family, and graph databases.
  • NewSQL and Distributed SQL: Spanner, CockroachDB, and the next generation of SQL databases.
  • Databases: SQL vs NoSQL, indexing, replication, and sharding fundamentals.
  • Database Indexing: B-trees, hash indexes, and indexing strategies.
  • Data Partitioning: Sharding strategies, consistent hashing, and partition keys.
