DynamoDB Deep Dive

Amazon DynamoDB is a fully managed NoSQL database that delivers single-digit millisecond performance at any scale. Master its partitioning model, indexing strategies, global tables, and event-driven patterns with DynamoDB Streams.

  • Serverless: no servers to manage, auto-scaling built in
  • Predictable Performance: single-digit millisecond latency at any scale
  • Global Distribution: multi-region replication with global tables

DynamoDB can scale to millions of requests per second without any servers for you to provision, patch, or manage.

DynamoDB Architecture

Definition: Amazon DynamoDB

Amazon DynamoDB is a fully managed, serverless, key-value and document NoSQL database. It provides consistent, single-digit millisecond latency at any scale. DynamoDB automatically partitions data across servers and supports both eventually consistent and strongly consistent reads.

Data Model

| Concept | Description |
|---|---|
| Table | Collection of items (analogous to a table in SQL) |
| Item | A group of attributes (analogous to a row) |
| Attribute | A key-value pair (analogous to a column) |
| Primary Key | Unique identifier for each item (partition key + optional sort key) |

DynamoDB Data Model

// User profile item
{
  "PK": "USER#123",             // Partition Key
  "SK": "PROFILE",              // Sort Key
  "name": "Alice Johnson",
  "email": "alice@example.com",
  "created_at": "2024-01-15",
  "plan": "premium"
}

// Order item stored in the same table (single table design)
{
  "PK": "USER#123",             // Partition Key
  "SK": "ORDER#2024-01-15#001", // Sort Key
  "product_id": "PROD#456",
  "amount": 99.99,
  "status": "shipped"
}

Partitioning

Definition: DynamoDB Partitioning

DynamoDB automatically partitions data based on the partition key. It hashes each key, and every partition owns a contiguous range of hash values and serves a proportional share of the table's traffic. A good partition key has high cardinality and distributes traffic evenly across partitions.

Partition Key Distribution

$$P_{partition} = \mathrm{hash}(PK) \bmod N_{partitions}$$

Here,

  • $P_{partition}$ = Partition assigned to the item
  • $PK$ = The partition key value
  • $N_{partitions}$ = Number of partitions

A common anti-pattern is choosing a partition key with low cardinality (e.g., "status" with only a few values). This creates hot partitions where most traffic hits a single partition. Choose partition keys with high cardinality for uniform distribution.
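
To make the hot-partition effect concrete, here is a minimal sketch of the hash-mod model above. It is a simplification: DynamoDB's internal hash function is not public, so `hashlib.md5` and the helper names `partition_for` and `load_per_partition` are assumptions made for illustration only.

```python
import hashlib
from collections import Counter


def partition_for(pk: str, n_partitions: int) -> int:
    """Map a partition key to a partition index via hash(PK) mod N."""
    digest = hashlib.md5(pk.encode("utf-8")).digest()  # stand-in hash, not DynamoDB's
    return int.from_bytes(digest[:8], "big") % n_partitions


def load_per_partition(keys, n_partitions: int) -> Counter:
    """Count how many requests land on each partition."""
    return Counter(partition_for(k, n_partitions) for k in keys)


N = 8
REQUESTS = 10_000

# Low cardinality: every request uses one of only three key values.
statuses = ["pending", "shipped", "delivered"]
low = load_per_partition((statuses[i % 3] for i in range(REQUESTS)), N)

# High cardinality: every request uses a distinct user key.
high = load_per_partition((f"USER#{i}" for i in range(REQUESTS)), N)

print("status key:", len(low), "partitions used, busiest gets", max(low.values()))
print("user key:  ", len(high), "partitions used, busiest gets", max(high.values()))
```

With three distinct key values, at most three of the eight partitions can ever receive traffic, no matter how many requests arrive. The per-user key spreads the same request volume across all of them.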

Single Table Design

Definition: Single Table Design

Single table design stores multiple entity types in one DynamoDB table using composite primary keys (PK + SK). This enables efficient access patterns across related entities without JOINs. The trade-off is a more complex data model but fewer tables to manage.

| Entity | PK | SK | Attributes |
|---|---|---|---|
| User | USER#123 | PROFILE | name, email |
| Order | USER#123 | ORDER#2024-01-15 | amount, status |
| Product | PRODUCT#456 | METADATA | name, price |
| Review | PRODUCT#456 | REVIEW#USER#123 | rating, text |
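
The following sketch shows why this layout avoids JOINs: all items that share a PK sit together, sorted by SK, so one lookup with a sort-key prefix returns a related group. It is a toy in-memory model, not the DynamoDB API. The `SingleTable` class and the attribute values marked hypothetical are assumptions for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


class SingleTable:
    """Toy in-memory model: items grouped by PK and kept sorted by SK."""

    def __init__(self):
        self._collections = defaultdict(list)  # PK -> sorted list of (SK, item)

    def put(self, item: dict) -> None:
        rows = self._collections[item["PK"]]
        i = bisect_left([sk for sk, _ in rows], item["SK"])
        if i < len(rows) and rows[i][0] == item["SK"]:
            rows[i] = (item["SK"], item)       # same primary key: overwrite
        else:
            rows.insert(i, (item["SK"], item))  # keep SK order

    def query(self, pk: str, sk_prefix: str = "") -> list:
        """Return items under one PK whose SK starts with sk_prefix, in SK order."""
        rows = self._collections.get(pk, [])
        return [item for sk, item in rows if sk.startswith(sk_prefix)]


table = SingleTable()
table.put({"PK": "USER#123", "SK": "PROFILE", "name": "Alice Johnson"})
table.put({"PK": "USER#123", "SK": "ORDER#2024-02-02", "amount": 15.0})   # hypothetical order
table.put({"PK": "USER#123", "SK": "ORDER#2024-01-15", "amount": 99.99})
table.put({"PK": "PRODUCT#456", "SK": "METADATA", "name": "Widget"})      # hypothetical name
table.put({"PK": "PRODUCT#456", "SK": "REVIEW#USER#123", "rating": 5})

orders = table.query("USER#123", "ORDER#")   # only this user's orders, oldest first
everything = table.query("USER#123")         # profile and orders in one lookup
reviews = table.query("PRODUCT#456", "REVIEW#")
```

Putting the date in the order's sort key is what makes the orders come back in chronological order without a separate sort step.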

Secondary Indexes

Definition: Global Secondary Index (GSI)

A Global Secondary Index is a separate index with its own partition key and optional sort key. It enables queries on non-key attributes at the cost of eventually consistent reads and additional storage cost.

Definition: Local Secondary Index (LSI)

A Local Secondary Index shares the partition key with the base table but uses a different sort key. It provides strongly consistent reads but must be defined at table creation time.

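| Index Type | Partition Key | Sort Key | Consistency | Cost |
|---|---|---|---|---|
| GSI | Different from base | Optional | Eventually consistent | Extra storage + throughput |
| LSI | Same as base | Different | Strongly consistent (eventual also available) | Extra storage; uses the base table's throughput |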

Design your access patterns first, then choose the partition key and sort key to support them. GSI projections determine which attributes are copied to the index, so project only what you need to minimize cost.
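
As a rough illustration of what a projection does, the sketch below derives an index view from base items. It re-keys them by a different attribute and copies only the key attributes plus a chosen few. This is an in-memory model under stated assumptions, not how DynamoDB maintains indexes internally. The `build_gsi` helper, its parameters, and the second order item are hypothetical.

```python
def build_gsi(items, index_pk, index_sk=None, include=()):
    """Derive a GSI-like view: group items by index_pk, optionally sort by
    index_sk, and copy only key attributes plus those named in `include`."""
    index = {}
    for item in items:
        if index_pk not in item or (index_sk and index_sk not in item):
            continue  # items without the index key attributes are left out
        wanted = ("PK", "SK", index_pk, index_sk, *include)
        projected = {k: item[k] for k in wanted if k and k in item}
        index.setdefault(item[index_pk], []).append(projected)
    if index_sk:
        for rows in index.values():
            rows.sort(key=lambda row: row[index_sk])
    return index


items = [
    {"PK": "USER#123", "SK": "PROFILE", "name": "Alice Johnson",
     "email": "alice@example.com"},
    {"PK": "USER#123", "SK": "ORDER#2024-01-15#001", "product_id": "PROD#456",
     "amount": 99.99, "status": "shipped"},
    {"PK": "USER#789", "SK": "ORDER#2024-01-10#001", "product_id": "PROD#456",
     "amount": 20.0, "status": "pending"},  # hypothetical second order
]

# "Orders by product": partition on product_id, sort on the base sort key,
# and project only `amount` in addition to the keys.
orders_by_product = build_gsi(items, index_pk="product_id", index_sk="SK",
                              include=("amount",))
```

The profile item has no `product_id`, so it never appears in the index, and `status` is not copied because it was not projected. Both effects reduce what the index has to store.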

DynamoDB Streams

Definition: DynamoDB Streams

DynamoDB Streams capture a time-ordered sequence of item-level modifications (create, update, delete) in a DynamoDB table. The stream data is available for 24 hours and can trigger AWS Lambda functions for event-driven processing.

| Use Case | Pattern |
|---|---|
| Cross-region replication | Stream → Lambda → write to other region |
| Event-driven workflows | Stream → Lambda → trigger Step Functions |
| Materialized views | Stream → Lambda → update derived tables |
| Audit logging | Stream → Kinesis → S3 → Athena |
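
A Stream → Lambda pattern usually comes down to a handler that loops over the batch of records and reacts to the ones it cares about. The sketch below detects an order whose status changes to "shipped". It assumes the stream is configured to include both old and new item images, and it uses a hypothetical `notify` callback in place of a real notification service. The sample event is hand-written in the stream record shape.

```python
def status_changed_to_shipped(record: dict) -> bool:
    """True when a MODIFY record moves `status` from another value to 'shipped'."""
    if record.get("eventName") != "MODIFY":
        return False  # INSERT and REMOVE records are ignored in this sketch
    images = record.get("dynamodb", {})
    old = images.get("OldImage", {}).get("status", {}).get("S")
    new = images.get("NewImage", {}).get("status", {}).get("S")
    return new == "shipped" and old != "shipped"


def handler(event: dict, context=None, notify=print) -> int:
    """Lambda-style entry point. `notify` is a hypothetical callback standing in
    for a real notification channel. Returns the number of notifications sent."""
    sent = 0
    for record in event.get("Records", []):
        if status_changed_to_shipped(record):
            keys = record["dynamodb"]["Keys"]
            notify(f"Order {keys['SK']['S']} for {keys['PK']['S']} has shipped")
            sent += 1
    return sent


# Hand-written sample batch; values are illustrative.
sample_event = {
    "Records": [
        {
            "eventName": "MODIFY",
            "dynamodb": {
                "Keys": {"PK": {"S": "USER#123"}, "SK": {"S": "ORDER#2024-01-15#001"}},
                "OldImage": {"status": {"S": "pending"}},
                "NewImage": {"status": {"S": "shipped"}},
            },
        },
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"PK": {"S": "USER#123"}, "SK": {"S": "ORDER#2024-01-16#001"}},
                "NewImage": {"status": {"S": "pending"}},
            },
        },
    ]
}

messages = []
sent = handler(sample_event, notify=messages.append)
```

Comparing the old and new images matters here. Without the check on the old value, any later update to an already shipped order would trigger a duplicate notification.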

Global Tables

Definition: DynamoDB Global Tables

Global tables provide a fully managed, multi-region, multi-active replication solution. They enable fast, local reads and writes in any region with eventual consistency across regions. Global tables are ideal for applications that need low-latency access from multiple geographic locations.

| Feature | Description |
|---|---|
| Multi-active | Read and write in any region |
| Eventual consistency | Replication across regions is async |
| Conflict resolution | Last-writer-wins (LWW) |
| Automatic | No manual setup for replication |
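
Last-writer-wins is simple to state but easy to underestimate: when two regions update the same item at nearly the same time, one update is silently discarded. A minimal sketch of the rule follows. The `Write` type, its `timestamp` field, and the region-name tie-break are assumptions for illustration, not DynamoDB's internal representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    region: str
    timestamp: float  # time the write was accepted (hypothetical field)
    item: dict


def last_writer_wins(a: Write, b: Write) -> Write:
    """Keep the write with the later timestamp. The region-name tie-break is an
    assumption of this sketch so that every replica picks the same winner."""
    return max(a, b, key=lambda w: (w.timestamp, w.region))


# Two regions update the same item concurrently.
us = Write("us-east-1", 1000.0, {"PK": "USER#123", "SK": "PROFILE", "plan": "premium"})
eu = Write("eu-west-1", 1000.4, {"PK": "USER#123", "SK": "PROFILE", "plan": "free"})

winner = last_writer_wins(us, eu)
print(winner.region, winner.item["plan"])
```

After replication converges, both regions hold the later write and the earlier one is gone. Applications that cannot tolerate this should avoid writing the same item from multiple regions at once.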

Capacity Modes

| Mode | Description | Best For |
|---|---|---|
| On-demand | Pay per request, auto-scales | Unpredictable workloads |
| Provisioned | Reserve read/write capacity | Predictable workloads |
| Provisioned with auto scaling | Adjusts provisioned capacity within set limits | Variable but patterned workloads |

DynamoDB Capacity Calculation

$$RCU = \left\lceil \frac{ReadSize_{bytes}}{4\,\text{KB}} \right\rceil \times ConsistencyFactor$$

Here,

  • $RCU$ = Read Capacity Units needed per read
  • $ReadSize_{bytes}$ = Item size in bytes (rounded up to the next 4 KB)
  • $4\,\text{KB}$ = One RCU per 4 KB for a strongly consistent read
  • $ConsistencyFactor$ = 1 for strongly consistent, 0.5 for eventually consistent
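
The calculation is easy to get wrong at the block boundaries, so a small helper can be useful. This sketch rounds the item size up to the next 4 KB block, which is how reads are metered, and treats 4 KB as 4096 bytes. The function name and the `reads_per_second` multiplier are additions for illustration and are not part of the formula above.

```python
import math

BLOCK_BYTES = 4096  # 4 KB read block, assumed to be 4 * 1024 bytes


def read_capacity_units(item_size_bytes: int,
                        reads_per_second: int = 1,
                        strongly_consistent: bool = True) -> float:
    """RCUs = ceil(size / 4 KB) * consistency factor * reads per second."""
    blocks = math.ceil(item_size_bytes / BLOCK_BYTES)
    factor = 1.0 if strongly_consistent else 0.5
    return blocks * factor * reads_per_second


# A 6,000-byte item spans two 4 KB blocks.
strong = read_capacity_units(6000, reads_per_second=100)
eventual = read_capacity_units(6000, reads_per_second=100, strongly_consistent=False)
print(strong, eventual)
```

Note that an item one byte over 4 KB costs as much to read as an item of 8 KB, so keeping hot items under a block boundary can halve their read cost.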

Practice Exercises

  1. Table Design: Design a single-table DynamoDB schema for a ride-sharing app with users, drivers, rides, and payments. Identify all access patterns and choose appropriate PK/SK combinations.

  2. Partition Key Analysis: You have a DynamoDB table with 100M items and the partition key is "country". Analyze the access pattern and identify potential hot partitions. Propose a solution.

  3. Stream Processing: Design an event-driven workflow using DynamoDB Streams that sends a notification when an order status changes to "shipped".

  4. Cost Estimation: Estimate the monthly cost for a DynamoDB table with 100GB of data, 10K read capacity units, and 5K write capacity units.

Key Takeaways:

  • DynamoDB is a serverless, fully managed NoSQL database with predictable performance
  • Single table design stores multiple entity types using composite primary keys
  • Choose partition keys with high cardinality for uniform distribution
  • GSIs enable queries on non-key attributes; LSIs share the partition key
  • DynamoDB Streams enable event-driven workflows with Lambda
  • Global tables provide multi-region, multi-active replication

What to Learn Next

-> Redis Deep Dive: Redis data structures, persistence, clustering, and use cases.

-> Cassandra Deep Dive: Cassandra architecture, data modeling, and operational patterns.

-> Spanner and CockroachDB: Deep dive into specific NewSQL implementations.

-> NoSQL Deep Dive: Document, key-value, column-family, and graph databases overview.

-> Data Partitioning: Sharding strategies, consistent hashing, and partition keys.

-> Choosing the Right Database: Systematic framework for database selection.
