πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

MSK Connect Deep Dive

AWS Data EngineeringKafka Connect on AWS⭐ Premium

Advertisement

πŸ“¨ MSK Connect

Master MSK Connect with Kafka Connect patterns and connectors.

Module: AWS Data Engineering β€’ Topic 43 of 65 β€’ Premium Content

MSK Connect Architecture

Architecture Diagram
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    MSK CONNECT ARCHITECTURE                                  β”‚
β”‚                                                                             β”‚
β”‚  Source Connector β†’ MSK Topic β†’ Sink Connector β†’ Destination               β”‚
β”‚                                                                             β”‚
β”‚  Source Connectors:                                                         β”‚
β”‚  β€’ Debezium (CDC from databases)                                            β”‚
β”‚  β€’ JDBC Source (poll-based)                                                 β”‚
β”‚  β€’ S3 Source (file ingestion)                                               β”‚
β”‚  β€’ DynamoDB Streams                                                         β”‚
β”‚                                                                             β”‚
β”‚  Sink Connectors:                                                           β”‚
β”‚  β€’ S3 Sink (data lake ingestion)                                            β”‚
β”‚  β€’ Redshift Sink                                                            β”‚
β”‚  β€’ Elasticsearch Sink                                                       β”‚
β”‚  β€’ HTTP Sink                                                                β”‚
β”‚                                                                             β”‚
β”‚  Single Message Transforms (SMTs):                                          β”‚
β”‚  β€’ RegexRouter (topic routing)                                              β”‚
β”‚  β€’ ExtractNewRecordState (flatten)                                          β”‚
β”‚  β€’ InsertField (add metadata)                                               β”‚
β”‚  β€’ ReplaceField (rename/remove)                                             β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Connector Examples

{
  "name": "s3-sink-connector",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "3",
    "topics.regex": "raw-(.*)",
    "s3.bucket.name": "data-lake-raw",
    "flush.size": "1000",
    "rotate.interval.ms": "300000",
    "format.class": "io.confluent.connect.s3.format.parquet.ParquetFormat",
    "parquet.codec": "snappy",
    "transforms": "route",
    "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
    "transforms.route.regex": "raw-(.*)",
    "transforms.route.replacement": "processed-$1"
  }
}

Interview Q&A

Q1: MSK Connect vs. self-managed Kafka Connect?

Answer: MSK Connect is managed, auto-scaling, no infrastructure. Self-managed gives more control over worker configuration.

Q2: What are SMTs?

Answer: Single Message Transforms modify records in-flight without a custom converter. Used for routing, field manipulation, and data enrichment.

Q3: How does MSK Connect scale?

Answer: Auto-scaling based on connector throughput. Configure min/max capacity for cost control.

Summary

  • MSK Connect: Managed Kafka Connect on AWS
  • Source Connectors: Debezium, JDBC, S3, DynamoDB
  • Sink Connectors: S3, Redshift, Elasticsearch, HTTP
  • SMTs: In-flight record transformation without custom code
  • Scaling: Auto-scaling based on connector throughput

Advertisement