Monitoring: Azure Monitor, Log Analytics & Alerts

Enterprise monitoring for data engineering with Azure Monitor, Log Analytics, alerts, and workbooks

Monitoring Architecture

Architecture Diagram
┌─────────────────────────────────────────────────────────────────────┐
│                       MONITORING ARCHITECTURE                       │
├─────────────────────────────────────────────────────────────────────┤
│                                                                     │
│  DATA SOURCES             COLLECTION           ANALYSIS             │
│  ┌───────────┐            ┌──────────────┐     ┌──────────────┐     │
│  │ Synapse   ├───────────>│ Diagnostic   ├────>│ Log Analytics│     │
│  │ Databricks│            │ Settings     │     │ Workspace    │     │
│  │ ADF       ├───────────>│              │     │              │     │
│  │ ADLS Gen2 │            │ Send to:     │     │ KQL Queries  │     │
│  └───────────┘            │ • Log Analyt.│     │ Dashboards   │     │
│                           │ • Storage    │     │              │     │
│  ┌───────────┐            │ • Event Hub  │     └──────┬───────┘     │
│  │ Custom    ├───────────>│              │            │             │
│  │ Metrics   │            └──────────────┘            │             │
│  └───────────┘                                        │             │
│                                                       ▼             │
│  VISUALIZATION           ALERTING             AUTOMATION            │
│  ┌──────────────┐        ┌──────────────┐     ┌──────────────┐      │
│  │ Azure        │<──────>│ Azure        ├────>│ Logic Apps   │      │
│  │ Monitor      │        │ Alerts       │     │ (Remediation)│      │
│  │ Workbooks    │        │              │     │              │      │
│  │              │        │ • Metric     │     └──────────────┘      │
│  │ Dashboards   │        │ • Log        │                           │
│  └──────────────┘        │ • Activity   │                           │
│                          └──────────────┘                           │
└─────────────────────────────────────────────────────────────────────┘
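The "Diagnostic Settings" box in the diagram is a per-resource configuration that names which log categories to export and where to send them. As a rough sketch, the request body for such a setting could be assembled as below. The helper name `build_diagnostic_setting`, the workspace name `law-data`, and the category list are illustrative assumptions; the categories actually available differ per resource type.

```python
import json


def build_diagnostic_setting(workspace_id, log_categories, include_metrics=True):
    """Build the body of a diagnostic setting that routes logs to Log Analytics.

    Hypothetical helper: the layout follows the Microsoft.Insights/diagnosticSettings
    resource, but category names must match what the target resource exposes.
    """
    properties = {
        "workspaceId": workspace_id,
        "logs": [{"category": name, "enabled": True} for name in log_categories],
    }
    if include_metrics:
        # "AllMetrics" exports the platform metrics alongside the logs.
        properties["metrics"] = [{"category": "AllMetrics", "enabled": True}]
    return {"properties": properties}


# Placeholder resource ID; "law-data" is an assumed workspace name.
WORKSPACE_ID = (
    "/subscriptions/xxx/resourceGroups/rg/providers/"
    "Microsoft.OperationalInsights/workspaces/law-data"
)

setting = build_diagnostic_setting(
    WORKSPACE_ID, ["PipelineRuns", "ActivityRuns", "TriggerRuns"]
)
print(json.dumps(setting, indent=2))
```

Keeping the payload in a small function makes it easy to apply the same categories to every data factory in an environment.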

KQL Queries for Data Engineering

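// ADF pipeline runs per hour
// Assumes Azure diagnostics mode, where ADF columns carry type suffixes
// (status_s, runId_s); column names are case-sensitive, so verify them
// against your workspace schema.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
| summarize
    // A run logs one record per state change, so count distinct run IDs.
    TotalRuns = dcount(runId_s),
    SuccessfulRuns = countif(status_s == "Succeeded"),
    FailedRuns = countif(status_s == "Failed")
    by bin(TimeGenerated, 1h)
| render timechart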
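To make the `bin(TimeGenerated, 1h)` grouping concrete, here is a small Python sketch that applies the same hourly bucketing to a few in-memory records and adds a failure rate. The record layout (`time`, `status`) and the sample values are assumptions for illustration, not the Log Analytics schema.

```python
from collections import defaultdict
from datetime import datetime


def summarize_runs_by_hour(records):
    """Group pipeline-run records into hourly buckets, like bin(TimeGenerated, 1h)."""
    buckets = defaultdict(lambda: {"succeeded": 0, "failed": 0})
    for rec in records:
        # Truncate the timestamp to the start of its hour (the bin boundary).
        hour = rec["time"].replace(minute=0, second=0, microsecond=0)
        if rec["status"] == "Succeeded":
            buckets[hour]["succeeded"] += 1
        elif rec["status"] == "Failed":
            buckets[hour]["failed"] += 1
    summary = {}
    for hour, counts in sorted(buckets.items()):
        finished = counts["succeeded"] + counts["failed"]
        rate = counts["failed"] / finished if finished else 0.0
        summary[hour] = {**counts, "failure_rate": rate}
    return summary


# Made-up sample records; non-terminal states such as "InProgress" are ignored.
SAMPLE = [
    {"time": datetime(2024, 1, 1, 9, 5), "status": "Succeeded"},
    {"time": datetime(2024, 1, 1, 9, 40), "status": "Failed"},
    {"time": datetime(2024, 1, 1, 10, 15), "status": "Succeeded"},
    {"time": datetime(2024, 1, 1, 10, 20), "status": "InProgress"},
]

hourly = summarize_runs_by_hour(SAMPLE)
for hour, stats in hourly.items():
    print(hour.isoformat(), stats)
```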

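// Synapse SQL pool: requests slower than 1 second, slowest first
// Category and column names depend on which diagnostic categories are
// enabled and on the destination table mode; confirm them in the
// workspace schema before relying on this query.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SYNAPSE"
| where Category == "SQLRequest"
| where DurationMs_d > 1000
| project QueryText_s, DurationMs_d, RequestTime_d, User_s
| order by DurationMs_d desc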

// ADLS Gen2 storage usage, daily average
// UsedCapacity is reported in bytes; dividing by 1024^3 converts it to GiB.
AzureMetrics
| where ResourceProvider == "MICROSOFT.STORAGE"
| where MetricName == "UsedCapacity"
| summarize AvgStorageGB = avg(Average) / 1024 / 1024 / 1024
    by bin(TimeGenerated, 1d)
| render timechart
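The division by 1024 three times converts bytes to binary gigabytes (GiB, 2^30 bytes), which differs slightly from decimal gigabytes (10^9 bytes). A short sketch of that arithmetic, plus a day-over-day growth calculation, is shown below; the function names and sample values are illustrative assumptions.

```python
BYTES_PER_GIB = 1024 ** 3


def bytes_to_gib(used_capacity_bytes):
    """Convert a byte count to GiB, matching the / 1024 / 1024 / 1024 in the query."""
    return used_capacity_bytes / BYTES_PER_GIB


def daily_growth_gib(daily_avg_bytes):
    """Day-over-day change in GiB for daily average capacities (oldest first)."""
    gib = [bytes_to_gib(b) for b in daily_avg_bytes]
    return [later - earlier for earlier, later in zip(gib, gib[1:])]


# Made-up daily averages: 1 GiB, 3 GiB, then 2.5 GiB.
SAMPLE_BYTES = [1 * BYTES_PER_GIB, 3 * BYTES_PER_GIB, 5 * BYTES_PER_GIB // 2]
growth = daily_growth_gib(SAMPLE_BYTES)
print(growth)
```

A negative value in the growth list indicates that data was deleted or tiered out between two days.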

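// Databricks cluster activity per hour
// The State_s and TotalDBUs_d columns are assumed here; Databricks
// diagnostic logs are primarily audit events, so check which columns
// your workspace actually contains.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATABRICKS"
| where Category == "clusters"
| summarize
    ActiveClusters = countif(State_s == "RUNNING"),
    TotalDBUs = sum(TotalDBUs_d)
    by bin(TimeGenerated, 1h)
| render timechart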

Alert Rules Configuration

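The following metric alert rule (a Microsoft.Insights/metricAlerts resource) fires when the data factory reports failed pipeline runs in a 15-minute window, evaluated every 5 minutes. The threshold of 0 is an assumed value; raise it if occasional failures are acceptable.

{
  "properties": {
    "description": "ADF Pipeline Failure Alert",
    "severity": 2,
    "enabled": true,
    "scopes": [
      "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.DataFactory/factories/adf-prod"
    ],
    "evaluationFrequency": "PT5M",
    "windowSize": "PT15M",
    "criteria": {
      "odata.type": "Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria",
      "allOf": [
        {
          "name": "FailedPipelineRuns",
          "criterionType": "StaticThresholdCriterion",
          "metricNamespace": "Microsoft.DataFactory/factories",
          "metricName": "PipelineFailedRuns",
          "operator": "GreaterThan",
          "threshold": 0,
          "timeAggregation": "Total"
        }
      ]
    },
    "actions": [
      {
        "actionGroupId": "/subscriptions/xxx/resourceGroups/rg/providers/Microsoft.Insights/actionGroups/ag-data-team"
      }
    ]
  }
}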
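The interplay of `evaluationFrequency` (how often the rule is checked) and `windowSize` (how far back each check looks) is easier to see in a small simulation. The sketch below is a simplified model, not Azure Monitor's actual evaluation engine: it parses the ISO 8601 durations and reports each evaluation time at which the number of failures in the look-back window exceeds a threshold. The function names and the sample failure time are assumptions.

```python
import re
from datetime import datetime, timedelta

_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_duration(text):
    """Parse time-only ISO 8601 durations such as "PT5M" or "PT1H30M"."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def evaluate_alert(failure_times, start, end, frequency="PT5M", window="PT15M", threshold=0):
    """Return the evaluation times at which failures in the window exceed threshold."""
    step, span = parse_duration(frequency), parse_duration(window)
    fired = []
    now = start
    while now <= end:
        # Count failures that fall inside the look-back window (now - span, now].
        in_window = [t for t in failure_times if now - span < t <= now]
        if len(in_window) > threshold:
            fired.append(now)
        now += step
    return fired


# One made-up failure at 10:07, evaluated between 10:00 and 10:30.
failures = [datetime(2024, 1, 1, 10, 7)]
fired = evaluate_alert(failures, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30))
print([t.strftime("%H:%M") for t in fired])
```

In this model a single failure keeps the condition true for three consecutive evaluations, because a 15-minute window checked every 5 minutes overlaps itself. A window longer than the frequency therefore smooths over gaps, at the cost of the condition staying true for a while after the last failure.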

Pro Tip: Create custom Azure Monitor Workbooks for data engineering dashboards. Include metrics for pipeline runs, data volumes, query performance, and cost trends.

Interview Questions

Q1: What metrics should you monitor for a data engineering platform?
A: 1) Pipeline success/failure rates, 2) data volumes processed, 3) query performance (duration, resources), 4) storage utilization, 5) cost trends, 6) data freshness, 7) error rates and types.
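Of the metrics listed in Q1, data freshness is the one least likely to be available out of the box, so it usually has to be computed. A minimal sketch, assuming you can obtain the timestamp of the last successful load for a dataset; the function names and the two-hour limit are illustrative.

```python
from datetime import datetime, timedelta, timezone


def freshness_lag(last_loaded_at, now=None):
    """Time elapsed since the most recent successful load."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at


def is_stale(last_loaded_at, max_lag, now=None):
    """True when the dataset is older than the agreed freshness limit."""
    return freshness_lag(last_loaded_at, now) > max_lag


# Made-up example: last load at 06:00 UTC, checked at 09:30 UTC, 2-hour limit.
last_load = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)
check_time = datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc)
lag = freshness_lag(last_load, check_time)
print(lag, is_stale(last_load, timedelta(hours=2), check_time))
```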

Q2: How do you implement end-to-end monitoring for an ADF pipeline?
A: 1) Create a Log Analytics workspace, 2) enable diagnostic settings on the data factory so pipeline, activity, and trigger run logs are sent to that workspace, 3) build KQL queries for pipeline metrics, 4) create Azure Monitor Workbooks, 5) set up alert rules for failures, 6) implement custom logging in activities.
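For the custom logging step in Q2, one common approach is to emit structured (JSON) log lines from code that runs inside an activity, so they can later be parsed and queried by field. The sketch below uses only the Python standard library. The field names (`pipeline`, `run_id`, `rows_written`) are hypothetical, and how the lines reach Log Analytics is left out because it depends on the compute running the activity.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # Custom fields passed via `extra=` become attributes on the record.
            "pipeline": getattr(record, "pipeline", None),
            "run_id": getattr(record, "run_id", None),
            "rows_written": getattr(record, "rows_written", None),
        }
        return json.dumps(payload)


def make_logger(name="pipeline"):
    """Return a logger that writes JSON lines to the standard error stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeat calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


logger = make_logger()
logger.info(
    "copy finished",
    extra={"pipeline": "daily_sales", "run_id": "run-001", "rows_written": 1250},
)
```

Logging a run identifier on every line is what lets custom records be joined back to the platform's own pipeline-run logs.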

Q3: What is the difference between metrics and logs in Azure Monitor?
A: Metrics are numerical time-series values sampled at regular intervals (CPU, memory, throughput). Logs are detailed event records with varying structure (pipeline runs, errors, query text) that are queried with KQL. Use metrics for near-real-time alerting and dashboards; use logs for debugging and deeper analysis.
