Azure Monitor: Metrics, Logs, Workbooks & Alerting
Enterprise monitoring with Azure Monitor metrics, logs, workbooks, and alerting for data engineering
Azure Monitor Architecture
┌──────────────────────────────────────────────────────────────────────┐
│                      AZURE MONITOR ARCHITECTURE                      │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  DATA COLLECTION        DATA ANALYSIS         ACTION                 │
│  ┌──────────────┐       ┌──────────────┐      ┌──────────────┐       │
│  │ Platform     │──────>│ Log Analytics│─────>│ Azure        │       │
│  │ Metrics      │       │ Workspace    │      │ Monitor      │       │
│  └──────────────┘       │              │      │ Alerts       │       │
│                         │ KQL Queries  │      │              │       │
│  ┌──────────────┐       │ Dashboards   │      │ Metric       │       │
│  │ Diagnostic   │──────>│              │      │ Alerts       │       │
│  │ Logs         │       │ Workbooks    │      │              │       │
│  └──────────────┘       │              │      │ Activity     │       │
│                         └──────────────┘      │ Log Alerts   │       │
│  ┌──────────────┐                             │              │       │
│  │ Custom       │──────>┌──────────────┐      └──────────────┘       │
│  │ Metrics      │       │ Application  │                             │
│  └──────────────┘       │ Insights     │      ┌──────────────┐       │
│                         └──────────────┘      │ Logic Apps   │       │
│                                               │ (Remediation)│       │
│                                               └──────────────┘       │
└──────────────────────────────────────────────────────────────────────┘
KQL Queries for Data Engineering
// Pipeline success rate over time
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
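| summarize
    Total = count(),
    Success = countif(Status_s == "Succeeded"),
    Failed = countif(Status_s == "Failed")
    by bin(TimeGenerated, 1h)
// Multiply by 100.0 first: Success and Total are integers, so Success / Total would truncate to 0 or 1
| extend SuccessRate = 100.0 * Success / Total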
| render timechart
// Synapse query performance
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.SYNAPSE"
| where Category == "SQLRequest"
| summarize AvgDuration = avg(DurationMs_d) by bin(TimeGenerated, 1h)
| render timechart
// ADLS Gen2 storage trends
AzureMetrics
| where ResourceProvider == "MICROSOFT.STORAGE"
| where MetricName == "UsedCapacity"
| summarize StorageGB = avg(Average) / 1024 / 1024 / 1024
    by bin(TimeGenerated, 1d)
| render timechart
// Alert on pipeline failures
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
| where Status_s == "Failed"
| where TimeGenerated > ago(15m)
| count
| where Count > 5
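The alert query above fires on the total number of failures across the factory. As a hedged variant, the sketch below groups failures by pipeline so the alert can name the pipeline that is misbehaving. It assumes the same AzureDiagnostics columns this guide uses elsewhere (Status_s, PipelineName_s); KQL column names are case-sensitive and can differ between workspaces, so it is worth confirming them with `AzureDiagnostics | getschema` first. The threshold of 5 is carried over from the query above and now applies per pipeline rather than to the total.

```kusto
// Failed runs per pipeline in the last 15 minutes.
// Assumption: Status_s and PipelineName_s exist with this exact casing in your workspace.
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.DATAFACTORY"
| where Category == "PipelineRuns"
| where Status_s == "Failed"
| where TimeGenerated > ago(15m)
| summarize FailedRuns = count() by PipelineName_s
| where FailedRuns > 5   // illustrative threshold, tune per pipeline
| order by FailedRuns desc
```

Grouping by pipeline name returns one row per offending pipeline, which is usually more actionable in a notification than a single aggregate count.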
Workbook Template
{
  "category": "workbook",
  "type": "workbook",
  "serializedData": {
    "version": "Notebook/1.0",
    "items": [
      {
        "type": "metric",
        "name": "ADF Pipeline Runs",
        "metric": "Microsoft.DataFactory/factories/PipelineRuns",
        "aggregationType": 4,
        "timeGrain": "PT1H"
      },
      {
        "type": "query",
        "name": "Failed Pipelines",
        "query": "AzureDiagnostics | where ResourceProvider == 'MICROSOFT.DATAFACTORY' | where Status_s == 'Failed' | summarize count() by PipelineName_s"
      }
    ]
  }
}
Pro Tip: Create custom Azure Monitor Workbooks for data engineering dashboards. Include pipeline metrics, storage utilization, query performance, and cost trends in a single view.
Interview Questions
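Q1: How do you set up end-to-end monitoring for a data pipeline? A: 1) Enable diagnostic settings on each resource in the pipeline, 2) Send the logs and metrics to a Log Analytics workspace, 3) Write KQL queries for success rate, duration, and failures, 4) Build Workbooks on top of those queries, 5) Set up alerts on the key queries and metrics, 6) Add custom logging inside pipeline activities.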
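After the first two steps, it helps to confirm that data is actually arriving before building queries and workbooks on top of it. The sketch below is one way to check, assuming the resources log to the shared AzureDiagnostics table; resources configured for resource-specific tables write elsewhere and will not appear in this result.

```kusto
// Which resource providers and log categories reported in the last hour, and when?
AzureDiagnostics
| where TimeGenerated > ago(1h)
| summarize Records = count(), LastRecord = max(TimeGenerated)
    by ResourceProvider, Category
| order by LastRecord desc
```

A provider or category missing from the output usually points to a diagnostic setting that is not enabled or is routed to a different workspace.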
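Q2: What are the best practices for alert configuration? A: 1) Define clear severity levels, 2) Set thresholds from observed baselines rather than guesses, 3) Use action groups for notifications, 4) Automate remediation where it is safe (for example, Logic Apps triggered through action groups), 5) Test alerts regularly, 6) Document the response procedure for each alert.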
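Q3: How do you optimize Log Analytics costs? A: 1) Collect only the diagnostic log categories you need, 2) Set retention per table to match how long the data is actually queried, 3) Use Basic Logs for high-volume data that is rarely queried, 4) Archive to storage for long-term retention, 5) Use commitment tiers or a dedicated cluster for high ingestion volumes.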
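Cost work usually starts with finding out which tables drive ingestion. A minimal sketch using the workspace's built-in Usage table follows; the 30-day window is an arbitrary choice for illustration.

```kusto
// Billable ingestion per table over the last 30 days.
// Usage.Quantity is reported in MB, so dividing by 1000 gives GB.
Usage
| where TimeGenerated > ago(30d)
| where IsBillable == true
| summarize BillableGB = sum(Quantity) / 1000 by DataType
| order by BillableGB desc
```

The tables at the top of this list are the first candidates for filtering at the diagnostic setting, shorter retention, or a move to Basic Logs.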