πŸŽ‰ 75% of content is free forever β€” Unlock Premium from $10/mo β†’
CW
Search courses…
πŸ’Ό Servicesℹ️ Aboutβœ‰οΈ ContactView Pricing Plansfrom $10

Data Quality Testing

🟒 Free Lesson

Advertisement

Data Quality Testing

Testing Architecture

Test Categories

Architecture Diagram
+-----------------------------------------------------------------------------+
|                     TEST CATEGORIES AND COVERAGE                            |
+-----------------------------------------------------------------------------+
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |                    STRUCTURAL TESTS                                  |   |
|  |                                                                     |    |
|  |  * Schema validation    * Column existence    * Data type checks   |     |
|  |  * Table existence      * View definitions    * Index verification |     |
|  +---------------------------------------------------------------------+    |
|                              |                                              |
|                              v                                              |
|  +---------------------------------------------------------------------+    |
|  |                    CONTENT TESTS                                    |    |
|  |                                                                     |    |
|  |  * Uniqueness checks    * Not-null validation * Accepted values   |      |
|  |  * Range validation     * Pattern matching    * Statistical tests |      |
|  +---------------------------------------------------------------------+    |
|                              |                                              |
|                              v                                              |
|  +---------------------------------------------------------------------+    |
|  |                    REFERENTIAL TESTS                                 |   |
|  |                                                                     |    |
|  |  * Foreign key validity * Orphan detection    * Relationship tests |     |
|  |  * Cross-table sync     * ID mapping          * Link validation    |     |
|  +---------------------------------------------------------------------+    |
|                              |                                              |
|                              v                                              |
|  +---------------------------------------------------------------------+    |
|  |                    FRESHNESS TESTS                                   |   |
|  |                                                                     |    |
|  |  * Source freshness     * Update frequency    * SLA monitoring     |     |
|  |  * Latency checks       * Delay detection     * Alert thresholds   |     |
|  +---------------------------------------------------------------------+    |
|                                                                             |
+-----------------------------------------------------------------------------+

Test Results Dashboard

Architecture Diagram
+-----------------------------------------------------------------------------+
|                     TEST RESULTS SUMMARY                                    |
+-----------------------------------------------------------------------------+
|                                                                             |
|  MODEL: fct_orders                                                          |
|  ===========================================================================|
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |  STRUCTURAL TESTS                                                  |     |
|  |  βœ“ Table exists                                                     |    |
|  |  βœ“ All columns present                                              |    |
|  |  βœ“ Data types correct                                               |    |
|  +---------------------------------------------------------------------+    |
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |  CONTENT TESTS                                                     |     |
|  |  βœ“ unique_order_id          (0 duplicates)                         |     |
|  |  βœ“ not_null_order_id        (0 nulls)                              |     |
|  |  βœ“ not_null_customer_id     (0 nulls)                              |     |
|  |  βœ“ valid_status             (all values valid)                     |     |
|  |  βœ“ positive_amount          (142 negative values)  FAILED         |      |
|  +---------------------------------------------------------------------+    |
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |  REFERENTIAL TESTS                                                 |     |
|  |  βœ“ fk_customer_id           (all valid)                            |     |
|  |  βœ“ fk_product_id            (all valid)                            |     |
|  +---------------------------------------------------------------------+    |
|                                                                             |
|  +---------------------------------------------------------------------+    |
|  |  FRESHNESS TESTS                                                   |     |
|  |  βœ“ source_freshness         (last update: 2 hours ago)             |     |
|  +---------------------------------------------------------------------+    |
|                                                                             |
|  SUMMARY: 12 passed, 1 failed, 0 errors                                     |
|  COVERAGE: 92%                                                              |
|                                                                             |
+-----------------------------------------------------------------------------+

Detailed Explanation

Data quality testing in dbt ensures that your transformed data meets business requirements and maintains integrity across the pipeline.


What are the Quality Dimensions?

DimensionDescription
AccuracyData correctly represents real-world entities
CompletenessAll required data is present
ConsistencyData is uniform across systems
TimelinessData is available when needed
UniquenessNo duplicate records exist
ValidityData conforms to defined formats and rules

What are the Test Types?

Structural Tests

Validate the structure of your data:

  • Table existence
  • Column presence
  • Data type correctness

Content Tests

Validate the content of your data:

  • Uniqueness constraints
  • Not-null requirements
  • Accepted value ranges
  • Pattern matching

Referential Tests

Validate relationships between tables:

  • Foreign key validity
  • Orphan detection
  • Cross-table consistency

Freshness Tests

Validate data timeliness:

  • Source freshness checks
  • Update frequency monitoring
  • SLA compliance

How do Test Execution work?

dbt tests work by:

  1. Compiling test SQL queries
  2. Executing against the target warehouse
  3. Checking for failing rows
  4. Reporting pass/fail results

What are Custom Test Patterns?

Custom tests enable domain-specific validations:

  • Business rule assertions
  • Statistical validations
  • Cross-system comparisons
  • Aggregation checks

Key Takeaway: Data quality testing covers six key dimensionsβ€”accuracy, completeness, consistency, timeliness, uniqueness, and validityβ€”ensuring data meets business requirements.

Code Examples

Schema Test Definitions

# models/staging/_sources.yml
version: 2

sources:
  - name: raw
    database: raw_data
    schema: public
    
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    
    tables:
      - name: orders
        columns:
          - name: id
            data_tests:
              - unique
              - not_null
          
          - name: customer_id
            data_tests:
              - not_null
              - relationships:
                  to: ref('stg_customers')
                  field: customer_id
          
          - name: status
            data_tests:
              - accepted_values:
                  values: ['pending', 'shipped', 'delivered', 'cancelled']
          
          - name: amount
            data_tests:
              - not_null

Custom Data Tests

-- tests/test_order_amount_positive.sql
-- Business rule: Order amounts must be positive

select
    order_id,
    amount,
    customer_id,
    order_date
from {{ ref('fct_orders') }}
where amount < 0

Advanced Custom Test

-- tests/test_monthly_revenue_anomaly.sql
-- Statistical test for revenue anomalies

with monthly_revenue as (
    select
        date_trunc('month', order_date) as month,
        sum(amount) as revenue
    from {{ ref('fct_orders') }}
    group by 1
),

revenue_stats as (
    select
        avg(revenue) as avg_revenue,
        stddev(revenue) as stddev_revenue
    from monthly_revenue
),

anomalies as (
    select
        mr.month,
        mr.revenue,
        rs.avg_revenue,
        rs.stddev_revenue,
        (mr.revenue - rs.avg_revenue) / nullif(rs.stddev_revenue, 0) as z_score
    from monthly_revenue mr
    cross join revenue_stats rs
)

select
    month,
    revenue,
    avg_revenue,
    z_score
from anomalies
where abs(z_score) > 3  -- Flag anomalies beyond 3 standard deviations

Generic Test Macro

-- macros/schema_tests/test_accepted_range.sql
{% test accepted_range(model, column_name, min_value, max_value) %}

with validation as (
    select
        {{ column_name }} as field_name,
        case
            when {{ column_name }} < {{ min_value }} then 'below_min'
            when {{ column_name }} > {{ max_value }} then 'above_max'
            else 'valid'
        end as validation_status
    from {{ model }}
    where {{ column_name }} is not null
)

select *
from validation
where validation_status != 'valid'

{% endtest %}

Test Configuration

# dbt_project.yml
name: 'my_project'
version: '1.0.0'

tests:
  my_project:
    +severity: error
    +store_failures: true
    +schema: test_results
    
    staging:
      +severity: warn
    
    marts:
      +severity: error
      +fail_calc: "count(*)"
      +warn_if: ">10"
      +error_if: ">100"

on-run-end:
  - "{{ log('Test execution completed', info=True) }}"

Freshness Configuration

# models/staging/_sources.yml
version: 2

sources:
  - name: shopify
    database: raw
    schema: shopify
    loader: fivetran
    loaded_at_field: _fivetran_synced
    
    freshness:
      warn_after: {count: 6, period: hour}
      error_after: {count: 24, period: hour}
    
    tables:
      - name: orders
        description: "Shopify orders"
        
        loaded_at_field: _fivetran_synced
        
        freshness:
          warn_after: {count: 1, period: hour}
          error_after: {count: 4, period: hour}

Performance Metrics

Test TypeExecution TimeResource UsageCoverage
uniqueLowLowHigh
not_nullLowLowHigh
accepted_valuesMediumLowMedium
relationshipsMediumMediumHigh
freshnessLowLowHigh
Custom SQLVariableVariableVariable

Best Practices

  1. Test all critical columns - uniqueness, not_null, relationships
  2. Use relationships tests - validate foreign key integrity
  3. Test business rules - custom assertions for domain logic
  4. Store test failures - for debugging and analysis
  5. Use severity levels - warn vs error based on impact
  6. Test freshness - ensure data is current
  7. Use packages - dbt_expectations for advanced tests
  8. Document tests - explain what each test validates
  9. Run tests in CI/CD - catch issues before production
  10. Monitor test results - track quality metrics over time

See Also

⭐

Premium Content

Data Quality Testing

Unlock this lesson and 900+ advanced tutorials with a Premium plan.

🎯End-to-end Projects
πŸ’ΌInterview Prep
πŸ“œCertificates
🀝Community Access

Already a member? Log in

Need Expert dbt Help?

Get personalized tutoring, project support, or professional consulting.

Advertisement