
Range and IQR — Measures of Spread Explained



The Simplest Measures of How Spread Out Your Data Is

Measures of spread tell us how scattered the data is. Range and IQR are the simplest measures — they use only specific order statistics.

  • Range — The difference between max and min; simple but brutally sensitive to outliers
  • IQR — The middle 50% of data; robust and reliable for skewed distributions
  • Outlier detection — The 1.5 times IQR rule flags suspicious values automatically
  • Box plot foundation — The IQR forms the box in every box plot you will ever make

Spread matters as much as center. Two datasets with the same mean can behave very differently.
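To make this concrete, here is a minimal sketch with two made-up samples (the names `tight` and `wide` are illustrative, not part of the lesson's data). Both have a mean of 50, yet their range and IQR differ by a factor of 20.

```python
import numpy as np

# Two hypothetical samples with the same mean but very different spread
tight = np.array([48, 49, 50, 51, 52])
wide = np.array([10, 30, 50, 70, 90])

for name, values in [("tight", tight), ("wide", wide)]:
    q1, q3 = np.percentile(values, [25, 75])
    print(f"{name}: mean={values.mean():.1f}, "
          f"range={values.max() - values.min()}, IQR={q3 - q1:.1f}")
```

Reporting only the mean would make these two samples look identical; either spread measure separates them immediately.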


What are Range and IQR?

Definition

Measures of spread (dispersion) quantify how scattered the data is. Range and IQR are the simplest: each is the difference between two order statistics, max and min for the range, Q3 and Q1 for the IQR.

Range

$$\text{Range} = x_{\max} - x_{\min}$$

Here,

  • $x_{\max}$ = Maximum value in the dataset
  • $x_{\min}$ = Minimum value in the dataset

Simple but highly sensitive to outliers — one extreme value changes it completely.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

np.random.seed(42)
data = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17])
data_with_outlier = np.append(data, 100)

print(f"Data: {sorted(data)}")
print(f"Range = {data.max()} - {data.min()} = {data.max() - data.min()}")
print(f"\nWith outlier (100 added):")
print(f"Range = {data_with_outlier.max()} - {data_with_outlier.min()} = {data_with_outlier.max() - data_with_outlier.min()}")
print("Range grew ninefold (10 to 90) due to one outlier!")

Interquartile Range (IQR)

$$\text{IQR} = Q_3 - Q_1$$

Here,

  • $Q_3$ = Third quartile (75th percentile)
  • $Q_1$ = First quartile (25th percentile)

The range of the middle 50% of the data. Robust to outliers.

# Computing quartiles and IQR
def five_number_summary(data):
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    
    print(f"Min:    {data.min():.2f}")
    print(f"Q1:     {q1:.2f}")
    print(f"Median: {q2:.2f}")
    print(f"Q3:     {q3:.2f}")
    print(f"Max:    {data.max():.2f}")
    print(f"IQR:    {iqr:.2f}")
    print(f"Lower fence (Q1 - 1.5×IQR): {lower_fence:.2f}")
    print(f"Upper fence (Q3 + 1.5×IQR): {upper_fence:.2f}")
    return q1, q2, q3, iqr

print("=== Normal data ===")
five_number_summary(data)
print("\n=== Data with outlier ===")
five_number_summary(data_with_outlier)
print("IQR moved only from 4.50 to 5.00 while the range jumped from 10 to 90: robust!")
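One caveat on the quartiles used above: there is no single agreed convention for computing them on small samples. `np.percentile` defaults to linear interpolation, and its `method` keyword selects alternatives. The sketch below assumes NumPy 1.22 or newer (where `method` is available) and compares a few conventions on the lesson's ten-point dataset; the dictionary name `iqr_by_method` is only illustrative.

```python
import numpy as np

data = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17])

# Assumes NumPy >= 1.22, where np.percentile accepts the `method` keyword
iqr_by_method = {}
for method in ["linear", "midpoint", "hazen", "weibull"]:
    q1, q3 = np.percentile(data, [25, 75], method=method)
    iqr_by_method[method] = q3 - q1
    print(f"{method:>8}: Q1={q1:.2f}, Q3={q3:.2f}, IQR={q3 - q1:.2f}")
```

The differences shrink as the sample grows, but on small datasets it is worth stating which convention was used when reporting an IQR or comparing results across tools.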

Visualizing Range and IQR

# Two datasets centered near 50 with very different range and IQR
np.random.seed(0)
dataset_a = np.random.uniform(0, 100, 200)  # Uniform: large IQR
dataset_b = np.random.normal(50, 10, 200)   # Normal: smaller IQR

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

for ax, data, label, color in zip(axes, 
                                    [dataset_a, dataset_b], 
                                    ['Uniform', 'Normal'], 
                                    ['steelblue', 'coral']):
    ax.hist(data, bins=30, color=color, edgecolor='black', alpha=0.7, density=True)
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    ax.axvline(data.min(), color='gray', linestyle=':', label=f'Min={data.min():.0f}')
    ax.axvline(q1, color='blue', linestyle='--', label=f'Q1={q1:.0f}')
    ax.axvline(q2, color='red', linestyle='-', linewidth=2, label=f'Median={q2:.0f}')
    ax.axvline(q3, color='blue', linestyle='--', label=f'Q3={q3:.0f}')
    ax.axvline(data.max(), color='gray', linestyle=':', label=f'Max={data.max():.0f}')
    # Shade the IQR (Q1 to Q3) across the full height of the axes
    ax.axvspan(q1, q3, alpha=0.2, color='yellow', label=f'IQR={q3-q1:.0f}')
    ax.set_title(f'{label} Distribution\nRange={data.max()-data.min():.0f}, IQR={q3-q1:.0f}')
    ax.legend(fontsize=7)

plt.tight_layout()
plt.savefig('range_iqr.png', dpi=150)
plt.show()
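Since the IQR also defines the box in a box plot, it can help to look at the numbers a box plot is drawn from. The sketch below uses `matplotlib.cbook.boxplot_stats`, the helper that computes these statistics, on the ten-point dataset with the outlier 100 appended; no figure is produced, only the values.

```python
import numpy as np
from matplotlib import cbook

data_with_outlier = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17, 100])

# boxplot_stats returns one dict of statistics per dataset
stats = cbook.boxplot_stats(data_with_outlier, whis=1.5)[0]

print(f"Box edges (Q1, Q3): {stats['q1']}, {stats['q3']}")
print(f"Whisker ends: {stats['whislo']}, {stats['whishi']}")
print(f"Points drawn as outliers: {stats['fliers']}")
```

With `whis=1.5`, the whiskers stop at the most extreme data points inside the 1.5×IQR fences, and anything beyond them is drawn as an individual point.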

Comparing Spread Measures

Measure   Formula                  Breakdown Point   Outlier Sensitivity
Range     Max - Min                0%                Very sensitive to outliers
IQR       Q3 - Q1                  25%               Robust
Std Dev   √(Σ(xᵢ-x̄)²/(n-1))        0%                Sensitive to outliers
MAD       Median(|xᵢ - Median|)    50%               Robust
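The comparison above can be checked numerically. The following sketch computes all four measures on the lesson's ten-point dataset before and after appending the outlier 100. The helper name `spread_measures` is hypothetical, and MAD here is the unscaled median absolute deviation.

```python
import numpy as np

def spread_measures(values):
    """Return range, IQR, sample standard deviation, and MAD for a 1-D array."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    median = np.median(values)
    return {
        "range": values.max() - values.min(),
        "iqr": q3 - q1,
        "std": values.std(ddof=1),                   # n - 1 in the denominator
        "mad": np.median(np.abs(values - median)),   # unscaled MAD
    }

clean = np.array([12, 15, 14, 10, 18, 20, 16, 11, 13, 17])
contaminated = np.append(clean, 100)

before = spread_measures(clean)
after = spread_measures(contaminated)
for name in before:
    print(f"{name:>5}: {before[name]:6.2f} -> {after[name]:6.2f}")
```

The two measures with a 0% breakdown point (range and standard deviation) change by a large factor, while IQR and MAD move only slightly.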

Range and IQR in Machine Learning

[Figure: Range and IQR are foundational for outlier detection and feature engineering in ML. Panels: Outlier Detection (1.5×IQR rule), Feature Selection (zero variance = useless), Normalization (Min-Max scaling), Box Plot (IQR = box width).]
ML Application          Range/IQR Usage                          Why
Outlier detection       IQR fence = Q1-1.5×IQR to Q3+1.5×IQR     Robust to skewed data
Feature selection       Zero/near-zero range → remove feature    No information content
Min-Max normalization   Scale to [0,1] using range               Neural networks need bounded inputs
Box plots               IQR defines the box                      Visual model diagnostics
Anomaly detection       IQR-based thresholds                     Production data monitoring

import numpy as np
from sklearn.preprocessing import MinMaxScaler

np.random.seed(42)

# IQR-based outlier detection
data = np.concatenate([np.random.normal(50, 10, 100), [200, -50]])
q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5*iqr, q3 + 1.5*iqr
outliers = data[(data < lower) | (data > upper)]
print(f"IQR: {iqr:.2f}, Fences: [{lower:.2f}, {upper:.2f}]")
print(f"Outliers detected: {len(outliers)} ({outliers})")

# Min-Max normalization using range
data_features = np.random.randn(100, 3) * [10, 1, 100]  # very different ranges
scaler = MinMaxScaler()
normalized = scaler.fit_transform(data_features)
print(f"\nOriginal ranges: {[f'{d.max()-d.min():.1f}' for d in data_features.T]}")
print(f"Normalized ranges: {[f'{d.max()-d.min():.3f}' for d in normalized.T]}")
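The feature-selection use of range can be sketched in the same way. The example below builds a hypothetical feature matrix `X` with one constant column and drops every column whose range does not exceed a small `tolerance`. Both the data and the threshold are illustrative assumptions, not values from the lesson.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix: column 1 is constant, so its range is zero
X = np.column_stack([
    rng.normal(0, 1, 50),      # informative feature
    np.full(50, 3.0),          # constant feature
    rng.uniform(0, 10, 50),    # informative feature
])

feature_range = X.max(axis=0) - X.min(axis=0)
tolerance = 1e-12  # illustrative threshold; raise it to catch near-constant features
keep = feature_range > tolerance
X_reduced = X[:, keep]

print(f"Ranges: {np.round(feature_range, 3)}")
print(f"Kept columns: {np.flatnonzero(keep)}")
print(f"Shape: {X.shape} -> {X_reduced.shape}")
```

Filtering before scaling also sidesteps a practical problem: the min-max formula divides by the range, which is zero for a constant column.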

Key Takeaways

Summary: Range and IQR

  1. Range is simple but unreliable when outliers are present: one extreme value can dominate it
  2. IQR is a simple, robust spread measure: it covers the middle 50% and ignores the extreme tails
  3. The 1.5×IQR rule for outlier detection is built into most box plot implementations
  4. For symmetric data without outliers, standard deviation is more informative than IQR
  5. For skewed data or data with outliers, report IQR instead of (or alongside) standard deviation
  6. IQR = 0 means at least 50% of data is identical — common in count data
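The last takeaway has a practical consequence for the 1.5×IQR rule: when IQR = 0, both fences collapse onto the same value, so every observation that differs from it is flagged. A minimal sketch with made-up count data (the array `counts` is an assumption for illustration):

```python
import numpy as np

# Hypothetical count data: most observations are zero
counts = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 3])

q1, q3 = np.percentile(counts, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
flagged = counts[(counts < lower) | (counts > upper)]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(f"Fences: [{lower}, {upper}]")
print(f"Flagged: {flagged}")
```

For data like this, the flagged values are probably legitimate counts rather than errors, so it is worth checking for IQR = 0 before applying the rule mechanically.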
