Percentiles and Quartiles — Calculation and Interpretation

Where Does Any Value Stand Relative to the Rest?

Percentiles tell you the relative standing of any value within a dataset. Quartiles are special percentiles that divide data into four equal parts.

  • Percentile rank — "You scored better than 85% of test takers"
  • Quartiles — Q1, Q2 (median), Q3 split data into four equal groups
  • Deciles — Ten equal groups for finer-grained comparison
  • Interpolation methods — Different calculators give slightly different answers; know why

Percentiles turn raw scores into meaningful rankings. They are the language of standardized testing and performance evaluation.


What are Percentiles and Quartiles?

Definition

The pth percentile is the value below which p% of observations fall. Quartiles (Q1=25th, Q2=50th, Q3=75th) are special cases.

Percentile Rank

$$\text{Percentile Rank} = \frac{\text{Number of values below } x}{n} \times 100$$

Here,

  • x = The value being ranked
  • n = Total number of observations
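
The formula above can be sketched directly in NumPy. This is a minimal illustration, not a library routine: the helper name percentile_rank and its strict flag are hypothetical. The strict variant counts values below x, as in the formula; the non-strict variant counts values at or below x, which corresponds to the kind='weak' option of SciPy's percentileofscore.

```python
import numpy as np

def percentile_rank(values, x, strict=True):
    """Percentage of observations below x (strict) or at or below x (non-strict).

    Hypothetical helper written for illustration only.
    """
    values = np.asarray(values)
    count = np.sum(values < x) if strict else np.sum(values <= x)
    return count / values.size * 100

scores = np.array([15, 20, 35, 40, 50, 12, 27, 45, 38, 22, 18, 55, 30, 42, 25])

strict_rank = percentile_rank(scores, 40)              # counts values < 40
weak_rank = percentile_rank(scores, 40, strict=False)  # counts values <= 40

print(f"Strictly below 40: {strict_rank:.1f}%")
print(f"At or below 40:    {weak_rank:.1f}%")
```

The two definitions differ whenever the value being ranked appears in the data, so it is worth stating which one a reported percentile rank uses.
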
import numpy as np
from scipy import stats
import pandas as pd

# Small sample dataset of 15 observations
data = np.array([15, 20, 35, 40, 50, 12, 27, 45, 38, 22, 18, 55, 30, 42, 25])
sorted_d = np.sort(data)
print(f"Sorted: {sorted_d}")

# np.percentile sorts internally, so the unsorted array can be passed directly
for p in [10, 25, 50, 75, 90]:
    print(f"P{p:2d}: {np.percentile(data, p):.2f}")

NumPy Interpolation Methods

d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# The keyword is `method` in NumPy 1.22+ (it was called `interpolation` in older releases)
for method in ['linear', 'lower', 'higher', 'midpoint', 'nearest']:
    val = np.percentile(d, 50, method=method)
    print(f"  method='{method}': {val}")

Interpolation Methods

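NumPy supports multiple interpolation methods for percentiles, selected with the method argument (called interpolation before NumPy 1.22): linear (default), lower, higher, midpoint, and nearest. They differ only when the requested percentile falls between two observations: linear interpolates between the two neighbors, lower and higher pick one of them, midpoint averages them, and nearest takes the closer one. The default 'linear' method is appropriate for most use cases.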
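
To make the default method concrete, here is a hand-written sketch of linear interpolation, compared against np.percentile. The function name percentile_linear is hypothetical, and the sketch assumes the usual definition in which the p-th percentile sits at fractional index (n - 1) * p / 100 of the sorted data.

```python
import numpy as np

def percentile_linear(values, p):
    """Linear-interpolation percentile, written out step by step (illustrative only)."""
    s = np.sort(np.asarray(values, dtype=float))
    pos = (s.size - 1) * p / 100   # fractional index into the sorted data
    lo = int(np.floor(pos))        # index of the neighbor at or below pos
    hi = min(lo + 1, s.size - 1)   # index of the neighbor above (capped at the last element)
    frac = pos - lo                # how far pos lies between the two neighbors
    return s[lo] + frac * (s[hi] - s[lo])

d = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
for p in [25, 50, 90]:
    by_hand = percentile_linear(d, p)
    by_numpy = np.percentile(d, p)
    print(f"P{p}: by hand = {by_hand:.2f}, np.percentile = {by_numpy:.2f}")
```

For the median of these ten values the fractional index is 4.5, which falls halfway between 5 and 6. That is why 'lower' and 'higher' return the two neighbors while 'linear' and 'midpoint' both land between them.
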


Five-Number Summary

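def five_num(data, label=''):
    """Print the five-number summary, the IQR, and the 1.5*IQR outlier fences."""
    q1, q2, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1  # spread of the middle 50% of the data
    # Values outside these fences are conventionally flagged as potential outliers
    lower, upper = q1 - 1.5*iqr, q3 + 1.5*iqr
    if label:
        print(f"\n=== {label} ===")
    print(f"Min: {data.min():.2f}  Q1: {q1:.2f}  Median: {q2:.2f}  Q3: {q3:.2f}  Max: {data.max():.2f}")
    print(f"IQR: {iqr:.2f}  Fences: [{lower:.2f}, {upper:.2f}]")

np.random.seed(42)
# Simulated exam scores: normal(mean=75, sd=12), clipped to the valid 0-100 range
exam = np.random.normal(75, 12, 200).clip(0, 100)
five_num(exam, "Exam Scores")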

| Quartile | Percentile | Description |
|----------|------------|-------------|
| Q1 | 25th | Lower quartile: 25% of data falls below this |
| Q2 | 50th | Median: middle value of the dataset |
| Q3 | 75th | Upper quartile: 75% of data falls below this |
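
The fences computed from Q1 and Q3 are typically used to flag potential outliers. The sketch below applies the 1.5 × IQR rule to a small made-up array (the values are hypothetical and chosen so that one point is clearly extreme); it is one common convention rather than the only way to define an outlier.

```python
import numpy as np

# Hypothetical data: 14 moderate values plus one extreme value
values = np.array([12, 15, 18, 20, 22, 25, 27, 30, 35, 38, 40, 42, 45, 50, 120])

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Anything outside [lower, upper] is flagged as a potential outlier
outliers = values[(values < lower) | (values > upper)]

print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
print(f"Fences: [{lower}, {upper}]")
print(f"Flagged: {outliers}")
```

Because the fences are built from quartiles rather than the mean and standard deviation, a single extreme value has little effect on where they fall.
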

Percentile Rank

score = 88
rank = stats.percentileofscore(exam, score, kind='weak')
print(f"Score of {score} is at the {rank:.1f}th percentile")
print(f"{rank:.1f}% of students scored at or below {score}")

Deciles

deciles = np.percentile(exam, range(10, 100, 10))
for i, val in enumerate(deciles, 1):
    print(f"D{i} ({i*10}th pct): {val:.1f}")

Percentiles in Machine Learning

| ML Application | Percentile Usage | Why |
|----------------|------------------|-----|
| Quantile regression | Predict percentiles, not mean | Robust to skewed targets |
| Feature binning | Cut into quantile bins | Discretize continuous features |
| Performance metrics | P95 latency, P99 response time | SLA monitoring |
| Data preprocessing | Clip outliers at percentiles | Robust scaling |

import numpy as np
from sklearn.preprocessing import QuantileTransformer

np.random.seed(42)

# Quantile binning for feature engineering
data = np.random.lognormal(3, 1, 1000)
bins = np.percentile(data, [0, 25, 50, 75, 100])
binned = np.digitize(data, bins[1:-1])
print(f"Quantile bins: {bins.round(1)}")
print(f"Binned values: {np.bincount(binned)}")

# QuantileTransformer for normality
qt = QuantileTransformer(n_quantiles=100, output_distribution='normal')
transformed = qt.fit_transform(data.reshape(-1,1)).flatten()
print(f"\nOriginal skewness: {float(np.mean(((data-data.mean())/data.std())**3)):.3f}")
print(f"Transformed skewness: {float(np.mean(((transformed-transformed.mean())/transformed.std())**3)):.3f}")
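
Two other rows of the table, tail-latency metrics and percentile clipping, can be sketched in a few lines. The latency values here are synthetic (drawn from a lognormal distribution purely for illustration), and the 1st/99th percentile clipping bounds are an assumed choice, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic, right-skewed "latency" sample in milliseconds (illustrative only)
latency_ms = rng.lognormal(mean=4.0, sigma=0.6, size=5000)

# Tail metrics: P95 and P99 describe the slowest requests better than the mean does
p50, p95, p99 = np.percentile(latency_ms, [50, 95, 99])
print(f"P50: {p50:.1f} ms  P95: {p95:.1f} ms  P99: {p99:.1f} ms")

# Percentile clipping: cap values outside the 1st-99th percentile range
lo, hi = np.percentile(latency_ms, [1, 99])
clipped = np.clip(latency_ms, lo, hi)
print(f"Max before clipping: {latency_ms.max():.1f}  after: {clipped.max():.1f}")
```

Clipping keeps every observation but limits how far the extremes can pull downstream statistics, which is why it often precedes scaling.
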

Key Takeaways

Summary: Percentiles and Quartiles

  • P50 = median — percentiles generalize the median to any fraction
  • Quartiles divide data into 4 equal-frequency groups (not equal-width intervals)
  • IQR = Q3 − Q1 covers the middle 50% and drives outlier fences
  • Percentile rank answers "where does this value fall in the distribution?"
  • NumPy's default method='linear' is appropriate for most cases
  • Percentiles are non-parametric — no distributional assumptions needed
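
The distinction between equal-frequency and equal-width groups in the list above can be checked with a short sketch. It uses a synthetic skewed sample (an assumption for illustration) and counts how many observations land in quartile-based bins versus four bins of equal width.

```python
import numpy as np

rng = np.random.default_rng(42)
sample = rng.lognormal(mean=3.0, sigma=1.0, size=1000)  # synthetic, right-skewed

# Equal-frequency bins: edges at the quartiles, so each bin holds about 25% of the data
quartile_edges = np.percentile(sample, [0, 25, 50, 75, 100])
counts_freq, _ = np.histogram(sample, bins=quartile_edges)

# Equal-width bins: the range is split into 4 intervals of the same width
counts_width, width_edges = np.histogram(sample, bins=4)

print(f"Quartile bin counts:    {counts_freq}")
print(f"Equal-width bin counts: {counts_width}")
```

With skewed data the equal-width bins put most observations in the first interval, while the quartile bins stay balanced by construction.
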
