
Levels of Measurement — Nominal, Ordinal, Interval, Ratio


Why the Scale You Choose Changes Everything

In 1946, psychologist Stanley Stevens proposed four levels of measurement that determine which statistics are meaningful for your data. The level of measurement is not a technicality; it is a mathematical constraint. Taking the mean of ordinal codes or the ratio of two interval values is not just risky: the resulting number has no interpretable meaning.

Here is what mastering levels of measurement helps you do:

  • Choose Valid Statistics — Know exactly which central tendency, spread, and correlation measures are mathematically appropriate for each scale.
  • Select the Right Test — Match your data to the correct hypothesis test: chi-square for nominal, Mann-Whitney for ordinal, t-test for interval/ratio.
  • Avoid Invalid Conclusions — Stop computing averages on categories, ratios on temperatures, and standard deviations on zip codes.
  • Communicate Precisely — Describe your variables with the exact terminology statisticians expect.

The level of measurement is not a label you attach after analysis — it is a property of the data itself that constrains everything you do.


Levels of Measurement

Definition


In 1946, psychologist Stanley Stevens proposed a taxonomy of four levels of measurement that has since become foundational in statistics. The level determines which statistical operations are mathematically valid.


The Four Levels

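  • Nominal: identity only. Examples: color, gender. Statistics: mode, chi-square.
  • Ordinal: identity + order. Examples: rating, rank. Statistics: median, Spearman.
  • Interval: + equal intervals. Examples: temperature (°C), IQ. Statistics: mean, Pearson r.
  • Ratio: + true zero. Examples: height, weight. Statistics: all statistics.

Each level inherits all properties of the levels below it.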
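
The inheritance rule can be sketched as a small lookup. This is only an illustration, not a standard API: the names `LEVELS`, `PROPERTIES`, and `properties_of` are hypothetical.

```python
# Hypothetical helper: each level adds exactly one property to the levels below it.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]
PROPERTIES = ["identity", "order", "equal intervals", "true zero"]

def properties_of(level):
    """Return the properties a level has, including everything it inherits."""
    index = LEVELS.index(level)  # raises ValueError for an unknown level
    return PROPERTIES[: index + 1]

for level in LEVELS:
    print(f"{level:<9}", ", ".join(properties_of(level)))
```

Because the properties are cumulative, a statistic that is valid at one level stays valid at every level above it.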

1. Nominal Scale

The weakest level. Data is placed into named categories with no meaningful order or distance between them.

Properties:

  • Identity: each value belongs to a distinct category
  • No order, no distance, no meaningful zero

Examples: Gender, blood type, nationality, color, product ID, political party

Valid statistics: Frequency, mode, chi-square test
Invalid: Mean, median, standard deviation

Blood Type Distribution (Nominal Data)
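import pandas as pd
from scipy.stats import chi2_contingency

# Nominal: blood type distribution
blood_types = pd.Series(['A', 'O', 'B', 'AB', 'O', 'A', 'O', 'A', 'B', 'O'])
print("Mode:", blood_types.mode()[0])
print(blood_types.value_counts())

# Chi-square test of independence (nominal vs nominal)
# 'donor' is a hypothetical second nominal variable, added only for illustration
donor = pd.Series(['yes', 'no', 'yes', 'no', 'no', 'yes', 'yes', 'no', 'no', 'yes'])
table = pd.crosstab(blood_types, donor)
chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p = {p:.3f}, dof = {dof}")
# With only 10 observations the expected counts are too small to trust the p-value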
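
A short sketch of why the mean is invalid for nominal data: numeric codes for categories are arbitrary, so the mean changes whenever the labels are renumbered, while the mode does not. Both codings below are hypothetical.

```python
import pandas as pd

blood_types = pd.Series(['A', 'O', 'B', 'AB', 'O', 'A', 'O', 'A', 'B', 'O'])

# Two equally arbitrary numeric codings of the same categories (both hypothetical)
coding_1 = {'A': 0, 'B': 1, 'AB': 2, 'O': 3}
coding_2 = {'O': 0, 'AB': 1, 'A': 2, 'B': 3}

mean_1 = blood_types.map(coding_1).mean()
mean_2 = blood_types.map(coding_2).mean()
print(f"Mean under coding 1: {mean_1:.2f}")
print(f"Mean under coding 2: {mean_2:.2f}")  # differs: the mean depends on the labels

# The mode, mapped back to its label, is the same under either coding
inverse_1 = {code: label for label, code in coding_1.items()}
inverse_2 = {code: label for label, code in coding_2.items()}
mode_1 = inverse_1[blood_types.map(coding_1).mode()[0]]
mode_2 = inverse_2[blood_types.map(coding_2).mode()[0]]
print("Mode under either coding:", mode_1, mode_2)
```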

2. Ordinal Scale

Categories have a meaningful order, but the intervals between categories are unknown or unequal.

Properties:

  • Identity + Order
  • No distance, no meaningful zero

Examples:

  • Survey Likert scales (Strongly Disagree -> Strongly Agree)
  • Education level (High School < Bachelor's < Master's < PhD)
  • Race finishing position (1st, 2nd, 3rd)
  • Socioeconomic status (Low, Middle, High)

Valid statistics: Median, IQR, percentiles, Spearman rank correlation, Mann-Whitney test
Invalid: Arithmetic mean (debated), standard deviation, Pearson r

Survey Response Distribution (Ordinal Data)
import numpy as np
from scipy.stats import spearmanr

# Ordinal: finishing positions of two teams across the same four races
team_a = [1, 3, 5, 7]   # team A's position in races 1-4
team_b = [2, 4, 6, 8]   # team B's position in races 1-4

# Spearman correlation (rank-based, appropriate for paired ordinal data)
rho, p = spearmanr(team_a, team_b)
print(f"Spearman ρ = {rho:.3f}, p = {p:.4f}")

# Median is appropriate for ordinal
satisfaction = [3, 4, 2, 5, 4, 3, 4, 5, 2, 4]  # 1–5 scale
print(f"Median satisfaction: {np.median(satisfaction)}")
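
One way to see why Pearson r is listed as invalid for ordinal data: an order-preserving recode changes the spacing between categories but not their ranking. Spearman depends only on ranks, so it is unaffected, while Pearson is not. The ratings below are hypothetical.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

# Hypothetical paired ordinal ratings from the same six respondents
x = np.array([1, 2, 2, 3, 4, 5])
y = np.array([1, 1, 3, 2, 4, 5])

# An order-preserving recode of y: same ranking, different spacing
y_recoded = y ** 3

rho_before, _ = spearmanr(x, y)
rho_after, _ = spearmanr(x, y_recoded)
r_before, _ = pearsonr(x, y)
r_after, _ = pearsonr(x, y_recoded)

print(f"Spearman: {rho_before:.3f} -> {rho_after:.3f}  (unchanged)")
print(f"Pearson:  {r_before:.3f} -> {r_after:.3f}  (changes with the spacing)")
```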

3. Interval Scale

Equal intervals between values, but no true zero — zero is arbitrary, not the absence of the quantity.

Properties:

  • Identity + Order + Equal Intervals
  • No true zero (ratios meaningless)

Examples:

  • Temperature in Celsius or Fahrenheit (0°C ≠ "no temperature")
  • IQ scores (IQ 0 doesn't mean no intelligence)
  • Calendar years (Year 0 is arbitrary)
  • Likert scales (when treated as interval — common in practice)

Valid statistics: Mean, standard deviation, Pearson r, t-tests, ANOVA
Invalid: Ratios ("twice as hot" is not meaningful in Celsius)

IQ Score Distribution (Interval Data)
# Temperature conversion — shows why ratios fail for interval data
celsius_a = 20
celsius_b = 40

# It is NOT true that 40°C is "twice as hot" as 20°C
# Convert to Kelvin (ratio scale) to see why:
kelvin_a = celsius_a + 273.15  # 293.15 K
kelvin_b = celsius_b + 273.15  # 313.15 K

ratio_celsius = celsius_b / celsius_a        # 2.0 — misleading!
ratio_kelvin  = kelvin_b / kelvin_a          # 1.068 — true ratio

print(f"Celsius ratio: {ratio_celsius:.3f}  <- NOT meaningful")
print(f"Kelvin ratio:  {ratio_kelvin:.3f}  <- Meaningful thermodynamic ratio")
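
Ratios of values fail on an interval scale, but ratios of differences do not, because equal intervals survive a change of unit. A minimal sketch with hypothetical readings; `c_to_f` is a helper defined here, not a library function.

```python
def c_to_f(celsius):
    """Convert Celsius to Fahrenheit (a rescaling plus a shift)."""
    return celsius * 9 / 5 + 32

# Hypothetical readings in Celsius
morning, noon, afternoon = 10.0, 20.0, 40.0

# Ratio of values: changes with the unit, so it carries no meaning
ratio_c = afternoon / noon
ratio_f = c_to_f(afternoon) / c_to_f(noon)

# Ratio of differences: identical in both units
diff_ratio_c = (afternoon - noon) / (noon - morning)
diff_ratio_f = (c_to_f(afternoon) - c_to_f(noon)) / (c_to_f(noon) - c_to_f(morning))

print(f"Value ratio:      {ratio_c:.3f} (°C) vs {ratio_f:.3f} (°F)")
print(f"Difference ratio: {diff_ratio_c:.3f} (°C) vs {diff_ratio_f:.3f} (°F)")
```

This is why "the afternoon warmed twice as much as the morning" is a meaningful statement in Celsius, while "twice as hot" is not.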

4. Ratio Scale

The strongest level. Has all properties of interval scale plus a true absolute zero (zero means absence of the attribute).

Properties:

  • Identity + Order + Equal Intervals + True Zero

Examples:

  • Height, weight, length (0 kg = no mass)
  • Age, time duration
  • Income (0 = no income)
  • Temperature in Kelvin
  • Number of items (count data)

Valid statistics: All statistics including geometric mean, coefficient of variation, and ratio comparisons.

Height Distribution (Ratio Data)
import numpy as np

heights_m = np.array([1.65, 1.72, 1.80, 1.58, 1.90])

print(f"Mean: {np.mean(heights_m):.3f} m")
print(f"Ratio (tallest/shortest): {heights_m.max()/heights_m.min():.3f}")
print(f"Geometric mean: {np.exp(np.log(heights_m).mean()):.3f} m")
print(f"CV (coeff of variation): {(np.std(heights_m)/np.mean(heights_m)*100):.1f}%")
# All valid because height is ratio scale
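
The coefficient of variation shows what the true zero buys you. On a ratio scale it is unchanged by a change of unit; on an interval scale it is not, because the unit conversion also shifts the zero point. The sketch below assumes the sample standard deviation (`ddof=1`), and the temperature readings and the `cv` helper are hypothetical.

```python
import numpy as np

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    values = np.asarray(values, dtype=float)
    return np.std(values, ddof=1) / np.mean(values)

heights_m = np.array([1.65, 1.72, 1.80, 1.58, 1.90])
heights_cm = heights_m * 100                        # ratio scale: pure rescaling

temps_c = np.array([15.0, 18.0, 22.0, 25.0, 30.0])  # hypothetical readings
temps_f = temps_c * 9 / 5 + 32                      # interval scale: rescale + shift

print(f"CV of height:      {cv(heights_m):.4f} (m)  vs {cv(heights_cm):.4f} (cm)")
print(f"CV of temperature: {cv(temps_c):.4f} (°C) vs {cv(temps_f):.4f} (°F)")
```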

Summary Table

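| Level    | Order | Equal Intervals | True Zero | Example             | Appropriate Average     |
|----------|-------|-----------------|-----------|---------------------|-------------------------|
| Nominal  | No    | No              | No        | Eye color           | Mode                    |
| Ordinal  | Yes   | No              | No        | Satisfaction rating | Median                  |
| Interval | Yes   | Yes             | No        | Temperature (°C)    | Arithmetic mean         |
| Ratio    | Yes   | Yes             | Yes       | Height, weight      | Geometric mean possible |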

Choosing the Right Statistical Test

def suggest_test(level_of_measurement, n_groups, paired=False):
    """Suggest an appropriate statistical test based on measurement level."""
    if level_of_measurement == 'nominal':
        return "Chi-square test (categories) or Fisher's exact test (small samples)"
    elif level_of_measurement == 'ordinal':
        if n_groups == 1:
            return "Sign test (one sample)"
        elif n_groups == 2:
            return "Wilcoxon signed-rank" if paired else "Mann-Whitney U"
        else:
            return "Friedman" if paired else "Kruskal-Wallis"
    elif level_of_measurement in ('interval', 'ratio'):
        if n_groups == 1:
            return "One-sample t-test"
        elif n_groups == 2:
            return "Paired t-test" if paired else "Independent t-test"
        else:
            return "Repeated measures ANOVA" if paired else "One-way ANOVA"
    # Unknown level: fail loudly instead of silently returning None
    raise ValueError(f"Unknown level of measurement: {level_of_measurement!r}")

# Examples
print(suggest_test('nominal', 2))
print(suggest_test('ordinal', 2))
print(suggest_test('ordinal', 2, paired=True))
print(suggest_test('ratio', 2, paired=False))
print(suggest_test('ratio', 3))
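
The two-group case can be taken one step further by actually running the test with SciPy. The helper `run_two_group_test` and all data below are hypothetical. For interval and ratio data, this sketch swaps in Welch's t-test (`equal_var=False`) rather than the classic independent t-test, since it does not assume equal variances.

```python
from scipy.stats import mannwhitneyu, ttest_ind

def run_two_group_test(level_of_measurement, group_a, group_b):
    """Run a two-sided test for two independent groups (hypothetical helper)."""
    if level_of_measurement == 'ordinal':
        result = mannwhitneyu(group_a, group_b, alternative='two-sided')
        return "Mann-Whitney U", result.statistic, result.pvalue
    if level_of_measurement in ('interval', 'ratio'):
        # equal_var=False selects Welch's t-test
        result = ttest_ind(group_a, group_b, equal_var=False)
        return "Welch's t-test", result.statistic, result.pvalue
    raise ValueError(f"No two-group test sketched for level: {level_of_measurement!r}")

# Hypothetical data
ratings_a = [3, 4, 2, 5, 4, 3]          # 1-5 satisfaction ratings (ordinal)
ratings_b = [2, 3, 1, 3, 2, 4]
heights_a = [1.65, 1.72, 1.80, 1.58]    # metres (ratio)
heights_b = [1.70, 1.78, 1.85, 1.91]

for level, a, b in [('ordinal', ratings_a, ratings_b), ('ratio', heights_a, heights_b)]:
    name, stat, p = run_two_group_test(level, a, b)
    print(f"{level}: {name}, statistic = {stat:.3f}, p = {p:.3f}")
```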

Measurement Levels in Machine Learning & LLMs

Typical encodings by level: Nominal → One-Hot / Embedding; Ordinal → Label / OrdinalEnc; Interval → StandardScaler; Ratio → StandardScaler / MinMax. LLMs treat text as categorical (token IDs): each token is nominal data.

| Level    | ML Encoding         | LLM/Deep Learning   | Example                |
|----------|---------------------|---------------------|------------------------|
| Nominal  | One-Hot, Label      | Token embedding     | Category: [1,0,0,0]    |
| Ordinal  | Ordinal encoding    | Learned embedding   | Rating: 1→0.2, 5→1.0   |
| Interval | StandardScaler      | Batch normalization | Temperature: (x-μ)/σ   |
| Ratio    | StandardScaler, Log | Log transform       | Income: log(x+1)       |
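
The encodings in the table can be sketched with pandas and NumPy alone. The toy dataset and column names are hypothetical, and the manual standardization stands in for what a scaler such as StandardScaler computes: (x - mean) / std with the population standard deviation.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset with one column per measurement level
df = pd.DataFrame({
    "color": ["red", "green", "blue", "green"],       # nominal
    "rating": ["low", "high", "medium", "low"],       # ordinal
    "temp_c": [18.0, 22.0, 25.0, 15.0],               # interval
    "income": [0.0, 32000.0, 58000.0, 120000.0],      # ratio
})

# Nominal: one-hot encoding, no order implied
one_hot = pd.get_dummies(df["color"], prefix="color")

# Ordinal: integer codes that preserve the stated order
rating_order = {"low": 0, "medium": 1, "high": 2}
rating_encoded = df["rating"].map(rating_order)

# Interval: standardize to zero mean and unit variance
temp_scaled = (df["temp_c"] - df["temp_c"].mean()) / df["temp_c"].std(ddof=0)

# Ratio: log(x + 1) compresses a long right tail and maps zero to zero
income_log = np.log1p(df["income"])

print(one_hot)
print(rating_encoded.tolist())
print(temp_scaled.round(3).tolist())
print(income_log.round(3).tolist())
```

Note that the integer codes for `rating` still impose equal spacing between categories, which an ordinal scale does not guarantee; tree-based models only use the order, but linear models will take the spacing literally.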

Example — LLM Tokenization (Nominal Data):

# LLMs treat every token as a nominal category
# Each token gets an embedding vector (learned representation)

# Simple example of token embedding
import numpy as np

# Vocabulary: each word is a nominal category
vocab = {"the": 0, "cat": 1, "sat": 2, "on": 3, "mat": 4}

# Embedding layer maps nominal IDs to dense vectors
# (In practice, learned during training)
embedding_dim = 4
np.random.seed(42)
embeddings = np.random.randn(5, embedding_dim)  # 5 tokens × 4 dimensions

# "cat" has token ID 1
cat_embedding = embeddings[vocab["cat"]]
print(f"Token 'cat' (ID={vocab['cat']}):")
print(f"Embedding vector: {cat_embedding.round(3)}")
print(f"Vector dimension: {len(cat_embedding)}")

# Similarity between tokens (cosine similarity)
from numpy.linalg import norm
sim_cat_sat = np.dot(embeddings[1], embeddings[2]) / (norm(embeddings[1]) * norm(embeddings[2]))
print(f"\nSimilarity(cat, sat): {sim_cat_sat:.3f}")

Output:

Token 'cat' (ID=1):
Embedding vector: [-0.234 -0.234  1.579  0.767]
Vector dimension: 4

Similarity(cat, sat): -0.636

Key Takeaways

Nominal data uses frequency, mode, and chi-square — never means or standard deviations.

ML encoding strategy depends entirely on measurement level — wrong encoding breaks models.

LLMs treat text as nominal data (token IDs) and learn embeddings for each token.

Always check: can I take ratios? Is there a true zero? This determines valid statistics.

"When in doubt, use conservative lower-level methods — they are always more robust than you think."


What to Learn Next

-> Types of Data Qualitative vs quantitative — the foundation of all data classification.

-> Frequency Distributions Organize raw data into tables and charts.

-> Mean, Median, Mode Which measure of center is appropriate for each level?

-> Standard Deviation Spread that works for interval and ratio data.

-> Correlation Pearson r for interval/ratio, Spearman for ordinal.

-> Hypothesis Testing Choose the right test based on your measurement level.
