Arithmetic Mean — Formula, Properties, Computation, Limitations

The Arithmetic Mean

The Most Used — and Most Misused — Statistical Measure

The arithmetic mean is the most widely used measure in statistics. Understanding it deeply saves you from common analytical errors.

  • Algebraic properties — Sum of deviations equals zero; minimizes sum of squared deviations
  • Population vs sample — Sample mean $\bar{x}$ as an estimate of the population mean $\mu$
  • Sensitivity to outliers — One extreme value can pull the mean far from the center
  • Trimmed and winsorized alternatives — Robust versions for contaminated data

The mean is powerful, but it is not always right. Know when to use it and when to look elsewhere.


What is the Arithmetic Mean?


Definition and Formula

Definition: Arithmetic Mean

The arithmetic mean of a set of values is the sum of the values divided by the number of values. It is the value that minimizes the sum of squared deviations from itself.

For a sample of $n$ observations:

Sample Mean

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^n x_i$$

Here,

  • $\bar{x}$ = Sample mean
  • $n$ = Number of observations in the sample
  • $x_i$ = The i-th observation
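
As a quick illustration of the sample mean formula, the sketch below computes $\bar{x}$ by hand and compares it with `statistics.fmean` from the Python standard library. The data values and the helper name `sample_mean` are made up for this example.

```python
from statistics import fmean

def sample_mean(values):
    """Sum of the values divided by the number of values."""
    if len(values) == 0:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)

# Hypothetical sample, chosen only for illustration
sample = [4.0, 8.0, 15.0, 16.0, 23.0, 42.0]

x_bar = sample_mean(sample)
print(x_bar)          # 108 / 6 = 18.0
print(fmean(sample))  # the standard library gives the same value
```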

For a population of $N$ observations:

Population Mean

$$\mu = \frac{1}{N}\sum_{i=1}^N x_i$$

Here,

  • $\mu$ = Population mean
  • $N$ = Population size

For a continuous random variable $X$ with probability density function $f(x)$:

Expected Value (Population Mean)

$$\mu = E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx$$

Here,

  • $E[X]$ = Expected value of X
  • $f(x)$ = Probability density function
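
To make the integral concrete, here is a minimal numerical sketch. It assumes an exponential density with rate 2, chosen only because its mean is known to be 1/rate, and approximates the integral with a hand-written trapezoid rule. The function names and the finite upper limit are illustrative choices, not part of the lesson.

```python
import math

def expected_value(pdf, lower, upper, steps=100_000):
    """Approximate the integral of x * pdf(x) over [lower, upper] (trapezoid rule)."""
    h = (upper - lower) / steps
    total = 0.5 * (lower * pdf(lower) + upper * pdf(upper))
    for k in range(1, steps):
        x = lower + k * h
        total += x * pdf(x)
    return total * h

rate = 2.0  # assumed rate parameter for the example

def exponential_pdf(x):
    """Density of an exponential distribution with the rate above."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

# The density is negligible beyond x = 40, so a finite upper limit suffices here.
mu = expected_value(exponential_pdf, 0.0, 40.0)
print(mu)  # close to 1 / rate = 0.5
```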

Algebraic Properties of the Mean

Theorem: Fundamental Properties

  1. Sum of deviations = 0: $\sum_{i=1}^n (x_i - \bar{x}) = 0$
    • The mean is the "balance point" of the distribution.
  2. Linear transformation: $\overline{aX + b} = a\bar{X} + b$
    • The mean commutes with affine transformations.
  3. Minimizes sum of squared deviations: $\bar{x} = \arg\min_c \sum_{i=1}^n (x_i - c)^2$
    • The mean is the least squares estimator of location.
  4. Additivity: $E[X + Y] = E[X] + E[Y]$
    • This holds for any $X$ and $Y$ with finite means, whether or not they are independent.
  5. Homogeneity of degree 1: $E[aX] = aE[X]$
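
The properties above can be checked numerically. The sketch below uses simulated data (the distribution, seed, and constants are arbitrary choices for illustration) and deliberately makes `y` depend on `x` to show that additivity does not need independence.

```python
import math
import random
from statistics import fmean

random.seed(0)  # fixed seed so the run is repeatable
x = [random.gauss(50, 10) for _ in range(1000)]
x_bar = fmean(x)

# Property 1: deviations from the mean sum to zero (up to rounding error)
sum_of_deviations = sum(xi - x_bar for xi in x)

# Property 2: mean(a*x + b) equals a*mean(x) + b
a, b = 3.0, -7.0
transformed_mean = fmean([a * xi + b for xi in x])

# Property 4: additivity holds even when y depends on x (here y = x^2)
y = [xi ** 2 for xi in x]
mean_of_sum = fmean([xi + yi for xi, yi in zip(x, y)])

print(f"sum of deviations: {sum_of_deviations:.2e}")
print(math.isclose(transformed_mean, a * x_bar + b))
print(math.isclose(mean_of_sum, x_bar + fmean(y)))
```

The sum of deviations prints as a tiny nonzero number rather than exactly 0 because of floating-point rounding.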

Proof that the Mean Minimizes Sum of Squared Deviations

Theorem: Mean as Least Squares Estimator

Define $f(c) = \sum_{i=1}^n (x_i - c)^2$. Taking the derivative and setting it to zero:

$$f'(c) = -2\sum_{i=1}^n (x_i - c) = 0 \implies \sum_{i=1}^n x_i = nc \implies c = \bar{x}$$

The second derivative $f''(c) = 2n > 0$ confirms this is a minimum. $\square$
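
The same conclusion can be reached without calculus. The identity below is a standard decomposition, offered as a supplementary check rather than as part of the proof above. It expands the squared deviations around $\bar{x}$:

```latex
\begin{aligned}
\sum_{i=1}^n (x_i - c)^2
  &= \sum_{i=1}^n \bigl[(x_i - \bar{x}) + (\bar{x} - c)\bigr]^2 \\
  &= \sum_{i=1}^n (x_i - \bar{x})^2
     + 2(\bar{x} - c)\sum_{i=1}^n (x_i - \bar{x})
     + n(\bar{x} - c)^2 \\
  &= \sum_{i=1}^n (x_i - \bar{x})^2 + n(\bar{x} - c)^2
\end{aligned}
```

The cross term vanishes because the deviations from the mean sum to zero. The remaining term $n(\bar{x} - c)^2$ is nonnegative and equals zero only at $c = \bar{x}$, so that is the unique minimizer.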


Weighted Mean

When observations have different importance, frequency, or precision:

Weighted Mean

$$\bar{x}_w = \frac{\sum_{i=1}^n w_i x_i}{\sum_{i=1}^n w_i}$$

Here,

  • $w_i$ = Weight for the i-th observation
  • $x_i$ = The i-th observation
  • $\bar{x}_w$ = Weighted mean

The ordinary mean is the special case of equal weights, for example $w_i = 1/n$ for all $i$. In the inverse-variance weighting scheme, $w_i = 1/\sigma_i^2$. When the observations are independent, unbiased measurements of the same quantity with known variances $\sigma_i^2$, this choice gives the minimum-variance estimator among all unbiased weighted means.
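
A small sketch of the weighted mean with inverse-variance weights follows. The measurements and their standard deviations are hypothetical, and `weighted_mean` is a helper written for this example. The last line uses the standard result that, for independent measurements, the variance of the inverse-variance weighted mean is $1 / \sum_i (1/\sigma_i^2)$.

```python
def weighted_mean(values, weights):
    """Sum of w_i * x_i divided by the sum of w_i."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical measurements of one quantity, each with a known standard deviation
measurements = [10.2, 9.8, 10.5]
sigmas = [0.1, 0.2, 0.5]

inv_var_weights = [1.0 / s ** 2 for s in sigmas]
combined = weighted_mean(measurements, inv_var_weights)
equal = weighted_mean(measurements, [1.0] * len(measurements))

# Standard deviation of the combined estimate: sqrt(1 / sum of weights)
combined_sd = (1.0 / sum(inv_var_weights)) ** 0.5

print(f"inverse-variance weighted mean: {combined:.4f}")
print(f"equal-weight mean:              {equal:.4f}")
print(f"sd of combined estimate:        {combined_sd:.4f}")
```

The combined estimate sits closest to the most precise measurement, and its standard deviation is smaller than that of any single measurement.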


Mean for Grouped Data

When raw data is unavailable (only a frequency table):

Mean for Grouped Data

$$\bar{x} = \frac{\sum_{i=1}^k f_i m_i}{\sum_{i=1}^k f_i}$$

Here,

  • $f_i$ = Frequency of the i-th class
  • $m_i$ = Midpoint of the i-th class interval
  • $k$ = Number of classes

This is an approximation: every observation in a class is treated as if it sat at the class midpoint, so the exact mean cannot be recovered from grouped data alone.
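
The sketch below applies the grouped-data formula to a hypothetical frequency table and also computes the lowest and highest means that raw data consistent with the table could have. The table and function names are invented for illustration.

```python
# Hypothetical frequency table: (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 30, 8), (30, 40, 5)]

def grouped_mean(table):
    """Approximate mean using class midpoints weighted by frequency."""
    total_f = sum(f for _, _, f in table)
    return sum(f * (lo + hi) / 2 for lo, hi, f in table) / total_f

def mean_bounds(table):
    """Range of means possible if every value sat at its class's lower or upper bound."""
    total_f = sum(f for _, _, f in table)
    lowest = sum(f * lo for lo, _, f in table) / total_f
    highest = sum(f * hi for _, hi, f in table) / total_f
    return lowest, highest

approx = grouped_mean(classes)
lowest, highest = mean_bounds(classes)
print(f"grouped mean: {approx:.2f}")
print(f"true mean lies between {lowest:.2f} and {highest:.2f}")
```

The width of that interval shows how much information the grouping discards. Narrower classes give a tighter approximation.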


Trimmed Mean: A Robust Alternative

Definition: Trimmed Mean

The $\alpha$-trimmed mean removes the smallest $\alpha$ fraction and the largest $\alpha$ fraction of observations before computing the mean:

$$\bar{x}_{\text{trim}} = \frac{1}{n - 2\lfloor \alpha n \rfloor} \sum_{i=\lfloor \alpha n \rfloor + 1}^{n - \lfloor \alpha n \rfloor} x_{(i)}$$

where $x_{(i)}$ are the order statistics. Common choices: $\alpha = 0.05$ or $\alpha = 0.10$.

The trimmed mean trades a small amount of efficiency (under normality) for greatly improved robustness against outliers.
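
A direct translation of the trimmed-mean formula might look like the sketch below. The data set is hypothetical and `trimmed_mean` is a helper written for this example, not a library function.

```python
import math

def trimmed_mean(values, alpha):
    """Drop the floor(alpha * n) smallest and largest values, then average the rest."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(values)  # order statistics x_(1) <= ... <= x_(n)
    n = len(ordered)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    k = math.floor(alpha * n)
    kept = ordered[k:n - k]
    return sum(kept) / len(kept)

# Hypothetical data with one gross outlier
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 500]

print(trimmed_mean(data, 0.0))   # plain mean: 56.2
print(trimmed_mean(data, 0.10))  # drops 2 and 500: 7.5
```

With $\alpha = 0$ nothing is removed and the function returns the ordinary mean, which makes the effect of trimming easy to compare.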


Influence Function and Breakdown Point

Theorem: Influence Function of the Mean

The influence function of the arithmetic mean is:

$$\text{IF}(x; \bar{x}, F) = x - \mu$$

This function is unbounded in $x$: a single extreme observation can shift the mean arbitrarily far, which is why the mean is not robust.

The finite-sample breakdown point of the mean is $1/n$, the smallest possible value: corrupting just one of the $n$ observations, by sending it toward infinity, is enough to make the mean arbitrarily large. As $n$ grows, this fraction tends to 0.
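
The unbounded influence can be seen in a finite sample. In the sketch below (hypothetical data), adding one point $x$ to a sample of size $n$ shifts the mean by exactly $(x - \bar{x})/(n + 1)$, which is the influence $x - \bar{x}$ scaled by the new point's share of the sample. The median is printed alongside for contrast.

```python
from statistics import fmean, median

base = [10.0, 20.0, 30.0, 40.0, 50.0]  # hypothetical clean sample
n = len(base)
base_mean = fmean(base)  # 30.0

results = []
for x in (60.0, 1_000.0, 1_000_000.0):
    contaminated = base + [x]
    shift = fmean(contaminated) - base_mean
    predicted = (x - base_mean) / (n + 1)  # influence scaled by 1 / (n + 1)
    results.append((x, shift, predicted, median(contaminated)))
    print(f"x={x:>11.1f}  mean shift={shift:>12.2f}  "
          f"predicted={predicted:>12.2f}  median={median(contaminated)}")
```

The mean shift grows without limit as $x$ grows, while the median of the contaminated sample stays the same for all three values of $x$.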


Limitations of the Mean

| Problem | Example | Solution |
|---|---|---|
| Sensitive to outliers | CEO salary distorts avg company salary | Use median |
| Meaningless for nominal data | Mean blood type is nonsense | Use mode |
| Inappropriate for skewed data | Mean income misleads | Use median |
| May not be a possible value | Mean family size = 2.3 children | Use appropriate measure |
| Hides multimodality | Mean of bimodal = between the modes | Visualize first |
| Unbounded influence | One extreme value shifts mean arbitrarily | Use trimmed mean or winsorized mean |
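
The winsorized mean is named above but not defined in this lesson. The sketch below assumes the common definition: instead of dropping the $\lfloor \alpha n \rfloor$ most extreme values on each side, it replaces them with the nearest retained value. The salary figures (in thousands) are hypothetical and echo the CEO example in the table.

```python
import math
from statistics import fmean, median

def winsorized_mean(values, alpha):
    """Clip the floor(alpha * n) extreme values on each side to the nearest kept value."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the mean of an empty sample is undefined")
    k = math.floor(alpha * n)
    low, high = ordered[k], ordered[n - k - 1]
    clipped = [min(max(v, low), high) for v in ordered]
    return sum(clipped) / n

# Hypothetical salaries in thousands; the last entry plays the role of the CEO
salaries = [42, 45, 48, 50, 52, 55, 58, 60, 65, 900]

print(f"mean:            {fmean(salaries):.1f}")
print(f"median:          {median(salaries):.1f}")
print(f"winsorized mean: {winsorized_mean(salaries, 0.10):.1f}")
```

Unlike trimming, winsorizing keeps the sample size at $n$, which can matter when the result feeds into later calculations.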

The Mean in Machine Learning

[Figure: The arithmetic mean is the foundation of loss functions, normalization, and pooling in ML/DL. Panels: MSE Loss (minimizes to mean), Batch Norm (uses batch mean), Embedding Avg (mean pooling in NLP), StandardScaler ((x - mean) / std).]

The arithmetic mean underlies several core ML techniques:

| ML Concept | How Mean is Used | Formula |
|---|---|---|
| Mean Squared Error | Minimizing → predicts the mean | MSE = (1/n)Σ(yᵢ - ŷᵢ)² |
| Batch Normalization | Stabilizes training by centering activations | x̂ = (x - μ_batch) / σ_batch |
| Mean Pooling (NLP) | Averages token embeddings | h = (1/T)Σhₜ |
| StandardScaler | Centers features to mean=0 | x_scaled = (x - μ) / σ |
| Weighted Loss | Inverse-variance weighting | wᵢ = 1/σᵢ² |
```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.preprocessing import StandardScaler

# The mean is the minimizer of the sum of squared errors (SSE)
y_true = np.array([10, 20, 30, 40, 50, 100])  # skewed by an outlier

# Grid search: which constant c minimizes sum((y - c)^2)?
candidates = np.linspace(0, 100, 10001)  # grid spacing of 0.01
sse = [np.sum((y_true - c) ** 2) for c in candidates]
best = candidates[np.argmin(sse)]
print(f"Value minimizing SSE: {best:.2f}")
print(f"Mean of data: {np.mean(y_true):.2f}")
print("They agree up to the grid spacing: the mean is the least squares estimator.\n")

# StandardScaler centers each feature on its mean and scales by its std
X, y = make_regression(n_samples=100, n_features=3, noise=10, random_state=42)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
print(f"Original mean: {X.mean(axis=0).round(3)}")
print(f"Scaled mean:   {X_scaled.mean(axis=0).round(10)}")
print(f"Original std:  {X.std(axis=0).round(3)}")
print(f"Scaled std:    {X_scaled.std(axis=0).round(3)}")
```
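
Two other rows of the table, mean pooling and batch normalization, can be sketched with the standard library alone. The embeddings and activations below are made-up numbers. The normalization follows the simplified formula in the table, whereas real batch normalization layers also add a small constant to the variance and apply a learned scale and shift.

```python
from statistics import fmean, pstdev

# Hypothetical token embeddings: T = 3 tokens, 4 dimensions each
token_embeddings = [
    [0.2, 0.4, -0.1, 1.0],
    [0.0, 0.6, 0.3, 0.0],
    [0.4, 0.2, 0.1, 0.5],
]

# Mean pooling: h = (1/T) * sum of h_t, computed per dimension
pooled = [fmean(dim) for dim in zip(*token_embeddings)]
print([round(v, 3) for v in pooled])

# Simplified batch normalization of one activation across a batch of 4 examples
batch = [2.0, 4.0, 6.0, 8.0]
mu_batch = fmean(batch)
sigma_batch = pstdev(batch)  # population std, i.e. dividing by the batch size
normalized = [(v - mu_batch) / sigma_batch for v in batch]
print([round(v, 3) for v in normalized])
```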

Key Takeaways

Summary: Arithmetic Mean

  1. The mean is the balance point — sum of deviations always equals 0
  2. The mean is the least squares estimator of location — minimizes $\sum (x_i - c)^2$
  3. Linear transformations flow directly through the mean: $\overline{aX+b} = a\bar{X} + b$
  4. Weighted mean accounts for unequal importance — $w_i = 1/\sigma_i^2$ gives minimum variance
  5. Trimmed mean provides robustness without abandoning the mean entirely
  6. The mean has unbounded influence — one extreme observation can shift it arbitrarily
  7. The mean is not always the right measure — always check your data's shape before choosing
