Odds Ratios — Understanding and Interpreting ORs


Interpreting Associations in Binary Outcomes

Odds ratios quantify the strength of association between exposures and binary outcomes. They are the primary effect measure in logistic regression and case-control studies, providing intuitive multiplicative comparisons.

  • Epidemiology — Measure risk factor associations in disease studies

  • Clinical Trials — Report treatment effects on binary endpoints

  • Social Sciences — Quantify how factors like education affect binary decisions

An odds ratio of 2 means the odds of the outcome in one group are double the odds in the comparison group. Doubling the odds is not the same as doubling the probability.
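To make the "odds double" statement concrete, here is a minimal sketch that converts a probability to odds, applies an odds ratio, and converts back. The helper names `prob_to_odds`, `odds_to_prob`, and `apply_odds_ratio` are illustrative and do not come from any library.

```python
def prob_to_odds(p):
    """Convert a probability (0 <= p < 1) to odds p / (1 - p)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability odds / (1 + odds)."""
    return odds / (1 + odds)


def apply_odds_ratio(p_baseline, odds_ratio):
    """Probability in the comparison group after multiplying the baseline odds."""
    return odds_to_prob(prob_to_odds(p_baseline) * odds_ratio)


# Same odds ratio (2), different baseline probabilities
for p in (0.01, 0.10, 0.50):
    p_new = apply_odds_ratio(p, 2.0)
    print(f"baseline p = {p:.2f} -> p with OR of 2 = {p_new:.3f}")
```

At a 1% baseline the probability roughly doubles, but at a 50% baseline it only rises to about 67%: the same odds ratio implies very different changes in probability.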


The odds ratio (OR) measures the association between a binary predictor and a binary outcome. It is the ratio of two odds.

Odds Ratio

OR = \frac{p_1/(1-p_1)}{p_2/(1-p_2)}

Here,

  • OR = Odds ratio
  • p_1 = Probability of outcome in group 1
  • p_2 = Probability of outcome in group 2
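
As a worked instance of the formula, take the illustrative values p_1 = 0.4 and p_2 = 0.1 (they match the proportions 80/200 and 30/300 in the smoking table used in this lesson):

```latex
\text{odds}_1 = \frac{0.4}{1 - 0.4} = \frac{2}{3} \approx 0.667, \qquad
\text{odds}_2 = \frac{0.1}{1 - 0.1} = \frac{1}{9} \approx 0.111, \qquad
OR = \frac{2/3}{1/9} = 6
```

Group 1 has six times the odds of group 2, even though its probability is only four times as large.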

import numpy as np
from scipy import stats
import statsmodels.api as sm

# 2×2 contingency table: smoking vs heart disease
# Rows are exposure groups, columns are outcome (disease, no disease)
data = np.array([[80, 120],   # Smokers: disease, no disease
                 [30, 270]])  # Non-smokers: disease, no disease

# Odds within each row = count with disease / count without disease
smoker_odds = data[0,0] / data[0,1]
nonsmoker_odds = data[1,0] / data[1,1]
OR = smoker_odds / nonsmoker_odds

print("Smoking and Heart Disease:")
print(f"  Smokers: {data[0,0]} disease, {data[0,1]} no disease -> odds = {smoker_odds:.3f}")
print(f"  Non-smokers: {data[1,0]} disease, {data[1,1]} no disease -> odds = {nonsmoker_odds:.3f}")
print(f"  Odds Ratio = {OR:.3f}")
print(f"  Smokers have {OR:.1f}× the odds of heart disease vs non-smokers")

# 95% CI for OR (log method)
# SE of log(OR) is the square root of the sum of reciprocal cell counts
log_OR = np.log(OR)
SE_log_OR = np.sqrt(np.sum(1 / data))
CI_lower = np.exp(log_OR - 1.96*SE_log_OR)
CI_upper = np.exp(log_OR + 1.96*SE_log_OR)
print(f"  95% CI: ({CI_lower:.3f}, {CI_upper:.3f})")

# Fisher's exact test (the p-value is tiny here, so print it in scientific notation)
_, p_fisher = stats.fisher_exact(data)
print(f"  Fisher's exact p-value: {p_fisher:.3g}")

# OR from logistic regression on simulated data
# Simulated risks: 0.10 for non-smokers, 0.30 for smokers
np.random.seed(42)
n = 500
smoking = np.random.binomial(1, 0.4, n)
heart_disease = np.random.binomial(1, 0.1 + 0.2*smoking)

X = sm.add_constant(smoking)  # columns: intercept, smoking
logit_model = sm.Logit(heart_disease, X).fit(disp=False)
# X is a NumPy array, so params and conf_int() are arrays indexed by position:
# index 0 is the intercept, index 1 is the smoking coefficient
or_logit = np.exp(logit_model.params[1])
ci_logit = np.exp(logit_model.conf_int()[1])
print(f"\nOR from logistic regression: {or_logit:.3f}")
print(f"95% CI: ({ci_logit[0]:.3f}, {ci_logit[1]:.3f})")

# OR vs Risk Ratio (risk = disease count / row total)
p1 = data[0,0] / data[0].sum()
p2 = data[1,0] / data[1].sum()
RR = p1 / p2
print(f"\nOR = {OR:.3f}, Risk Ratio (RR) = {RR:.3f}")
print("OR is farther from 1 than RR when the outcome is common (>10%)")
print("Use RR for cohort studies, OR for case-control studies")

OR vs RR

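When the outcome is common (more than about 10%), the OR lies farther from 1 than the RR, so reading an OR as if it were a risk ratio overstates the effect. In the smoking table, OR = 6.0 while RR = 4.0. Use RR for cohort studies and RCTs, where risks can be estimated directly; use OR for case-control studies, where they cannot.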
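
The gap between OR and RR can be illustrated by holding the risk ratio fixed and varying the baseline risk. The sketch below is only an illustration; `odds_ratio_for_fixed_rr` is a hypothetical helper name, and the baseline risks are arbitrary example values.

```python
def odds_ratio_for_fixed_rr(baseline_risk, risk_ratio):
    """Odds ratio implied by a given risk ratio at a given baseline risk."""
    if not 0 < baseline_risk < 1:
        raise ValueError("baseline_risk must lie strictly between 0 and 1")
    exposed_risk = baseline_risk * risk_ratio
    if not 0 < exposed_risk < 1:
        raise ValueError("baseline_risk * risk_ratio must lie strictly between 0 and 1")
    odds_exposed = exposed_risk / (1 - exposed_risk)
    odds_baseline = baseline_risk / (1 - baseline_risk)
    return odds_exposed / odds_baseline


# Risk ratio fixed at 2.0; only the baseline risk changes
for p0 in (0.01, 0.05, 0.10, 0.30):
    implied_or = odds_ratio_for_fixed_rr(p0, 2.0)
    print(f"baseline risk = {p0:.2f}, RR = 2.0 -> OR = {implied_or:.2f}")
```

With a 1% baseline risk the OR is about 2.02, nearly equal to the RR. At a 30% baseline the same RR of 2 corresponds to an OR of 3.5.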


Key Takeaways

Summary: Odds Ratios

  • OR = 1: no association; OR > 1: positive association; OR < 1: negative association

  • OR ≈ RR only when the outcome is rare (<10%)

  • A logistic regression coefficient is log(OR); exponentiate it to get the OR

  • A 95% CI that excludes 1 indicates a statistically significant association at the 5% level

  • Case-control studies use OR; cohort/RCT studies can use either RR or OR
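
Two of the takeaways above (the coefficient is log(OR), and a CI that excludes 1 signals significance) can be checked on the lesson's smoking table. This sketch uses a hand-rolled Newton-Raphson fit instead of a library so that it depends only on NumPy; `fit_logistic_newton` is an illustrative name, not a library function.

```python
import numpy as np

# Expand the 2x2 table (smokers 80/120, non-smokers 30/270) into one row per person
counts = [80, 120, 30, 270]
smoker = np.repeat([1, 1, 0, 0], counts)
disease = np.repeat([1, 0, 1, 0], counts)
X = np.column_stack([np.ones(smoker.size), smoker])  # intercept + smoking indicator


def fit_logistic_newton(X, y, n_iter=25):
    """Fit logistic regression by Newton-Raphson; returns (coefficients, covariance)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))                   # fitted probabilities
        gradient = X.T @ (y - p)                          # score vector
        information = X.T @ (X * (p * (1 - p))[:, None])  # observed information
        beta = beta + np.linalg.solve(information, gradient)
    # Covariance of the estimates = inverse information at the final coefficients
    p = 1 / (1 + np.exp(-X @ beta))
    covariance = np.linalg.inv(X.T @ (X * (p * (1 - p))[:, None]))
    return beta, covariance


beta, cov = fit_logistic_newton(X, disease)
log_or, se = beta[1], np.sqrt(cov[1, 1])
odds_ratio = np.exp(log_or)
ci = np.exp([log_or - 1.96 * se, log_or + 1.96 * se])

print(f"coefficient = {log_or:.4f}, exp(coefficient) = {odds_ratio:.3f}")
print(f"95% CI for OR: ({ci[0]:.3f}, {ci[1]:.3f})")
print("CI excludes 1:", not (ci[0] <= 1 <= ci[1]))
```

With a single binary predictor, the exponentiated coefficient equals the cross-product odds ratio from the table (6.0), and the model-based standard error matches the log-method formula based on reciprocal cell counts.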
