Scatter Plots
Data Visualization
Discover Relationships Hidden Inside Data
Scatter plots are one of the most useful tools for understanding how two numerical variables relate to each other. By plotting individual data points on a two-dimensional plane, you can quickly see patterns that summary statistics alone would miss.
Each point represents one observation, and together they reveal the story your data is telling. Here is what scatter plots help you discover:
- Correlation: Whether variables move together (positive), in opposite directions (negative), or show no relationship at all.
- Trends: The general direction data follows, such as increasing, decreasing, or staying flat as one variable changes.
- Outliers: Unusual points that sit far from the main cloud, often indicating data errors, rare events, or hidden subgroups.
- Clusters: Natural groupings in your data that may reveal distinct categories or segments you did not know existed.
- Nonlinear Patterns: Curves, U-shapes, or fan-shaped spreads that a single correlation number cannot capture.
Before calculating any statistical metric, always visualize the data. A scatter plot takes seconds to create but can save you from drawing the wrong conclusion.
What is a Scatter Plot?
Definition
A scatter plot visualizes the relationship between two continuous variables by plotting each observation as a point in two-dimensional space.
The position of each point reveals whether variables move together, move in opposite directions, or show no relationship at all.
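To make the definition concrete, here is a minimal, dependency-free sketch that places each observation on a character grid. The function name `text_scatter`, the grid size, and the `hours`/`scores` values are illustrative assumptions, not data from this article. In practice a plotting library (for example matplotlib's `pyplot.scatter`) would normally do this job.

```python
def text_scatter(xs, ys, width=40, height=12):
    """Render (x, y) observations as a character grid, one '*' per occupied cell."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid division by zero when every value on an axis is identical.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_min) / x_span * (width - 1))
        row = round((y - y_min) / y_span * (height - 1))
        # The first grid row is printed first, so flip the y-axis
        # to put the largest y values at the top.
        grid[height - 1 - row][col] = "*"
    return "\n".join("".join(line).rstrip() for line in grid)


# Made-up observations for illustration only (not real data).
hours = [1, 2, 3, 4, 5, 6, 7, 8]
scores = [52, 55, 61, 60, 68, 74, 73, 81]

plot = text_scatter(hours, scores)
print(plot)
```

The points climb from the lower left to the upper right, which is what a positive relationship looks like on the plane.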
How Scatter Plots Generate Insights
The Four Things to Look For
Direction
Positive, Negative, or No Relationship
Strength
Weak, Moderate, or Strong Association
Form
Linear, Curved, or Clustered Pattern
Outliers
Points Far From the Main Cloud
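Direction and strength are the two items on this list that a single number can summarize. The sketch below computes Pearson's r from its standard formula and turns it into a rough label. The 0.3 and 0.7 cutoffs are an assumed convention chosen for illustration; different fields use different thresholds. Form and outliers still need to be judged from the plot itself.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2 points)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)


def describe(r, weak_below=0.3, strong_from=0.7):
    """Label direction and strength. The cutoffs are assumptions, not a standard."""
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    size = abs(r)
    if size < weak_below:
        strength = "weak"
    elif size < strong_from:
        strength = "moderate"
    else:
        strength = "strong"
    return direction, strength


# Made-up, nearly linear observations for illustration only.
xs_example = [1, 2, 3, 4, 5]
ys_example = [2.1, 3.9, 6.2, 7.8, 10.1]

r_example = pearson_r(xs_example, ys_example)
description = describe(r_example)
print(description)
```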
Common Scatter Plot Patterns
- Strong Positive: r ≈ +0.95
- Strong Negative: r ≈ -0.95
- No Relationship: r ≈ 0
- Nonlinear: Pearson r can be misleading.
- Outlier Effect: One point can change everything.
- Clustered Data: May indicate hidden groups.
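Two of these patterns, the nonlinear case and the outlier effect, can be checked numerically. The sketch below uses small made-up datasets (assumptions chosen for illustration): a perfect U-shape whose Pearson r is zero, and a patternless set of points whose r jumps once a single extreme point is added.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Nonlinear: y is fully determined by x (y = x squared), yet the
# symmetric U-shape has no linear trend, so r comes out as zero.
u_x = [-3, -2, -1, 0, 1, 2, 3]
u_y = [x ** 2 for x in u_x]
r_u_shape = pearson_r(u_x, u_y)

# Outlier effect: five points with no linear relationship...
base_x = [1, 2, 3, 4, 5]
base_y = [2, 3, 1, 3, 2]
r_without_outlier = pearson_r(base_x, base_y)

# ...and the same points plus one extreme observation at (20, 20).
r_with_outlier = pearson_r(base_x + [20], base_y + [20])

print(f"U-shape:         r = {r_u_shape:.3f}")
print(f"without outlier: r = {r_without_outlier:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")  # above 0.9
```

In both cases the number alone tells the wrong story, and a scatter plot of the same points would show the problem at a glance.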
Important Warning
Correlation Does Not Imply Causation
A scatter plot may reveal association, but it cannot prove that one variable causes another.
Always investigate:
- Confounding variables
- Reverse causation
- Random coincidence
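The confounding case can be illustrated with a small simulation. In this sketch a hidden variable `z` drives both `x` and `y`, which have no direct effect on each other. All names, the noise level, the sample size, and the seed are assumptions for illustration. The first-order partial correlation formula is then used to ask how much association remains once `z` is accounted for.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000

# Simulated data: z is the confounder; x and y each equal z plus
# independent noise, so neither one causes the other.
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + 0.3 * rng.gauss(0, 1) for zi in z]
y = [zi + 0.3 * rng.gauss(0, 1) for zi in z]

r_xy = pearson_r(x, y)
r_xy_given_z = partial_r(r_xy, pearson_r(x, z), pearson_r(y, z))

print(f"r(x, y)     = {r_xy:.3f}")          # strong association
print(f"r(x, y | z) = {r_xy_given_z:.3f}")  # close to zero
```

A scatter plot of `x` against `y` here would show a tight upward cloud even though the association comes entirely from `z`. This is only one way to probe confounding, and it works only when the confounder has actually been measured.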
Real-World Applications
| Industry | Example |
|---|---|
| Finance | Risk vs Return |
| Healthcare | Age vs Blood Pressure |
| Education | Study Hours vs Exam Scores |
| Marketing | Ad Spend vs Sales |
| Manufacturing | Temperature vs Defect Rate |
Key Takeaways
- Visualize before calculating correlation
- Examine direction and strength
- Identify outliers early
- Pearson r only measures linear relationships
The golden rule of statistics:
Always visualize before you quantify.