Visualizing Categories: Bar Charts vs Pie Charts

Choose the Right Chart Every Time

Bar charts and pie charts are the workhorses of categorical data visualization. Used correctly, they communicate insights instantly. Used incorrectly, they mislead. Understanding when to use each is fundamental to clear data storytelling.

Key things this concept helps with:

Comparing quantities — When you need to see which categories are larger or smaller
Showing composition — When you want to display how parts make up a whole
Avoiding misrepresentation — When you need to create honest, readable visualizations

The right chart choice can make the difference between clarity and confusion.

What is Categorical Data Visualization?

Definition

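Categorical data visualization displays values that are grouped into discrete categories, such as product types or companies, so they can be compared or read as parts of a whole. Bar charts and pie charts are its workhorses: used correctly, they communicate insights instantly; used incorrectly, they mislead.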

Bar Charts

Definition

A bar chart represents each category with a bar whose height (vertical chart) or length (horizontal chart) is proportional to the category's value, typically a frequency or a proportion.
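As a minimal sketch of where bar heights come from, the snippet below turns raw categorical observations into frequencies and proportions with pandas. The `responses` data is invented purely for illustration.

```python
import pandas as pd

# Hypothetical raw categorical observations (illustrative only)
responses = pd.Series(['Food', 'Clothing', 'Food', 'Sports', 'Food', 'Clothing'])

# Frequency: how many times each category occurs (one bar per category)
counts = responses.value_counts()

# Proportion: each category's share of all observations
proportions = responses.value_counts(normalize=True)

print(counts)
print(proportions.round(2))
```

Either series can be passed to a bar plot; frequencies answer "how many", proportions answer "what share".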

Best for:

Comparing quantities across categories
Showing change over discrete time periods
Ranking items

Key Properties:

Bars have equal width with gaps between them
Y-axis typically starts at zero
Categories can be sorted for easier comparison
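To see why the zero baseline matters, the sketch below compares how large one bar appears relative to another when the axis starts at zero and when it is truncated. The helper `apparent_ratio` and the values 52, 50, and 48 are hypothetical, chosen only to illustrate the effect.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights when the value axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)

# Hypothetical values: 52 vs 50, a real difference of 4%
true_ratio = apparent_ratio(52, 50)               # axis starts at 0
truncated_ratio = apparent_ratio(52, 50, 48)      # axis starts at 48

print(f"Zero baseline: bar A is drawn {true_ratio:.2f}x as tall as bar B")
print(f"Baseline at 48: bar A is drawn {truncated_ratio:.2f}x as tall as bar B")
```

With the truncated axis, a 4% difference is drawn as one bar being twice the height of the other, which is the distortion the zero baseline prevents.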

Revenue by Category

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Sales data by product category
sales = pd.DataFrame({
    'Category': ['Electronics', 'Clothing', 'Food', 'Furniture', 'Sports'],
    'Revenue_M': [4.2, 2.8, 3.1, 1.9, 2.3],
    'Units_K': [85, 320, 450, 42, 180]
})

fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# 1. Simple vertical bar chart
bars = axes[0].bar(sales['Category'], sales['Revenue_M'],
                    color=['steelblue', 'coral', 'mediumseagreen', 'orchid', 'orange'],
                    edgecolor='black', alpha=0.8)
axes[0].set_title('Revenue by Category')
axes[0].set_ylabel('Revenue ($M)')
axes[0].tick_params(axis='x', rotation=30)
# Add value labels on bars
for bar in bars:
    axes[0].text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.05,
                 f'${bar.get_height():.1f}M', ha='center', fontsize=9)

# 2. Horizontal bar chart (better for long labels)
sales_sorted = sales.sort_values('Revenue_M', ascending=True)
axes[1].barh(sales_sorted['Category'], sales_sorted['Revenue_M'],
             color='steelblue', edgecolor='black', alpha=0.8)
axes[1].set_title('Revenue by Category\n(Sorted — better for comparison)')
axes[1].set_xlabel('Revenue ($M)')

# 3. Grouped bar chart: multiple metrics
x = np.arange(len(sales))
w = 0.35
axes[2].bar(x - w/2, sales['Revenue_M'], w, label='Revenue ($M)', color='steelblue', alpha=0.8)
axes2b = axes[2].twinx()
axes2b.bar(x + w/2, sales['Units_K']/100, w, label='Units (100K)', color='coral', alpha=0.8)
axes[2].set_title('Grouped: Revenue vs Volume')
axes[2].set_xticks(x)
axes[2].set_xticklabels(sales['Category'], rotation=30)
axes[2].legend(loc='upper left')
axes2b.legend(loc='upper right')

plt.tight_layout()
plt.savefig('bar_charts.png', dpi=150)
plt.show()

Pie Charts

Definition

A pie chart shows the proportional composition of a whole. Each slice represents a category's proportion of the total.

Best for:

Part-to-whole relationships
A small number of categories (≤ 5)
When relative proportions are the main message

Worst for:

Comparing values (bars are far better)
Many categories (becomes unreadable)
When precision matters

Mathematical Properties:

Pie Chart Slice Angle

$$\text{Angle}_i = \frac{f_i}{\sum_j f_j} \times 360^\circ$$

Here,

$f_i$ = frequency or value of category $i$
$\sum_j f_j$ = total across all categories
$360^\circ$ = total degrees in a circle
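As a quick worked example of the slice-angle formula, the sketch below applies it to the market-share percentages used in this article's pie chart example. The variable names are illustrative.

```python
import pandas as pd

# Market-share percentages from this article's pie chart example
share = pd.Series([35, 28, 18, 12, 7],
                  index=['Alpha', 'Beta', 'Gamma', 'Delta', 'Others'])

# Angle_i = f_i / sum(f) * 360
angles = share / share.sum() * 360

print(angles.round(1))
```

Alpha's 35% share maps to a 126° slice, and the angles always sum to 360° regardless of the units of the input values.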

Market Share

market_share = pd.DataFrame({
    'Company': ['Alpha', 'Beta', 'Gamma', 'Delta', 'Others'],
    'Share': [35, 28, 18, 12, 7]
})

fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Standard pie chart
colors = ['#2196F3', '#F44336', '#4CAF50', '#FF9800', '#9E9E9E']
wedges, texts, autotexts = axes[0].pie(
    market_share['Share'],
    labels=market_share['Company'],
    autopct='%1.1f%%',
    colors=colors,
    startangle=90,
    pctdistance=0.85
)
axes[0].set_title('Market Share (Pie Chart)')

# Donut chart (modern alternative)
wedges2, _, _ = axes[1].pie(
    market_share['Share'],
    labels=None,
    autopct='%1.1f%%',
    colors=colors,
    startangle=90,
    pctdistance=0.85,
    wedgeprops=dict(width=0.5)  # creates donut hole
)
axes[1].set_title('Market Share (Donut Chart)')
# Pass the wedge handles explicitly so each legend entry maps to its slice
axes[1].legend(wedges2, market_share['Company'], title='Company',
               loc='center left', bbox_to_anchor=(0.85, 0, 0.5, 1))

plt.tight_layout()
plt.savefig('pie_charts.png', dpi=150)
plt.show()

Bar Chart vs Pie Chart — Decision Guide

How many categories?
+-- ≤ 5 categories AND you want to show part-of-whole
|   +-- Pie chart (or donut) <- acceptable
|
+-- Otherwise -> Bar chart (almost always better)
    +-- Comparing values -> Vertical or horizontal bar
    +-- Many categories -> Horizontal bar (labels fit)
    +-- Trend over continuous time -> Line chart (bars suit only a few discrete periods)
    +-- Multiple series -> Grouped or stacked bar
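The decision guide above can be sketched as a small function. `choose_chart`, its parameter names, and the `many` threshold of 10 are all hypothetical; the guide itself does not say how many categories count as "many".

```python
def choose_chart(n_categories, part_of_whole=False,
                 change_over_time=False, multiple_series=False, many=10):
    """Hypothetical helper mirroring the decision guide; not a definitive rule."""
    # A pie (or donut) is acceptable only for a few categories showing part-of-whole
    if n_categories <= 5 and part_of_whole:
        return "pie or donut chart"
    # Everything else defaults to the bar family, with a line chart for time trends
    if change_over_time:
        return "line chart"
    if multiple_series:
        return "grouped or stacked bar chart"
    if n_categories >= many:  # assumed cutoff for "many categories"
        return "horizontal bar chart"
    return "vertical or horizontal bar chart"

print(choose_chart(4, part_of_whole=True))
print(choose_chart(12))
print(choose_chart(6, multiple_series=True))
```

Encoding the guide this way makes the default explicit: a pie chart is returned only when both conditions hold, and every other path leads to a bar or line chart.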

Common Mistakes

Mistake	Why It's Wrong	Fix
3D bar/pie charts	Perspective distorts lengths and areas, making comparison unreliable	Use flat 2D
Not starting y-axis at 0 (bar chart)	Makes small differences look huge	Start at 0
Too many pie slices	Thin slices are impossible to compare	Use a bar chart or combine small slices into "Other"
No labels/legend	Reader can't interpret the chart	Always label categories and axes
Pie chart for more than 5 categories	Similar angles are hard to tell apart	Use a bar chart
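One of the fixes in the table, combining small slices into "Other", might look like the sketch below. The helper `collapse_small_slices` and the `shares` data are hypothetical and exist only for illustration.

```python
import pandas as pd

def collapse_small_slices(values, top_n=4, other_label="Other"):
    """Keep the top_n largest categories and sum the rest into a single slice."""
    ordered = values.sort_values(ascending=False)
    if len(ordered) <= top_n:
        return ordered
    top = ordered.iloc[:top_n].copy()
    top[other_label] = ordered.iloc[top_n:].sum()
    return top

# Hypothetical shares across eight categories (illustrative only)
shares = pd.Series({'A': 30, 'B': 25, 'C': 15, 'D': 10,
                    'E': 8, 'F': 6, 'G': 4, 'H': 2})

collapsed = collapse_small_slices(shares, top_n=4)
print(collapsed)
```

The total is preserved, so the collapsed series still represents the same whole with five slices instead of eight.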

Bar & Pie Charts in Machine Learning

In ML, bar charts are everywhere:

ML Application	Chart Type	What to Show
Class imbalance	Bar chart	Frequency of each class
Feature importance	Horizontal bar	Top features by importance
Model comparison	Grouped bar	Accuracy, F1, AUC across models
Confusion matrix	Heatmap/bar	TP, TN, FP, FN counts
Hyperparameter tuning	Bar chart	Performance across settings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Generate dataset
X, y = make_classification(n_samples=1000, n_features=10, n_informative=5,
                           random_state=42)
feature_names = [f'Feature_{i}' for i in range(10)]

# Train model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Bar chart: feature importance
importance = model.feature_importances_
sorted_idx = np.argsort(importance)

plt.figure(figsize=(8, 5))
plt.barh(np.array(feature_names)[sorted_idx], importance[sorted_idx], color='steelblue')
plt.xlabel('Importance')
plt.title('Feature Importance (Random Forest)')
plt.tight_layout()
plt.show()

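# Bar chart: class distribution (reveals any class imbalance)
from collections import Counter
class_counts = Counter(y)
print("Class distribution:", class_counts)

class_labels = sorted(class_counts)
plt.figure(figsize=(5, 4))
plt.bar([str(c) for c in class_labels],
        [class_counts[c] for c in class_labels], color='steelblue')
plt.xlabel('Class')
plt.ylabel('Number of samples')
plt.title('Class Distribution')
plt.tight_layout()
plt.show()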

Key Takeaways

Bar charts are almost always better than pie charts for comparisons

Pie charts only work for 2–5 categories where part-to-whole is the message

Sort bars from longest to shortest for easier comparison, unless the categories have a natural order (such as time periods or rating scales)

Never use 3D charts — they distort perception and add no information

When in doubt, choose a bar chart. It's the safer, clearer option for almost every categorical comparison.
