Python Debugging — Finding & Fixing Bugs

Debugging is the art of finding and fixing bugs. Python provides excellent tools for systematic debugging, from interactive debuggers to logging frameworks. Good debugging skills are essential for every developer.

Learning Objectives

Use pdb and breakpoint() for interactive debugging
Apply logging for production debugging
Use assertions for defensive programming
Follow systematic debugging workflows
Identify and fix common bug patterns
Debug memory leaks and race conditions

breakpoint() — Built-in Debugger

Python 3.7+ includes the built-in breakpoint() function, which by default drops you into pdb at the line where it is called:

def calculate_discount(price, discount):
    breakpoint()  # Execution stops here
    discounted = price * (1 - discount)
    return discounted

result = calculate_discount(100, 0.2)
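breakpoint() does not call pdb directly: it calls sys.breakpointhook(), whose default implementation starts pdb and honours the PYTHONBREAKPOINT environment variable (setting PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op). The sketch below swaps in a stand-in hook to make that indirection visible; recording_hook and calls are illustrative names, not part of the original example.

```python
import sys

calls = []

def recording_hook(*args, **kwargs):
    """Stand-in hook: record the call instead of opening pdb."""
    calls.append((args, kwargs))

def calculate_discount(price, discount):
    breakpoint()  # Dispatches to whatever sys.breakpointhook currently is
    return price * (1 - discount)

original_hook = sys.breakpointhook
sys.breakpointhook = recording_hook
try:
    result = calculate_discount(100, 0.2)
finally:
    # Always restore the previous hook so later breakpoint() calls behave normally
    sys.breakpointhook = original_hook

print(f"hook called {len(calls)} time(s), result={result}")
```

In everyday use you would leave the hook alone and rely on PYTHONBREAKPOINT to disable or redirect breakpoints without editing code.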

pdb Commands Reference

# When breakpoint() hits, you get a (Pdb) prompt:

n (next)        — execute next line (step over)
s (step)        — step into function call
c (continue)    — run until next breakpoint
r (return)      — run until current function returns
p var           — print variable value
pp var          — pretty-print variable
l (list)        — show source code around current line
w (where)       — print stack trace
u (up)          — move up in call stack
d (down)        — move down in call stack
q (quit)        — exit debugger
h (help)        — show help
!statement      — execute a Python statement
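These commands are normally typed at the prompt. As a rough sketch, a session can also be scripted by giving pdb.Pdb a text stream of commands, which shows what p, n and c do without any typing. The names scripted_input, captured and transcript are illustrative, and the function mirrors calculate_discount from the previous section.

```python
import io
import pdb

def calculate_discount(price, discount):
    discounted = price * (1 - discount)
    return discounted

# The commands a person would type at the (Pdb) prompt, one per line
scripted_input = io.StringIO("p price * 2\nn\np discounted\nc\n")
captured = io.StringIO()

# readrc=False skips any .pdbrc file; nosigint=True leaves signal handlers alone
debugger = pdb.Pdb(stdin=scripted_input, stdout=captured,
                   readrc=False, nosigint=True)

# runcall() stops on the first line of the function and reads commands from stdin
result = debugger.runcall(calculate_discount, 100, 0.2)

transcript = captured.getvalue()
print(transcript)
print("returned:", result)
```

The transcript contains the prompts, the current location after each step, and the values printed by the two p commands.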

Practical Debugging Example

def process_order(order):
    breakpoint()  # Start debugging here

    total = 0
    for item in order['items']:
        price = item['price']
        quantity = item['quantity']
        discount = item.get('discount', 0)

        # Debug: check each calculation
        subtotal = price * quantity * (1 - discount)
        print(f"Item: {item['name']}, Subtotal: {subtotal}")

        total += subtotal

    # Apply tax
    tax = total * order.get('tax_rate', 0.08)
    total += tax

    return total

# At (Pdb) prompt:
# p order          — inspect the order dict
# p item           — inspect current item
# p price          — check price value
# n                — step to next line
# c                — continue execution

Conditional Breakpoints

def find_problematic_user(users):
    for user in users:
        if user['age'] < 0:
            breakpoint()  # Only stops when condition is true
        process_user(user)

# Or set in pdb:
# b 15    — set breakpoint at line 15
# b 15, user['age'] < 0  — conditional breakpoint
# cl 1    — clear breakpoint number 1 (run b with no arguments to list the numbers)

Print Debugging

Simple but effective for quick investigation:

def calculate_average(numbers):
    print(f"[DEBUG] Input numbers: {numbers}")

    total = 0
    for i, num in enumerate(numbers):
        total += num
        print(f"[DEBUG] Step {i}: num={num}, total={total}")

    average = total / len(numbers)
    print(f"[DEBUG] Final: total={total}, count={len(numbers)}, average={average}")
    return average

# The f-string "=" specifier (Python 3.8+) prints the expression text and its repr()
def debug_variable(var):
    print(f"[DEBUG] {var = }")  # e.g. [DEBUG] var = 'text'
    print(f"[DEBUG] Type: {type(var).__name__}, Value: {var!r}")
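A short illustration of why the repr form matters when printing values: a string that holds digits looks identical to a number when printed plainly. The variable names here are only for illustration.

```python
user_input = "42"   # a string that merely looks like a number

plain = f"{user_input}"        # loses the type information
labelled = f"{user_input = }"  # shows the expression and its repr()

print(plain)
print(labelled)
```

The quotes in the second line make it obvious that the value is a str, which is often the whole bug.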

Debug Helper Function

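import inspect

def debug(label, value):
    """Reusable debug helper that also reports where it was called from."""
    caller = inspect.currentframe().f_back
    location = f"{caller.f_code.co_name}:{caller.f_lineno}"
    print(f"[{label}] ({location}) {type(value).__name__}: {value!r}")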

def process_data(data):
    filtered = [x for x in data if x > 0]
    debug("filtered", filtered)

    total = sum(filtered)
    debug("total", total)

    average = total / len(filtered) if filtered else 0
    debug("average", average)

    return average

Logging for Debugging

Production-ready debugging with the logging module:

import logging

# Configure logging
logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler('debug.log'),
        logging.StreamHandler()
    ]
)

logger = logging.getLogger(__name__)

def process_order(order_id):
    logger.debug(f"Processing order {order_id}")

    order = get_order(order_id)
    logger.debug(f"Order details: {order}")

    if not order:
        logger.warning(f"Order {order_id} not found")
        return None

    total = calculate_total(order)
    logger.info(f"Order {order_id} total: ${total:.2f}")

    if total > 1000:
        logger.info(f"Large order {order_id}: ${total:.2f}")

    return total
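The calls above build their messages with f-strings, which are evaluated even when the level is disabled. The logging module also accepts %-style arguments and only formats them if the record is actually emitted. A minimal sketch of that difference follows; CountingOrder, ListHandler and the logger name lazy_demo are hypothetical helpers made up for the demonstration.

```python
import logging

class CountingOrder:
    """Hypothetical order object that counts how often it is rendered."""
    def __init__(self):
        self.render_count = 0

    def __str__(self):
        self.render_count += 1
        return "Order(id=42)"

class ListHandler(logging.Handler):
    """Collects formatted messages in a list instead of writing them out."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

demo_logger = logging.getLogger("lazy_demo")
demo_logger.setLevel(logging.INFO)   # DEBUG records are discarded
demo_logger.propagate = False        # keep the demo isolated from root handlers
handler = ListHandler()
demo_logger.addHandler(handler)

order = CountingOrder()
demo_logger.debug("Order details: %s", order)  # filtered out: __str__ never runs
demo_logger.info("Order details: %s", order)   # emitted: __str__ runs once

print(order.render_count, handler.messages)
```

For cheap values the difference is negligible, but it matters when rendering an object is expensive and DEBUG is off in production.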

Logging Levels

import logging

logging.basicConfig(level=logging.DEBUG)

logger = logging.getLogger(__name__)

# Different levels for different purposes
logger.debug("Detailed information for debugging")
logger.info("General information about program execution")
logger.warning("Something unexpected happened")
logger.error("Something failed, but program continues")
logger.critical("Program cannot continue")

# Exception logging
try:
    result = 1 / 0
except ZeroDivisionError:
    logger.exception("Error occurred")  # Includes traceback
    # Or: logger.error("Error occurred", exc_info=True)
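To see what logger.exception() actually writes, the sketch below sends the output to an in-memory stream rather than the console. The logger name exception_demo and the safe_ratio function are illustrative assumptions.

```python
import io
import logging

stream = io.StringIO()
exc_logger = logging.getLogger("exception_demo")
exc_logger.setLevel(logging.DEBUG)
exc_logger.propagate = False
exc_logger.addHandler(logging.StreamHandler(stream))

def safe_ratio(a, b):
    """Return a / b, or None (with a logged traceback) when b is zero."""
    try:
        return a / b
    except ZeroDivisionError:
        # Must be called from an except block so there is an active exception
        exc_logger.exception("Could not divide %s by %s", a, b)
        return None

result = safe_ratio(1, 0)
log_text = stream.getvalue()
print(log_text)
```

The captured text holds the message followed by the full traceback, which is what makes this call more useful than logger.error() alone.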

Structured Logging

import logging
import json
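from datetime import datetime, timezone

class JSONFormatter(logging.Formatter):
    """Format logs as JSON for easy parsing."""

    def format(self, record):
        log_data = {
            # Use the record's own creation time; datetime.utcnow() is deprecated since Python 3.12
            'timestamp': datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),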
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module,
            'function': record.funcName,
            'line': record.lineno
        }

        if record.exc_info:
            log_data['exception'] = self.formatException(record.exc_info)

        return json.dumps(log_data)

def setup_json_logging():
    handler = logging.StreamHandler()
    handler.setFormatter(JSONFormatter())
    logging.root.addHandler(handler)
    logging.root.setLevel(logging.DEBUG)

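# Usage
setup_json_logging()
logger = logging.getLogger(__name__)
# Values passed via extra become attributes on the record (record.user_id);
# JSONFormatter only emits them if format() copies them into log_data.
logger.info("User logged in", extra={'user_id': 123})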
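Fields passed through extra are stored as attributes on the LogRecord, so a formatter has to copy them into its output explicitly. One possible way to do that generically, sketched here under the assumption that any attribute not present on a freshly built LogRecord came from extra; ExtraAwareJSONFormatter and the logger name json_demo are hypothetical.

```python
import io
import json
import logging

# Attribute names every LogRecord has, plus two that Formatter may add later
_STANDARD_ATTRS = set(
    logging.LogRecord("", 0, "", 0, "", (), None).__dict__
) | {"message", "asctime"}

class ExtraAwareJSONFormatter(logging.Formatter):
    """JSON formatter that also emits fields supplied via extra={...}."""

    def format(self, record):
        log_data = {"level": record.levelname, "message": record.getMessage()}
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                log_data[key] = value
        # default=str keeps non-JSON-serialisable extras from raising
        return json.dumps(log_data, default=str)

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(ExtraAwareJSONFormatter())

json_logger = logging.getLogger("json_demo")
json_logger.setLevel(logging.DEBUG)
json_logger.propagate = False
json_logger.addHandler(handler)

json_logger.info("User logged in", extra={"user_id": 123})
parsed = json.loads(stream.getvalue())
print(parsed)
```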

Assertions for Defensive Programming

Assertions catch bugs early during development:

def divide(a, b):
    """Divide a by b with assertion checks."""
    assert isinstance(a, (int, float)), f"Expected number, got {type(a)}"
    assert isinstance(b, (int, float)), f"Expected number, got {type(b)}"
    assert b != 0, "Division by zero"
    return a / b

def process_list(items):
    """Process a list of items safely."""
    assert isinstance(items, list), f"Expected list, got {type(items)}"
    assert len(items) > 0, "List cannot be empty"
    assert all(isinstance(item, (int, float)) for item in items), "All items must be numbers"

    return sum(items) / len(items)

# Note: python -O script.py strips every assert statement, so never rely on
# assertions to validate user input or external data; raise exceptions for those.
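The effect of -O can be checked directly by running the same snippet in two child interpreters. The sketch below does that and then shows an explicit-exception variant of divide() that keeps working under -O; it assumes sys.executable points at a usable interpreter and that PYTHONOPTIMIZE is not set in the environment.

```python
import subprocess
import sys

snippet = "assert False, 'check failed'; print('assert skipped')"

# Normal run: the assert fires and the process exits with an error
normal = subprocess.run([sys.executable, "-c", snippet],
                        capture_output=True, text=True)

# Optimized run: the assert is compiled away and execution continues
optimized = subprocess.run([sys.executable, "-O", "-c", snippet],
                           capture_output=True, text=True)

def divide(a, b):
    """Divide a by b, validating input with exceptions instead of asserts."""
    if not isinstance(a, (int, float)) or not isinstance(b, (int, float)):
        raise TypeError(f"Expected numbers, got {type(a)} and {type(b)}")
    if b == 0:
        raise ValueError("Division by zero")
    return a / b

print("normal exit code:", normal.returncode)
print("optimized output:", optimized.stdout.strip())
```

A reasonable split is asserts for internal invariants that should never be false, and explicit exceptions for anything a caller or user can get wrong.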

Custom Exception Classes

class ValidationError(Exception):
    """Custom exception for validation errors."""
    pass

class User:
    def __init__(self, name, email, age):
        if not name:
            raise ValidationError("Name is required")
        if not email or '@' not in email:
            raise ValidationError(f"Invalid email: {email}")
        if age < 0 or age > 150:
            raise ValidationError(f"Invalid age: {age}")

        self.name = name
        self.email = email
        self.age = age

# Usage
try:
    user = User("Alice", "alice@example.com", 30)
except ValidationError as e:
    print(f"Validation failed: {e}")

Systematic Debugging Workflow

Follow this process to find and fix bugs:

# 1. REPRODUCE — Create minimal reproduction case
def buggy_function(data):
    # Bug happens with specific input
    return process(data)

# Test with minimal input that triggers bug
result = buggy_function([1, 2, -1, 3])  # Bug occurs here

# 2. ISOLATE — Narrow down the problem
def buggy_function(data):
    print(f"Input: {data}")  # Check input

    filtered = [x for x in data if x > 0]
    print(f"Filtered: {filtered}")  # Check intermediate result

    result = sum(filtered) / len(filtered)  # Bug might be here
    print(f"Result: {result}")

    return result

# 3. UNDERSTAND — Read error message and traceback
try:
    buggy_function([])
except ZeroDivisionError as e:
    import traceback
    traceback.print_exc()  # Full traceback

# 4. FIX — Apply the fix
def buggy_function(data):
    filtered = [x for x in data if x > 0]
    if not filtered:
        return 0  # Handle empty list
    return sum(filtered) / len(filtered)

# 5. TEST — Verify the fix
assert buggy_function([1, 2, -1, 3]) == 2.0
assert buggy_function([]) == 0
assert buggy_function([-1, -2]) == 0
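The bare asserts in step 5 disappear once the debugging session ends. To keep the fix protected, the same checks can live in a small regression test. This is a sketch using unittest; average_of_positives is a hypothetical name for the fixed function, and running the suite programmatically is only done here so the example is self-contained.

```python
import io
import unittest

def average_of_positives(data):
    """Fixed version: average of the positive values, 0 if there are none."""
    positives = [x for x in data if x > 0]
    if not positives:
        return 0
    return sum(positives) / len(positives)

class AverageOfPositivesRegressionTests(unittest.TestCase):
    def test_mixed_input(self):
        self.assertEqual(average_of_positives([1, 2, -1, 3]), 2.0)

    def test_empty_input_no_longer_raises(self):
        self.assertEqual(average_of_positives([]), 0)

    def test_all_negative_input(self):
        self.assertEqual(average_of_positives([-1, -2]), 0)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    AverageOfPositivesRegressionTests
)
report = io.StringIO()
outcome = unittest.TextTestRunner(stream=report).run(suite)
print(f"ran {outcome.testsRun} tests, success={outcome.wasSuccessful()}")
```

In a real project the test class would sit in a test module and be picked up by python -m unittest or a similar runner.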

Using repr() for Better Debugging

class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def __repr__(self):
        return f"User(name={self.name!r}, email={self.email!r})"

    def __str__(self):
        return f"{self.name} <{self.email}>"

# Without __repr__, debugging shows: <__main__.User object at 0x...>
# With __repr__, debugging shows: User(name='Alice', email='alice@example.com')
user = User("Alice", "alice@example.com")
print(repr(user))  # User(name='Alice', email='alice@example.com')
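One practical reason to define __repr__ is that containers use it for their elements, so lists and dicts of your objects become readable in print output, log lines and pdb. A brief sketch, reusing the same User class:

```python
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email

    def __repr__(self):
        return f"User(name={self.name!r}, email={self.email!r})"

    def __str__(self):
        return f"{self.name} <{self.email}>"

users = [User("Alice", "alice@example.com")]

as_str = f"{users[0]}"      # uses __str__
as_repr = f"{users[0]!r}"   # !r forces __repr__
in_list = str(users)        # containers always use __repr__ for elements

print(as_str)
print(as_repr)
print(in_list)
```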

Debugging Memory Issues

import sys
import tracemalloc

# Check memory usage of objects
def check_memory():
    data = [i for i in range(1000000)]
    # getsizeof() is shallow: it counts the list's own storage, not the ints it references
    print(f"List size: {sys.getsizeof(data)} bytes")
    print(f"Per element: {sys.getsizeof(data[0])} bytes")

# Profile memory with tracemalloc
def profile_memory():
    tracemalloc.start()

    # Code to profile
    data = [i * 2 for i in range(1000000)]
    filtered = [x for x in data if x % 3 == 0]

    snapshot = tracemalloc.take_snapshot()
    stats = snapshot.statistics('lineno')

    print("[ Top 5 memory users ]")
    for stat in stats[:5]:
        print(stat)

# Find memory leaks
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

# Reference cycle: reference counting alone cannot free these nodes;
# they stay in memory until the cyclic garbage collector runs
def create_leak():
    a = Node(1)
    b = Node(2)
    a.next = b
    b.next = a  # Circular reference!
    return a

# Fix: use weakref or explicitly break references
import weakref

class NodeFixed:
    def __init__(self, value):
        self.value = value
        self._next = None

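    @property
    def next(self):
        # Dereference the weak reference; None if unset or already collected
        return self._next() if self._next is not None else None

    @next.setter
    def next(self, node):
        self._next = weakref.ref(node) if node is not None else None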
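Whether a cycle is still alive can be observed with a weak reference used as a probe. The sketch below relies on CPython behaviour: with automatic collection paused, the two nodes survive after their function returns, and an explicit gc.collect() reclaims them. The names make_cycle and probe are illustrative.

```python
import gc
import weakref

class Node:
    def __init__(self, value):
        self.value = value
        self.next = None

def make_cycle():
    """Build two nodes that reference each other; return only a weak probe."""
    a = Node(1)
    b = Node(2)
    a.next = b
    b.next = a
    return weakref.ref(a)

was_enabled = gc.isenabled()
gc.disable()  # pause automatic collection so the demo is deterministic
try:
    probe = make_cycle()
    alive_before = probe() is not None  # cycle keeps both nodes alive
    gc.collect()                        # cyclic collector frees the pair
    alive_after = probe() is not None
finally:
    if was_enabled:
        gc.enable()

print(f"alive before collect: {alive_before}, after: {alive_after}")
```

Cycles like this become a real leak mainly when something else keeps them reachable or when garbage collection is disabled, which is where tracemalloc snapshots help narrow down the source.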

Debugging Race Conditions

import logging
import threading
import time

# Race condition example
counter = 0

def unsafe_increment():
    global counter
    for _ in range(100000):
        temp = counter
        counter = temp + 1  # Race condition!

# Debug with logging
def debug_increment(name):
    global counter
    for _ in range(100):
        temp = counter
        time.sleep(0.0001)  # Expose race condition
        counter = temp + 1
        logging.debug(f"{name}: counter = {counter}")

# Fix with lock
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100000):
        with lock:
            counter += 1
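A fix for a race condition is only convincing once it has been exercised with several threads. The following sketch wraps the locked counter in a function and checks the final value; run_locked_counter and its parameters are hypothetical names chosen for this example.

```python
import threading

def run_locked_counter(worker_count=4, increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def work():
        nonlocal counter
        for _ in range(increments):
            with lock:  # only one thread may read-modify-write at a time
                counter += 1

    threads = [threading.Thread(target=work) for _ in range(worker_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

final_count = run_locked_counter()
print(final_count)
```

With the lock in place the result equals worker_count * increments on every run; without it the total may come up short, and not reliably so, which is what makes such bugs hard to reproduce.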

Debugging with Thread Names

import threading
import logging
import time

logging.basicConfig(level=logging.DEBUG, format='%(threadName)s - %(message)s')

def worker(name):
    logging.info(f"Starting {name}")
    time.sleep(1)
    logging.info(f"Completed {name}")

threads = [
    threading.Thread(target=worker, args=(f"Worker-{i}",), name=f"Thread-{i}")
    for i in range(3)
]

for t in threads:
    t.start()
for t in threads:
    t.join()

Real-World Examples

Example 1: Debugging a Slow Function

import cProfile

def slow_function():
    """Function with performance issues."""
    data = []
    for i in range(100000):
        data.append(i * 2)

    filtered = []
    for item in data:
        if item % 3 == 0:
            filtered.append(item)

    total = 0
    for item in filtered:
        total += item

    return total

# Profile to find bottleneck
cProfile.run('slow_function()')

# Fix: use list comprehensions and built-in functions
def fast_function():
    data = [i * 2 for i in range(100000)]
    filtered = [x for x in data if x % 3 == 0]
    return sum(filtered)
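cProfile.run() prints every function it saw, which gets noisy quickly. A common refinement, sketched here, is to collect the data with cProfile.Profile and let pstats sort and trim the report. build_filtered_total is a hypothetical workload with the same loop-and-append shape as slow_function.

```python
import cProfile
import io
import pstats

def build_filtered_total(n):
    """Double each number, then sum the results divisible by 3."""
    data = []
    for i in range(n):
        data.append(i * 2)
    return sum(x for x in data if x % 3 == 0)

profiler = cProfile.Profile()
total = profiler.runcall(build_filtered_total, 10_000)

# Sort by cumulative time and show only the five most expensive entries
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Sorting by cumulative time puts the functions that dominate the run at the top, which is usually where optimization should start.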

Example 2: Debugging Network Issues

import requests
import logging

logging.basicConfig(level=logging.DEBUG)

def fetch_with_debug(url):
    """Fetch URL with detailed debugging."""
    try:
        logging.debug(f"Making request to {url}")
        response = requests.get(url, timeout=5)
        logging.debug(f"Response status: {response.status_code}")
        logging.debug(f"Response headers: {dict(response.headers)}")

        response.raise_for_status()
        return response.json()

    except requests.exceptions.Timeout:
        logging.error(f"Request to {url} timed out")
        raise
    except requests.exceptions.ConnectionError as e:
        logging.error(f"Connection error: {e}")
        raise
    except requests.exceptions.HTTPError as e:
        logging.error(f"HTTP error: {e}")
        raise
    except Exception as e:
        logging.exception(f"Unexpected error: {e}")
        raise

# Usage
try:
    data = fetch_with_debug("https://api.example.com/data")
except Exception as e:
    print(f"Failed to fetch data: {e}")

Example 3: Debugging Database Queries

import sqlite3
import logging
import time

class DebugCursor:
    """Wrapper that logs SQL queries."""

    def __init__(self, cursor):
        self.cursor = cursor

    def execute(self, query, params=None):
        logging.debug(f"SQL: {query}")
        if params:
            logging.debug(f"Params: {params}")
        start = time.perf_counter()  # monotonic clock suited to measuring durations
        result = self.cursor.execute(query, params or ())
        elapsed = time.perf_counter() - start
        logging.debug(f"Query took {elapsed:.4f}s")
        return result

    def __getattr__(self, name):
        return getattr(self.cursor, name)

def debug_database():
    conn = sqlite3.connect('app.db')
    cursor = DebugCursor(conn.cursor())

    # All queries will be logged
    cursor.execute("SELECT * FROM users WHERE age > ?", (25,))
    users = cursor.fetchall()
    logging.debug(f"Found {len(users)} users")
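For sqlite3 specifically, the standard library offers a built-in alternative to a wrapper class: Connection.set_trace_callback() invokes a function for each SQL statement the connection runs. A minimal sketch against an in-memory database follows; the table and data are made up for the example.

```python
import sqlite3

statements = []

conn = sqlite3.connect(":memory:")
# Every statement executed on this connection is passed to the callback
conn.set_trace_callback(statements.append)

conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("Alice", 30))
conn.execute("INSERT INTO users VALUES (?, ?)", ("Bob", 20))
rows = conn.execute("SELECT name FROM users WHERE age > ?", (25,)).fetchall()
conn.close()

for sql in statements:
    print("SQL:", sql)
print(rows)
```

The callback also sees statements that sqlite3 issues implicitly, such as transaction control, so the trace can contain more entries than the code executes explicitly. It does not record timings, so a wrapper like DebugCursor remains useful when query duration is the question.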

Common Mistakes

Mistake	Problem	Solution
Using print for production debugging	No log levels, no persistence	Use logging module
Not reading error messages	Miss obvious solutions	Read traceback carefully
Debugging in production	Affects users	Use logging and monitoring
Not reproducing first	Can't verify fix	Create minimal reproduction
Changing multiple things at once	Can't identify what fixed it	Change one thing at a time
Not adding regression test	Bug may return	Add test after fixing

Best Practices

# 1. Use logging instead of print
import logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger(__name__)

def process(data):
    logger.debug(f"Processing: {data}")
    result = transform(data)
    logger.debug(f"Result: {result}")
    return result

# 2. Use breakpoint() for interactive debugging
def complex_function(data):
    result = step1(data)
    breakpoint()  # Inspect state here
    return step2(result)

# 3. Use assertions for invariants
def process_order(order):
    assert order is not None, "Order cannot be None"
    assert len(order['items']) > 0, "Order must have items"
    # ... process order

# 4. Use repr() for debugging output
class MyClass:
    def __init__(self, attr):
        self.attr = attr

    def __repr__(self):
        return f"MyClass(attr={self.attr!r})"

# 5. Log at appropriate levels
logger.debug("Detailed info")      # Development only
logger.info("General info")        # Normal operation
logger.warning("Something odd")    # Unexpected but handled
logger.error("Something failed")   # Operation failed
logger.critical("System failure")  # Program cannot continue

Key Takeaways

Use breakpoint() for interactive debugging — it's built into Python 3.7+
Use logging instead of print for production code — it provides levels, timestamps, and persistence
Assertions catch bugs early in development — use them to verify assumptions
Follow the reproduce -> isolate -> understand -> fix -> test workflow
Read error messages and tracebacks carefully — they usually tell you exactly what's wrong
Use repr() for clearer debug output of custom objects
Add regression tests after fixing bugs to prevent them from returning
