Autoencoders — Learning Compressed Representations

Discover how autoencoders learn efficient data encodings by compressing and reconstructing inputs.

Dimensionality reduction — learn compact data representations
Feature learning — automatically discover important features
Anomaly detection — identify unusual patterns in data

Autoencoders — Complete Guide

Autoencoders learn a compressed representation by encoding input to a bottleneck and decoding back. The key insight: if the bottleneck has fewer dimensions than the input, the network must learn the most important features.

Autoencoder Architecture

Encoder (compression):

f_\theta: \mathbb{R}^d \to \mathbb{R}^k, \quad k \ll d

Decoder (reconstruction):

g_\phi: \mathbb{R}^k \to \mathbb{R}^d

Loss (minimize reconstruction error over the training data):

\mathcal{L}(\theta, \phi) = \mathbb{E}_{\mathbf{x}}[\|\mathbf{x} - g_\phi(f_\theta(\mathbf{x}))\|^2]

How autoencoders learn compressed representations: The diagram shows the information bottleneck architecture.

Encoder (blue, left): compresses a 784-dimensional input (e.g., a flattened 28×28 MNIST image) through progressively smaller layers (784→256→128→64→32), so only the most essential features survive.
Bottleneck (red circle): the compressed latent representation z, just 32 dimensions that must carry everything needed to reconstruct the original image.
Decoder (green, right): mirrors the encoder, expanding from 32 back to 784 dimensions to produce the reconstruction x̂.
Loss: ||x - x̂||² measures reconstruction error; training minimizes the difference between input and output.

The mathematical formulation at the bottom of the diagram restates this: the encoder f_θ maps the high-dimensional input to the low-dimensional latent space, and the decoder g_φ maps it back. Because the bottleneck is smaller than the input, the network cannot simply copy its input and must learn a compressed representation that captures the data's essential structure.
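To make the encode, decode, and minimize-reconstruction-error loop concrete, here is a minimal sketch in plain Python. It is not the 784→32 network described above: it assumes a toy 2→1→2 linear autoencoder, hand-derived gradients, and five made-up 2-D points lying on the line y = 2x. The function names, initial weights, learning rate, and epoch count are all illustrative choices.

```python
def train_linear_autoencoder(data, lr=0.02, epochs=300):
    """Train a 2 -> 1 -> 2 linear autoencoder with full-batch gradient descent."""
    w = [0.5, 0.5]  # encoder weights, f: R^2 -> R^1 (arbitrary fixed init)
    v = [0.5, 0.5]  # decoder weights, g: R^1 -> R^2 (arbitrary fixed init)
    n = len(data)
    history = []  # mean squared reconstruction error per epoch
    for _ in range(epochs):
        grad_w, grad_v, loss = [0.0, 0.0], [0.0, 0.0], 0.0
        for x in data:
            z = w[0] * x[0] + w[1] * x[1]           # encode to the 1-D bottleneck
            x_hat = [v[0] * z, v[1] * z]            # decode back to 2-D
            r = [x_hat[0] - x[0], x_hat[1] - x[1]]  # residual x_hat - x
            loss += r[0] ** 2 + r[1] ** 2
            dz = 2 * (r[0] * v[0] + r[1] * v[1])    # dL/dz
            for i in range(2):
                grad_v[i] += 2 * r[i] * z           # dL/dv_i
                grad_w[i] += dz * x[i]              # dL/dw_i
        for i in range(2):
            w[i] -= lr * grad_w[i] / n
            v[i] -= lr * grad_v[i] / n
        history.append(loss / n)
    return w, v, history


def reconstruct(x, w, v):
    """Pass one point through the encoder and decoder."""
    z = w[0] * x[0] + w[1] * x[1]
    return [v[0] * z, v[1] * z]


# Toy data: 2-D points that really live on a 1-D line, so a 1-D code suffices.
data = [(t, 2 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w, v, history = train_linear_autoencoder(data)
print(f"loss: {history[0]:.4f} -> {history[-1]:.6f}")
print(reconstruct((1.0, 2.0), w, v))
```

Because the data has only one degree of freedom, a single latent number is enough for near-perfect reconstruction. On real images the bottleneck forces a lossy trade-off instead.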

Types of Autoencoders

Variational Autoencoder (VAE)

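The VAE loss has two terms.

Reconstruction:

-\mathbb{E}_{q_\phi(\mathbf{z}|\mathbf{x})}[\log p_\theta(\mathbf{x}|\mathbf{z})]

(how well the decoder reconstructs the input)

KL divergence:

\text{KL}(q_\phi(\mathbf{z}|\mathbf{x}) \| p(\mathbf{z}))

(regularizes the latent space toward the prior p(z) = N(0, I))

Closed form for a diagonal Gaussian encoder:

\text{KL} = -\frac{1}{2} \sum_j (1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2)

This term keeps the latent space smooth and centered. Note that, following the usual VAE convention, \phi here parameterizes the encoder q and \theta the decoder p, the reverse of the f_\theta / g_\phi notation used for the plain autoencoder above.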
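The closed-form KL term is easy to sanity-check by hand. The sketch below evaluates it in plain Python for a few small cases, assuming the encoder outputs a mean and a log-variance per latent dimension. The function name `gaussian_kl` is an illustrative choice, not part of any library.

```python
import math


def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


# Posterior equal to the prior: no penalty.
print(gaussian_kl([0.0, 0.0], [0.0, 0.0]))
# Mean shifted by 1 with unit variance: penalty of mu^2 / 2.
print(gaussian_kl([1.0], [0.0]))
# Zero mean but variance 4: penalty of (sigma^2 - 1 - log sigma^2) / 2.
print(gaussian_kl([0.0], [math.log(4.0)]))
```

The penalty is zero only when the encoder outputs exactly the prior, and it grows as the mean drifts from 0 or the variance drifts from 1. That is what pulls the latent codes toward N(0, I).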

Applications

Autoencoder Applications

Dimensionality Reduction: Learn nonlinear PCA — map high-dim data to low-dim latent space
Anomaly Detection: High reconstruction error = anomaly (train on normal data only)
Image Denoising: Denoising AE learns to remove noise from corrupted inputs
Image Compression: Encode images into a latent code much smaller than the raw pixels; learned codecs of this kind can be competitive with JPEG at low bitrates
Data Generation: VAE samples from latent space to generate new data points
Style Transfer: Encode content and style separately, combine in latent space
Drug Discovery: VAE encodes molecular structures, generates novel molecules
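The anomaly-detection entry in the list above boils down to a thresholding rule on reconstruction error. The sketch below shows that rule in plain Python under loud assumptions: the "trained autoencoder" is a hypothetical hard-coded tied-weight linear model for 2-D data near the line y = 2x, the sample points are made up, and the mean-plus-k-standard-deviations threshold is one common heuristic rather than the only option.

```python
import math
import statistics

# Hypothetical "already trained" tied-weight linear autoencoder:
# encode z = U . x, decode x_hat = z * U, where U is the unit vector along y = 2x.
U = (1 / math.sqrt(5), 2 / math.sqrt(5))


def reconstruction_error(x):
    """Squared distance between x and its reconstruction."""
    z = U[0] * x[0] + U[1] * x[1]
    x_hat = (z * U[0], z * U[1])
    return (x[0] - x_hat[0]) ** 2 + (x[1] - x_hat[1]) ** 2


def fit_threshold(normal_points, k=3.0):
    """Threshold = mean + k * std of the errors on normal data only."""
    errors = [reconstruction_error(p) for p in normal_points]
    return statistics.mean(errors) + k * statistics.pstdev(errors)


def is_anomaly(x, threshold):
    return reconstruction_error(x) > threshold


# Made-up "normal" data: points close to the line y = 2x.
offsets = [(-1.0, 0.05), (-0.5, -0.03), (0.0, 0.02), (0.5, -0.04), (1.0, 0.01)]
normal = [(t, 2 * t + d) for t, d in offsets]
threshold = fit_threshold(normal)

print(is_anomaly((1.0, 2.0), threshold))   # on the line the model was built for
print(is_anomaly((1.0, -1.0), threshold))  # far off the line
```

The model only knows how to reconstruct points that look like its training data, so inputs from a different pattern come back badly reconstructed and exceed the threshold. In practice the threshold is usually tuned on held-out data.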

VAE Implementation

Example: Variational Autoencoder

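import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Shared encoder trunk: input_dim -> 256 -> 128
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU()
        )
        # Two heads: mean and log-variance of q(z|x)
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        # Decoder mirrors the encoder; Sigmoid keeps outputs in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid()
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I), so gradients reach mu and logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        x = x.flatten(start_dim=1)  # accept (B, 1, 28, 28) or (B, 784)
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x_recon, x, mu, logvar):
    # Binary cross-entropy requires targets in [0, 1] with the same shape as x_recon
    x = x.flatten(start_dim=1)
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # Closed-form KL(q(z|x) || N(0, I)), summed over batch and latent dimensions
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl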
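The `reparameterize` step is the least obvious part of the model, so here is a small standalone check in plain Python (no PyTorch). It is a sketch, with the noise `eps` passed in explicitly and with illustrative values for `mu`, `logvar`, and the sample count. It shows two things: the samples still follow N(mu, sigma²), and once `eps` is fixed, z is an ordinary differentiable function of `mu`.

```python
import math
import random
import statistics


def reparameterize(mu, logvar, eps):
    """z = mu + sigma * eps, a deterministic function once eps is drawn."""
    std = math.exp(0.5 * logvar)
    return mu + eps * std


rng = random.Random(0)
mu, logvar = 2.0, math.log(0.25)  # variance 0.25, so sigma = 0.5

# 1) Sampling through the trick reproduces the intended distribution.
samples = [reparameterize(mu, logvar, rng.gauss(0.0, 1.0)) for _ in range(20000)]
print(statistics.mean(samples), statistics.stdev(samples))

# 2) With eps held fixed, dz/dmu can be estimated by finite differences.
eps, h = 0.7, 1e-6
dz_dmu = (reparameterize(mu + h, logvar, eps) - reparameterize(mu, logvar, eps)) / h
print(dz_dmu)
```

Moving the randomness into `eps` is what lets an autograd framework backpropagate through the sampling step: the gradient with respect to `mu` and `logvar` is well defined for every drawn `eps`.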

Key Takeaways

Summary: Autoencoders

Autoencoders learn compressed representations via encode-decode bottleneck
VAE enables generation by learning a smooth probabilistic latent space
Denoising AE learns robust features by reconstructing clean inputs from corrupted ones
Reparameterization trick enables backpropagation through sampling
VAE loss = reconstruction + KL divergence (regularizes latent space to N(0,I))
Anomaly detection: High reconstruction error indicates unusual data
Autoencoders underpin latent diffusion models, which run the diffusion process in a pretrained autoencoder's latent space
Convolutional autoencoders replace dense layers with convolutions to suit image data

What to Learn Next

-> Variational Autoencoders: Deep dive into VAE theory and variants.

-> GANs: Explore adversarial generative models.

-> Diffusion Models: Learn state-of-the-art generative models.

-> Dimensionality Reduction: Master PCA and other techniques.

-> Neural Networks: Understand the foundation of deep learning.

-> CNNs: Learn about convolutional architectures.
