Autoencoders — Encoding, Decoding and Representation Learning

Autoencoders — Learning Compressed Representations

Discover how autoencoders learn efficient data encodings by compressing and reconstructing inputs.

  • Dimensionality reduction — learn compact data representations
  • Feature learning — automatically discover important features
  • Anomaly detection — identify unusual patterns in data

Simplicity is the ultimate sophistication.

Autoencoders — Complete Guide

Autoencoders learn a compressed representation by encoding the input into a bottleneck and decoding it back into a reconstruction. The key insight: if the bottleneck has fewer dimensions than the input, the network cannot simply copy its input and must learn the most important features.


Autoencoder Architecture

Figure: Autoencoder Architecture. Input x (784 dims) → Encoder q(z|x) (784 → 256 → 128 → 64 → 32; compresses 784 → 32) → Bottleneck: latent space z (32 dims, compressed representation) → Decoder p(x|z) (32 → 64 → 128 → 256 → 784; reconstructs 32 → 784) → Output x̂ (784 dims). Loss: ||x - x̂||² (MSE reconstruction).

Mathematical Formulation

  • Encoder: $f_\theta: \mathbb{R}^d \to \mathbb{R}^k$ where $k \ll d$ (compression)
  • Decoder: $g_\phi: \mathbb{R}^k \to \mathbb{R}^d$ (reconstruction)
  • Loss: $\mathcal{L}(\theta, \phi) = \mathbb{E}_{\mathbf{x}}[\|\mathbf{x} - g_\phi(f_\theta(\mathbf{x}))\|^2]$, which minimizes reconstruction error over the training data

How autoencoders learn compressed representations: The diagram shows the information bottleneck architecture. The encoder compresses a 784-dimensional input (e.g., a flattened 28×28 MNIST image) through progressively smaller layers (784 → 256 → 128 → 64 → 32), forcing the network to keep only the most essential features. The bottleneck is the compressed latent representation z: just 32 dimensions that must capture everything needed to reconstruct the original image. The decoder mirrors the encoder, expanding from 32 back to 784 dimensions to produce the reconstruction x̂. The loss ||x - x̂||² measures reconstruction error, and the network trains to minimize it. In the mathematical formulation, the encoder f_θ maps the high-dimensional input to the low-dimensional latent space and the decoder g_φ maps it back.
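The layer sizes described above can be sketched in PyTorch as follows. This is a minimal illustration, not a tuned implementation: the `Autoencoder` class name, the ReLU activations, the final Sigmoid (which assumes inputs scaled to [0, 1]), the Adam optimizer, and the random stand-in batch are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Autoencoder(nn.Module):
    """Fully connected autoencoder: 784 -> 256 -> 128 -> 64 -> 32 and back."""

    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, latent_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid(),  # assumes inputs in [0, 1]
        )

    def forward(self, x):
        z = self.encoder(x)         # latent code, shape (batch, 32)
        return self.decoder(z), z   # reconstruction, shape (batch, 784)


torch.manual_seed(0)
model = Autoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

x = torch.rand(16, 784)             # random stand-in for flattened 28x28 images
x_hat, z = model(x)
loss = F.mse_loss(x_hat, x)         # mean squared reconstruction error

# One gradient step on the reconstruction loss.
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In a real run the random batch would be replaced by a data loader over the training set, and the step would be repeated for several epochs.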


Types of Autoencoders

Autoencoder Variants

Vanilla AE
  • Deterministic encoder
  • MSE reconstruction loss
  • No generative capability
  • Good for dimensionality reduction
  • Use: PCA alternative, feature extraction
  • Flow: x → Encoder → z → Decoder → x̂

VAE
  • Probabilistic encoder
  • KL + reconstruction loss
  • Can generate new data
  • Smooth latent space
  • Reparameterization trick
  • Use: generation, anomaly detection
  • Flow: x → μ, σ → z ~ N(μ, σ²) → x̂

Denoising AE
  • Input corrupted with noise
  • Output is the clean version
  • Learns robust features
  • Prevents identity learning
  • Use: denoising, robust feature learning
  • Flow: x + noise → Encoder → z → x̂ (clean)

Conv AE
  • Conv layers in encoder
  • Transposed conv in decoder
  • Preserves spatial structure
  • Better for images
  • Parameter efficient
  • Use: image compression, image denoising
  • Flow: img → Conv → z → DeConv → reconstructed img
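The denoising variant differs from the vanilla one only in what goes in and what is compared: the model sees a corrupted input but is scored against the clean one. A minimal sketch, assuming Gaussian noise with a hypothetical `noise_std` of 0.3, a small stand-in model, and random data in place of real images:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Small stand-in model; any encoder/decoder pair could be used here.
model = nn.Sequential(
    nn.Linear(784, 64), nn.ReLU(),      # encoder
    nn.Linear(64, 784), nn.Sigmoid(),   # decoder
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)


def denoising_step(x_clean, noise_std=0.3):
    """One training step: corrupt the input, reconstruct the clean target."""
    noise = noise_std * torch.randn_like(x_clean)
    x_noisy = (x_clean + noise).clamp(0.0, 1.0)
    x_hat = model(x_noisy)
    loss = F.mse_loss(x_hat, x_clean)   # target is the clean x, not x_noisy
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


x = torch.rand(32, 784)                 # random stand-in for clean images
losses = [denoising_step(x) for _ in range(20)]
```

Because the input and the target differ, copying the input through is no longer an optimal solution, which is why this setup discourages identity learning.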

Variational Autoencoder (VAE)

Figure: VAE Architecture and Training. x (input) → Encoder q_φ(z|x), which outputs μ and log σ² → Reparameterize: z = μ + σ ⊙ ε, with ε ~ N(0, I) → Decoder p_θ(x|z), which generates x̂ from z (output).

Generation (at inference): N(0, I) → Decoder → Generated Sample

  1. Sample z ~ N(0, I)
  2. Decode: x̂ = decoder(z)
  3. The result is a new, realistic sample

VAE Loss Function (negative ELBO)

L = -E_q[log p(x|z)] + KL(q(z|x) || p(z))

  • Reconstruction loss: $-\mathbb{E}_{q_\phi(\mathbf{z}|\mathbf{x})}[\log p_\theta(\mathbf{x}|\mathbf{z})]$ measures how well the decoder reconstructs the input
  • KL divergence: $\text{KL}(q_\phi(\mathbf{z}|\mathbf{x}) \| p(\mathbf{z}))$ regularizes the latent space toward N(0, I)
  • Closed form for a Gaussian encoder: $\text{KL} = -\tfrac{1}{2} \sum_j (1 + \log \sigma_j^2 - \mu_j^2 - \sigma_j^2)$. This term pushes the latent space to be smooth and centered.
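To make the closed-form KL term concrete, the sketch below evaluates it for a few hand-picked values using only the standard library. The function name `kl_diag_gaussian` and the example inputs are illustrative assumptions; the formula is the one stated above, with `logvar` standing for log σ².

```python
import math


def kl_diag_gaussian(mu, logvar):
    """KL(N(mu, diag(sigma^2)) || N(0, I)) = -1/2 * sum(1 + log s^2 - mu^2 - s^2)."""
    return -0.5 * sum(
        1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar)
    )


# q equals the prior: every term in the sum is zero, so the KL is zero.
kl_at_prior = kl_diag_gaussian([0.0, 0.0], [0.0, 0.0])

# Mean shifted by 1 in one dimension: KL = 1/2 * mu^2 = 0.5.
kl_shifted = kl_diag_gaussian([1.0, 0.0], [0.0, 0.0])

# Narrower than the prior (sigma = 0.5): still a positive penalty.
kl_narrow = kl_diag_gaussian([0.0], [math.log(0.25)])
```

The penalty grows both when the mean drifts from zero and when the variance departs from one, which is what keeps encodings of different inputs from scattering far apart.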

Latent Space Visualization

Figure: VAE Latent Space Properties

  • Vanilla AE (gaps): clusters with gaps between them, so it cannot generate reliably
  • VAE (smooth): continuous, so you can interpolate between points
  • Interpolation: moving from z₁ to z₂ generates a smooth transition
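Latent interpolation can be sketched as plain linear blending between two codes. The helper name `interpolate` and the 2-D example vectors are assumptions for illustration; in practice z₁ and z₂ would come from encoding two inputs, and each intermediate vector would be passed through a trained decoder.

```python
def interpolate(z1, z2, steps=5):
    """Return `steps` latent vectors evenly spaced on the line from z1 to z2."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # 0.0 at z1, 1.0 at z2
        path.append([(1 - t) * a + t * b for a, b in zip(z1, z2)])
    return path


z1 = [0.0, 2.0]
z2 = [4.0, -2.0]
path = interpolate(z1, z2, steps=5)
# Each vector in `path` would then be decoded, e.g. x_hat = decoder(z).
```

With a vanilla autoencoder some of these intermediate points may fall into gaps the decoder never saw during training, whereas a VAE's regularized latent space makes them more likely to decode to plausible outputs.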

Applications

Autoencoder Applications

  1. Dimensionality Reduction: Acts as a nonlinear generalization of PCA, mapping high-dimensional data to a low-dimensional latent space
  2. Anomaly Detection: Train on normal data only; inputs with high reconstruction error are flagged as anomalies
  3. Image Denoising: A denoising AE learns to remove noise from corrupted inputs
  4. Image Compression: Store a compact latent code instead of raw pixels (lossy, in the same spirit as JPEG)
  5. Data Generation: A VAE samples from its latent space to generate new data points
  6. Style Transfer: Encode content and style separately, then combine them in latent space
  7. Drug Discovery: A VAE encodes molecular structures and generates novel candidate molecules
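The anomaly detection recipe from the list above (fit on normal data, flag high reconstruction error) can be sketched with NumPy. To keep the example short and deterministic, a linear autoencoder fitted in closed form via SVD (equivalent to PCA) stands in for a trained neural autoencoder; the synthetic data, the 2-D bottleneck, and the 99th-percentile threshold are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: 10-D points lying close to a 2-D subspace.
basis = rng.normal(size=(2, 10))
normal_train = rng.normal(size=(500, 2)) @ basis + 0.05 * rng.normal(size=(500, 10))

# Linear autoencoder in closed form: the top-2 right singular vectors give
# the encoder W (10 -> 2); its transpose is the decoder (2 -> 10).
mean = normal_train.mean(axis=0)
_, _, vt = np.linalg.svd(normal_train - mean, full_matrices=False)
W = vt[:2].T                                     # shape (10, 2)


def reconstruction_error(x):
    z = (x - mean) @ W                           # encode
    x_hat = z @ W.T + mean                       # decode
    return np.sum((x - x_hat) ** 2, axis=1)      # per-sample squared error


# Threshold taken from errors on normal data (the percentile is an assumption).
threshold = np.percentile(reconstruction_error(normal_train), 99)

normal_test = rng.normal(size=(5, 2)) @ basis + 0.05 * rng.normal(size=(5, 10))
anomalies = 3.0 * rng.normal(size=(5, 10))       # points far off the subspace

is_anomaly = reconstruction_error(anomalies) > threshold
```

The same thresholding logic applies unchanged when `reconstruction_error` is computed with a neural autoencoder; only the encode and decode steps change.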

VAE Implementation

Example: Variational Autoencoder

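import torch
import torch.nn as nn
import torch.nn.functional as F


class VAE(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Shared encoder trunk: input_dim -> 256 -> 128
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU()
        )
        # Two heads: mean and log-variance of q(z|x)
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        # Decoder mirrors the encoder; Sigmoid keeps outputs in [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid()
        )

    def reparameterize(self, mu, logvar):
        # z = mu + sigma * eps with eps ~ N(0, I); the randomness lives in eps,
        # so gradients can flow through mu and logvar
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def forward(self, x):
        # x: flattened input of shape (batch, input_dim)
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

def vae_loss(x_recon, x, mu, logvar):
    # Reconstruction term: BCE summed over pixels and batch (x must be in [0, 1])
    recon = F.binary_cross_entropy(x_recon, x, reduction='sum')
    # KL(q(z|x) || N(0, I)) in closed form, summed over latent dims and batch
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl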
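A VAE like the one above might be trained and then sampled from roughly as follows. So that the block runs on its own, it defines `TinyVAE`, a condensed hypothetical stand-in with the same interface as the `VAE` class, and repeats `vae_loss`; the Adam optimizer, learning rate, batch size, and random stand-in data are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyVAE(nn.Module):
    """Condensed stand-in for the VAE class so this block is self-contained."""

    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU())
        self.mu = nn.Linear(128, latent_dim)
        self.logvar = nn.Linear(128, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid(),
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decoder(z), mu, logvar


def vae_loss(x_recon, x, mu, logvar):
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl


torch.manual_seed(0)
model = TinyVAE()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Training: random values in [0, 1] stand in for a batch of flattened images.
x = torch.rand(64, 784)
model.train()
for _ in range(5):
    x_recon, mu, logvar = model(x)
    loss = vae_loss(x_recon, x, mu, logvar) / x.size(0)  # average per sample
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Generation: sample z from the prior N(0, I) and decode; the encoder is unused.
model.eval()
with torch.no_grad():
    z = torch.randn(8, 32)
    samples = model.decoder(z)       # shape (8, 784), values in [0, 1]
```

Dividing the summed loss by the batch size keeps the gradient scale independent of how many samples are in a batch, which tends to make the learning rate easier to choose.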

Key Takeaways

Summary: Autoencoders

  • Autoencoders learn compressed representations via encode-decode bottleneck
  • VAE enables generation by learning a smooth probabilistic latent space
  • Denoising AE learns robust features by reconstructing clean from corrupted
  • Reparameterization trick enables backpropagation through sampling
  • VAE loss = reconstruction + KL divergence (regularizes latent space to N(0,I))
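  • Anomaly detection: when the model is trained on normal data only, high reconstruction error indicates unusual data
  • Autoencoders are a building block of latent diffusion models, which run the diffusion process in an autoencoder's latent space
  • Convolutional AEs preserve spatial structure, which makes them a better fit for images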

What to Learn Next

-> Variational Autoencoders: Deep dive into VAE theory and variants.

-> GANs: Explore adversarial generative models.

-> Diffusion Models: Learn state-of-the-art generative models.

-> Dimensionality Reduction: Master PCA and other techniques.

-> Neural Networks: Understand the foundation of deep learning.

-> CNNs: Learn about convolutional architectures.
