Meta-Learning — Learning to Learn from Few Examples

Discover meta-learning algorithms that enable models to learn new tasks quickly with minimal data. These algorithms are the key to few-shot learning and rapid adaptation.

MAML — Model-Agnostic Meta-Learning for fast adaptation
Prototypical Networks — Learning metric spaces for classification
Reinforcement Learning — Meta-learning with reward signals

"The most important skill is learning how to learn."

Meta-Learning — Learning to Learn

Meta-learning trains models to learn new tasks quickly from few examples.

Meta-Learning Concept

The Formal Framework

Definition: Meta-Learning

Meta-learning, or "learning to learn," trains a model across a distribution of tasks $\mathcal{T} \sim p(\mathcal{T})$ so that it can adapt to a new task from minimal data, typically with a small number of gradient steps.

Bi-Level Optimization

The meta-learning objective is a bi-level optimization problem:

Outer loop (meta-update):

$$\theta^* = \arg\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(\phi_i)$$

Inner loop (task adaptation):

$$\phi_i = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(\theta)$$

where $\theta$ is the meta-learned initialization, $\alpha$ is the inner-loop step size, and $\phi_i$ is the adapted model for task $i$. The inner loss is computed on the task's support set and the outer loss on its query set.
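
To see where the second-order terms in the outer loop come from, it can help to differentiate the outer loss through a single inner step. The following is a sketch for the one-step update above; $I$ (the identity matrix) and $\nabla^2_\theta$ (the Hessian) are notation introduced here for illustration:

$$
\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(\phi_i)
= \left(\frac{\partial \phi_i}{\partial \theta}\right)^{\top} \nabla_{\phi} \mathcal{L}_{\mathcal{T}_i}(\phi_i)
= \left(I - \alpha \nabla^2_\theta \mathcal{L}_{\mathcal{T}_i}(\theta)\right) \nabla_{\phi} \mathcal{L}_{\mathcal{T}_i}(\phi_i)
$$

Dropping the Hessian term leaves only $\nabla_{\phi} \mathcal{L}_{\mathcal{T}_i}(\phi_i)$, which is the first-order approximation of the meta-gradient.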

MAML Algorithm

[Figure: MAML overview. A meta-initialization $\theta$ is learned across all tasks. For each task (Task 1: Cat vs Dog, Task 2: Hot vs Cold, Task 3: Red vs Blue, ..., Task K), a 5-shot support set with 5 labeled examples per class drives an inner loop of 1-5 steps, $\phi_i = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(\theta)$, which adapts the model to task $i$. A query set of unseen examples then supplies the meta-loss $\sum_i \mathcal{L}_{\mathcal{T}_i}(\phi_i)$, which is used to update $\theta$ with step size $\beta$.]

MAML Algorithm Details

Definition: MAML (Model-Agnostic Meta-Learning)

An optimization-based meta-learning algorithm that finds a model initialization $\theta$ that can be rapidly adapted to new tasks with a few gradient steps.

MAML Pseudocode

Algorithm: MAML
Input: Task distribution p(T), step sizes α (inner), β (outer)
Output: Meta-learned initialization θ

Randomly initialize θ
for each meta-training iteration do
  Sample batch of tasks Tᵢ ~ p(T)
  for each task Tᵢ do
    Sample support set Sᵢ = {(xₐ, yₐ)} from Tᵢ
    Compute gradient: gᵢ = ∇_θ ℒ_{Tᵢ}(θ; Sᵢ)
    Adapt: φᵢ = θ − α · gᵢ              // Inner loop
  end for
  Sample query sets Qᵢ from each Tᵢ
  Compute meta-gradient: ∇_θ Σᵢ ℒ_{Tᵢ}(φᵢ; Qᵢ)
  Update: θ ← θ − β · ∇_θ Σᵢ ℒ_{Tᵢ}(φᵢ; Qᵢ)  // Outer loop
end for
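
The pseudocode can be made concrete on a deliberately tiny problem. The sketch below is an illustration, not a reference implementation: it assumes a hypothetical task family of 1-D regressions $y = a x$ with slope $a$ drawn from $[1, 3]$, a scalar model $\hat{y} = \theta x$, and a single inner step. Gradients and the Hessian are written out analytically so the example runs with the standard library only. The function names and hyperparameter values are assumptions chosen for this example.

```python
import random

def loss(w, data):
    """Mean squared error of the scalar model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    """Analytic dL/dw for the loss above."""
    return sum(2 * x * (w * x - y) for x, y in data) / len(data)

def hessian(w, data):
    """Analytic d2L/dw2 (constant in w because the loss is quadratic)."""
    return sum(2 * x * x for x, _ in data) / len(data)

def sample_task(rng):
    """Hypothetical task: y = a * x with a random slope a in [1, 3].
    Returns a function that draws n labeled examples from the task."""
    a = rng.uniform(1.0, 3.0)
    def draw(n):
        return [(x, a * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]
    return draw

def maml_train(alpha=0.1, beta=0.05, iterations=500, tasks_per_batch=4,
               k_shot=5, seed=0):
    rng = random.Random(seed)
    theta = 0.0                                   # initialize theta
    for _ in range(iterations):                   # meta-training iterations
        meta_grad = 0.0
        for _ in range(tasks_per_batch):          # batch of tasks
            draw = sample_task(rng)
            support, query = draw(k_shot), draw(k_shot)
            # Inner loop: one gradient step on the support set.
            phi = theta - alpha * grad(theta, support)
            # Chain rule through the inner step: d(phi)/d(theta) = 1 - alpha * H.
            dphi_dtheta = 1.0 - alpha * hessian(theta, support)
            # Query-set gradient at phi, pulled back to theta.
            meta_grad += dphi_dtheta * grad(phi, query)
        theta -= beta * meta_grad                 # outer loop update
    return theta
```

Calling `maml_train()` returns a scalar initialization. On this task family it should settle near the middle of the slope range, since that is the starting point from which one gradient step helps most on average. With a neural network, the analytic `hessian` would be replaced by automatic differentiation through the inner step.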

MAML Variants

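First-Order MAML (FOMAML): Approximates the meta-gradient by ignoring second-order terms, which avoids backpropagating through the inner loop. It is cheaper per update and often performs close to full MAML.
Reptile: Runs several SGD steps on a sampled task, then moves the initialization toward the adapted weights. No second-order gradients or separate query set are needed.
MAML++: Stabilizes MAML training with techniques such as learned per-layer, per-step inner-loop learning rates, a multi-step loss, and per-step batch-normalization statistics.
ANIL (Almost No Inner Loop): Adapts only the head (last layer) in the inner loop. Its results suggest that feature reuse, rather than rapid adaptation of every layer, accounts for much of MAML's performance.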
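
To make the difference between two of these variants concrete, the sketch below writes the FOMAML and Reptile updates for the same kind of toy problem: a scalar model $\hat{y} = \theta x$ on hypothetical tasks $y = a x$. All names (`fomaml_step`, `reptile_step`, `epsilon`) and hyperparameter values are assumptions for illustration, and each call to `draw()` stands in for sampling a fresh mini-batch from the task.

```python
import random

def grad(w, data):
    """Analytic gradient of mean squared error for the model y_hat = w * x."""
    return sum(2 * x * (w * x - y) for x, y in data) / len(data)

def make_task(rng, n=5):
    """Hypothetical task y = a * x; returns a function drawing n examples."""
    a = rng.uniform(1.0, 3.0)
    def draw():
        return [(x, a * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(n))]
    return draw

def fomaml_step(theta, draw, alpha, beta):
    """First-order MAML: treat d(phi)/d(theta) as 1, i.e. drop the Hessian term."""
    phi = theta - alpha * grad(theta, draw())   # adapt on a support batch
    return theta - beta * grad(phi, draw())     # query gradient evaluated at phi

def reptile_step(theta, draw, alpha, epsilon, inner_steps=3):
    """Reptile: take a few SGD steps on the task, then move theta toward the result."""
    phi = theta
    for _ in range(inner_steps):
        phi -= alpha * grad(phi, draw())
    return theta + epsilon * (phi - theta)

def train(step_fn, iterations=2000, seed=0, **hyper):
    """Apply one variant's update repeatedly, one sampled task per iteration."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(iterations):
        theta = step_fn(theta, make_task(rng), **hyper)
    return theta

theta_fomaml = train(fomaml_step, alpha=0.1, beta=0.05)
theta_reptile = train(reptile_step, alpha=0.1, epsilon=0.1)
```

Neither update computes a Hessian: FOMAML reuses the query gradient at the adapted weights, while Reptile only needs the difference between adapted and initial weights.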

Prototypical Networks

Few-Shot Learning Scenarios

Definition: N-way K-shot Classification

In N-way K-shot classification, the model must distinguish between $N$ classes using only $K$ labeled examples per class. This is the standard evaluation protocol for few-shot classification.
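
An N-way K-shot episode can be built by sampling classes and then splitting each class's examples into support and query parts. The sketch below assumes a hypothetical dataset layout, a dict mapping each class label to a list of examples; the function name and the `n_query` parameter are likewise illustrative.

```python
import random

def sample_episode(dataset, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode.

    dataset: dict mapping class label -> list of examples (assumed layout).
    Returns (support, query), each a list of (example, episode_label) pairs,
    where episode_label is an index in 0..n_way-1 local to this episode.
    """
    classes = rng.sample(sorted(dataset), n_way)          # pick N classes
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        # Draw K + n_query distinct examples so support and query do not overlap.
        examples = rng.sample(dataset[cls], k_shot + n_query)
        support += [(x, episode_label) for x in examples[:k_shot]]
        query += [(x, episode_label) for x in examples[k_shot:]]
    return support, query

# Toy data: 10 classes with 20 string "examples" each.
toy = {f"class_{c}": [f"c{c}_img{i}" for i in range(20)] for c in range(10)}
support, query = sample_episode(toy, n_way=5, k_shot=1, n_query=3,
                                rng=random.Random(0))
```

Relabeling classes as 0..N-1 within each episode keeps the model from memorizing a fixed mapping from class to output index.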

Learning Scenarios

Few-Shot Learning (N-way, K-shot):

$N$ classes, $K$ support examples per class (typically $K \leq 5$)
Query set: held-out examples to classify (their labels are hidden from the model and used only to score it)
$K=1$: one-shot learning; $K=5$: 5-shot learning
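
One way to classify the query set is the nearest-prototype rule used by Prototypical Networks: average the support embeddings of each class into a prototype, then assign each query to the closest one. The sketch below assumes the embeddings have already been computed. In a real Prototypical Network they would come from a learned embedding network, which is omitted here. The function names are illustrative.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def build_prototypes(support):
    """support: list of (embedding, label) pairs. Returns {label: prototype}."""
    by_label = {}
    for embedding, label in support:
        by_label.setdefault(label, []).append(embedding)
    return {label: mean_vector(embs) for label, embs in by_label.items()}

def classify(query_embedding, prototypes):
    """Return the label whose prototype is nearest to the query embedding."""
    return min(prototypes,
               key=lambda label: squared_distance(query_embedding, prototypes[label]))

# 2-way 2-shot toy episode with hand-written 2-D embeddings.
support = [([0.0, 0.0], "cat"), ([0.2, 0.0], "cat"),
           ([1.0, 1.0], "dog"), ([0.8, 1.0], "dog")]
prototypes = build_prototypes(support)
prediction = classify([0.1, 0.2], prototypes)   # nearest prototype is "cat"
```

During training, the distances are usually turned into class probabilities with a softmax over negative distances, so that the embedding network can be learned by gradient descent.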

Zero-Shot Learning:

Classes never seen during training
Requires semantic descriptions or attributes
Example: "zebra" described as "horse with stripes"
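
The zebra example can be sketched as attribute matching. The code below assumes a hypothetical attribute predictor, trained only on seen classes, that outputs scores for three made-up attributes. A class with no training images can then still be recognized by comparing those scores with its attribute description. The attribute names, values, and function name are all illustrative.

```python
# Hypothetical binary attribute descriptions: (horse_like, striped, has_trunk).
CLASS_ATTRIBUTES = {
    "horse": (1, 0, 0),
    "zebra": (1, 1, 0),      # "horse with stripes"; no zebra images seen in training
    "elephant": (0, 0, 1),
}

def zero_shot_classify(predicted_attributes, class_attributes):
    """Return the class whose attribute description is closest to the prediction."""
    def distance(cls):
        return sum((p - a) ** 2
                   for p, a in zip(predicted_attributes, class_attributes[cls]))
    return min(class_attributes, key=distance)

# Scores an attribute predictor might output for a zebra photo (made-up values).
label = zero_shot_classify((0.9, 0.8, 0.1), CLASS_ATTRIBUTES)   # "zebra"
```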

Few-Shot with Side Information:

Use text descriptions, attributes, or images of similar classes
Multimodal meta-learning

Key Takeaways

Summary: Meta-Learning

Meta-learning enables few-shot learning: adapting to a new task from as few as one to five labeled examples per class
MAML finds initialization $\theta$ for fast gradient-based adaptation
Prototypical Networks learn metric spaces — classify by nearest prototype
Episodic training simulates few-shot scenarios: support + query sets
Bi-level optimization: Outer loop optimizes initialization, inner loop adapts to task
Applications: robotics, personalization, drug discovery, NLP
Transfer learning (pre-train, then fine-tune) is simpler but is not explicitly trained for rapid adaptation from a few examples
Neural architecture search can be viewed as meta-learning over architectures
FOMAML, Reptile, ANIL are practical alternatives to full MAML
