ML Foundations

The Science of Getting Computers to Learn from Data

Machine learning is now applied across many industries, from healthcare to finance to autonomous vehicles. Understanding the fundamentals is the first step toward building intelligent systems.

Supervised Learning — Learn from labeled data to make predictions
Unsupervised Learning — Discover hidden patterns in unlabeled data
The ML Workflow — A systematic approach from problem definition to deployment

"Machine learning is the last invention that humanity will ever need to make."

What is Machine Learning? — Complete Introduction

Machine Learning is the science of getting computers to learn from data without being explicitly programmed. This tutorial provides a comprehensive foundation for your entire ML journey.

What is Machine Learning?

Definition: Machine Learning

Machine Learning is a branch of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. Formally, a computer program is said to learn from experience $E$ with respect to some task $T$ and performance measure $P$ if its performance at task $T$, as measured by $P$, improves with experience $E$ (Mitchell, 1997).

Traditional Programming vs Machine Learning

Traditional: Input Data + Rules → Output

ML: Input Data + Output → Rules (Model)

Example: Email spam — instead of writing rules, we show examples and let the algorithm learn

How ML reverses traditional programming: In traditional programming, a human explicitly writes rules (if-else statements) that transform input data into outputs. For email spam filtering, you would write rules like "if email contains 'free money', mark as spam." In the ML approach, you instead provide examples (labeled emails) and the algorithm discovers the rules automatically. What the algorithm produces, the "Learned Rules (Model)", is a mathematical function that maps inputs to outputs. In short: traditional = Data + Rules → Output; ML = Data + Output → Rules. This is powerful because the learned rules can capture patterns too complex for humans to specify manually, such as recognizing spam from thousands of subtle features at once.
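
The contrast between the two paradigms can be sketched in a few lines of Python. This is a toy illustration only: the four example emails, the function names, and the word-counting "learner" are all assumptions made up for this sketch, not a realistic spam filter.

```python
from collections import Counter

# Hypothetical toy dataset: (email text, label) pairs, where label 1 = spam.
EMAILS = [
    ("free money now", 1),
    ("claim your free prize", 1),
    ("meeting at noon", 0),
    ("lunch money for the trip", 0),
]

def rule_based(text):
    """Traditional programming: a human writes the rule by hand."""
    return 1 if "free money" in text else 0

def learn_word_scores(examples):
    """ML: derive the rule from labeled examples.

    Each word gets +1 for every spam email and -1 for every
    non-spam email it appears in.
    """
    scores = Counter()
    for text, label in examples:
        for word in set(text.split()):
            scores[word] += 1 if label == 1 else -1
    return scores

def learned_predict(scores, text):
    """Apply the learned rule: spam if the summed word scores are positive."""
    total = sum(scores[word] for word in text.split())
    return 1 if total > 0 else 0

scores = learn_word_scores(EMAILS)
message = "claim your free prize"
print("hand-written rule:", rule_based(message))
print("learned rule:     ", learned_predict(scores, message))
```

On this toy data the hand-written rule misses the second spam message because it does not contain the exact phrase "free money", while the learned scores flag it. Real systems use far richer features and models; the point is only that the rule comes from the data rather than from the programmer.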

Types of Machine Learning

Supervised Learning

Definition: Supervised Learning

Given a training set $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^{N}$ where $x^{(i)} \in \mathbb{R}^d$ are input features and $y^{(i)} \in \mathcal{Y}$ are labels, supervised learning seeks a function $f: \mathbb{R}^d \to \mathcal{Y}$ that maps inputs to outputs while minimizing the expected loss $\mathbb{E}_{(x,y) \sim P_{data}}[\mathcal{L}(f(x), y)]$, where $\mathcal{L}$ is a loss function that measures how far a prediction is from the true label.
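
Because $P_{data}$ is unknown, the expected loss cannot be computed directly; a common substitute is the average loss over the training set (the empirical risk). The sketch below assumes a one-dimensional input, a linear model $f(x) = wx + b$, and squared loss, and fits it with the closed-form least-squares solution. The data and function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Closed-form least squares for f(x) = w*x + b under squared loss."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

def empirical_risk(w, b, xs, ys):
    """Average squared loss of f(x) = w*x + b over the training pairs."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical noiseless training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_line(xs, ys)
print("w =", w, "b =", b)
print("risk of fitted line:", empirical_risk(w, b, xs, ys))
print("risk of f(x) = 0:   ", empirical_risk(0.0, 0.0, xs, ys))
```

The fitted line has a lower empirical risk than the arbitrary baseline $f(x) = 0$, which is all "learning" means here. Whether low training risk translates into low expected loss is the generalization question addressed later in this tutorial.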

Unsupervised Learning

Definition: Unsupervised Learning

Given unlabeled data $\mathcal{D} = \{x^{(i)}\}_{i=1}^{N}$, unsupervised learning seeks to discover the underlying structure $P(x)$ or low-dimensional representations. This includes clustering, dimensionality reduction, density estimation, and generative modeling.
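
As one concrete instance of clustering, here is a minimal sketch of k-means on one-dimensional data. The points, the initial centers, and the fixed iteration count are assumptions chosen for illustration; practical implementations add smarter initialization and a convergence check.

```python
def kmeans_1d(points, initial_centers, iterations=10):
    """Alternate between assigning points to the nearest center
    and moving each center to the mean of its assigned points."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: abs(p - centers[j]))
            clusters[nearest].append(p)
        # Update step: an empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[j]
            for j, c in enumerate(clusters)
        ]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]
centers, clusters = kmeans_1d(points, initial_centers=[0.0, 5.0])
print("centers:", centers)
print("clusters:", clusters)
```

No labels are supplied at any point; the two groups emerge purely from the distances between the points.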

Reinforcement Learning

Definition: Reinforcement Learning

An agent interacts with an environment in discrete time steps, observing state $s_t$, taking action $a_t$, and receiving reward $r_t$. The goal is to learn a policy $\pi: S \to A$, mapping states $S$ to actions $A$, that maximizes the expected return $\mathbb{E}[G_t]$. The return $G_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k}$ is the cumulative discounted reward, and $\gamma \in [0,1)$ is the discount factor.
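
The discounted sum is easy to compute for a finite episode. The sketch below assumes the reward sequence ends after a few steps (so the infinite sum is truncated) and uses a made-up reward list; it evaluates $G_0$ by working backwards through the recursion $G_t = r_t + \gamma G_{t+1}$.

```python
def discounted_return(rewards, gamma):
    """Return G_0 = sum_k gamma^k * rewards[k] for a finite reward sequence."""
    g = 0.0
    # Work backwards: G_t = r_t + gamma * G_{t+1}, with G = 0 after the last step.
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Hypothetical episode: a reward of 1 at each of three steps.
rewards = [1.0, 1.0, 1.0]
print(discounted_return(rewards, gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
print(discounted_return(rewards, gamma=0.0))  # only the immediate reward counts: 1.0
```

A smaller $\gamma$ makes the agent short-sighted, since later rewards contribute less to the return.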

ML Algorithm Taxonomy

Key Applications

The ML Workflow

Why It Matters

Understanding the ML workflow is essential because it provides a systematic approach to solving problems with data. Each step builds on the previous one, and skipping steps often leads to poor model performance. The workflow is inherently iterative — expect to revisit earlier stages as you gain insights.

Key Concepts

Training, Validation, and Test Sets

Definition: Data Splitting

Given a dataset $\mathcal{D}$ of size $N$, we partition it into three disjoint subsets: training set $\mathcal{D}_{train}$ (typically 60-80%), validation set $\mathcal{D}_{val}$ (10-20%), and test set $\mathcal{D}_{test}$ (10-20%). The training set fits model parameters, the validation set tunes hyperparameters, and the test set provides an unbiased estimate of generalization performance, provided it is not used for any modeling decision. Formally, $\mathcal{D} = \mathcal{D}_{train} \cup \mathcal{D}_{val} \cup \mathcal{D}_{test}$, and the three subsets are pairwise disjoint.
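
A three-way split can be sketched with the standard library alone. The function name, the 70/15/15 default fractions, and the fixed seed are assumptions for this example; libraries such as scikit-learn provide `train_test_split` for the same purpose.

```python
import random

def train_val_test_split(data, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle the data once, then cut it into three disjoint parts."""
    items = list(data)
    random.Random(seed).shuffle(items)  # seeded so the split is reproducible
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

# Hypothetical dataset: 100 example indices.
train, val, test = train_val_test_split(range(100))
print(len(train), len(val), len(test))
```

Shuffling before cutting matters when the data arrives in a meaningful order (for example sorted by label). For time-ordered data a chronological split is usually more appropriate than a random one.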

Bias-Variance Decomposition

Theorem: Bias-Variance Decomposition

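For a model $\hat{f}$ trained on a randomly drawn dataset $\mathcal{D}$, with targets generated as $y = f(x) + \varepsilon$ where $\mathbb{E}[\varepsilon] = 0$ and $\text{Var}(\varepsilon) = \sigma^2$, the expected squared prediction error at a point $x$ can be decomposed as:

$$\mathbb{E}[(y - \hat{f}(x))^2] = \text{Bias}^2(\hat{f}(x)) + \text{Var}(\hat{f}(x)) + \sigma^2$$

where $\text{Bias}(\hat{f}(x)) = \mathbb{E}[\hat{f}(x)] - f(x)$, $\text{Var}(\hat{f}(x)) = \mathbb{E}[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2]$, and $\sigma^2$ is the irreducible noise. The expectations are taken over the random draw of $\mathcal{D}$ and the noise $\varepsilon$.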
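
The decomposition can be checked numerically with a small simulation. Everything in this sketch is an assumption chosen to keep it short: a single query point with true value $f(x) = 2$, Gaussian noise with $\sigma = 1$, training sets of five noisy observations, and a deliberately simple estimator that averages the training targets and multiplies by a `shrink` factor.

```python
import random

def simulate(shrink, n_train=5, f_x=2.0, sigma=1.0, trials=20000, seed=0):
    """Estimate bias^2, variance, and expected squared error at one point x."""
    rng = random.Random(seed)
    preds, sq_errors = [], []
    for _ in range(trials):
        # A fresh training set: n_train noisy observations of f(x).
        train_y = [f_x + rng.gauss(0.0, sigma) for _ in range(n_train)]
        pred = shrink * sum(train_y) / n_train
        # An independent test observation at the same point.
        y = f_x + rng.gauss(0.0, sigma)
        preds.append(pred)
        sq_errors.append((y - pred) ** 2)
    mean_pred = sum(preds) / trials
    bias_sq = (mean_pred - f_x) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    mse = sum(sq_errors) / trials
    return bias_sq, variance, mse

for shrink in (1.0, 0.5):
    bias_sq, variance, mse = simulate(shrink)
    total = bias_sq + variance + 1.0  # sigma^2 = 1.0 in this setup
    print(f"shrink={shrink}: bias^2={bias_sq:.3f} var={variance:.3f} "
          f"sum={total:.3f} measured error={mse:.3f}")
```

With `shrink=1.0` the estimator is unbiased and its error comes from variance plus noise. With `shrink=0.5` the variance drops but a large bias appears. In both cases the measured error should agree with the sum of the three terms up to sampling noise.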

Overfitting vs Underfitting

Definition: Overfitting

Overfitting occurs when a model fits the training data too closely, including its noise and random fluctuations, resulting in poor generalization. In practice it shows up as a large gap between training and held-out error: validation or test error rises while training error continues to decrease. This corresponds to a model with high variance and low bias.

Definition: Underfitting

Underfitting occurs when a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and test data. This corresponds to a model with high bias and low variance.
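
Both failure modes can be seen in a small experiment. The sketch below uses k-nearest-neighbor regression as the model, chosen here only because a single parameter $k$ controls its flexibility; the data points are made up (roughly $y = x$ with hand-picked noise in the training targets).

```python
def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train, data, k):
    """Mean squared error of the k-NN model (fit on train) over data."""
    return sum((knn_predict(train, x, k) - y) ** 2 for x, y in data) / len(data)

# Hypothetical data: noisy training targets, noiseless test targets on y = x.
TRAIN = [(0.0, 0.3), (1.0, 0.8), (2.0, 2.4), (3.0, 2.7), (4.0, 4.2), (5.0, 4.9)]
TEST = [(0.4, 0.4), (1.4, 1.4), (2.4, 2.4), (3.4, 3.4), (4.4, 4.4)]

for k in (1, 2, len(TRAIN)):
    train_err = mse(TRAIN, TRAIN, k)
    test_err = mse(TRAIN, TEST, k)
    print(f"k={k}: train error={train_err:.3f} test error={test_err:.3f}")
```

With $k = 1$ the model reproduces every training target exactly (zero training error) but does worse on the test points than $k = 2$, which is the overfitting pattern. With $k$ equal to the training set size the model predicts the same average everywhere and does poorly on both sets, which is the underfitting pattern.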

Common ML Algorithms

Key Takeaways

Summary: What is Machine Learning

ML learns patterns from data $\mathcal{D} = \{(x^{(i)}, y^{(i)})\}_{i=1}^N$ instead of explicit rules
Supervised learning: $f: \mathbb{R}^d \to \mathcal{Y}$ with labeled pairs (most common)
Unsupervised learning: discover $P(x)$ or latent structure without labels
Reinforcement learning: maximize $\mathbb{E}[\sum \gamma^k r_{t+k}]$ through trial and error
Always split data into train/validation/test sets to estimate generalization
Bias-variance tradeoff: $\text{Error} = \text{Bias}^2 + \text{Var} + \sigma^2$
Overfitting is one of the most common failure modes: the model memorizes the training data instead of learning patterns that generalize
Start simple, add complexity only when needed (Occam's Razor)
Data quality matters more than algorithm choice — garbage in, garbage out
The ML workflow is iterative — expect to repeat steps as you gain insights

What to Learn Next

-> Math Foundations: Master the essential math, including vectors, matrices, derivatives, and probability.

-> Linear Regression: One of the simplest and most fundamental ML algorithms for predicting continuous values.

-> Logistic Regression: Classification with probability, from linear to sigmoid.

-> KNN: Instance-based learning where your neighbors tell the story.

-> Decision Trees: If-then rules that learn; among the most interpretable algorithms.

-> Model Evaluation: How to know if your model actually works, beyond accuracy.
