Why It Matters
Taylor series are one of the most powerful tools in mathematics, providing a bridge between complicated functions and simple polynomials. They allow us to:
- Linearize nonlinear functions locally (the foundation of Newton's method and gradient descent analysis)
- Approximate transcendental functions like $e^x$, $\sin x$, and $\ln x$ using only arithmetic operations
- Analyze function behavior near a point using derivative information alone
- Derive fundamental results in physics, engineering, and machine learning
Every time you use `np.exp()`, `np.sin()`, or `np.log()` in code, the underlying computation relies on polynomial approximations rooted in Taylor series. Understanding them gives you insight into numerical computing, optimization, and the behavior of neural networks near initialization.
What is a Taylor Series
Definition: Taylor Series
A Taylor series represents a function as an infinite sum of terms calculated from the values of its derivatives at a single point $a$. If $f$ is infinitely differentiable at $a$, the Taylor series of $f$ about $a$ is:
Taylor Series (General Form)
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \cdots$$
Here,
- $f^{(n)}(a)$ = The nth derivative of f evaluated at the center point a
- $a$ = The center (expansion point) of the series
- $n!$ = Factorial of n (1·2·3·...·n)
- $(x-a)^n$ = The nth power of the distance from a
Intuition
Think of a Taylor series as building a polynomial approximation piece by piece:
- The 0th term matches the function value at $x = a$
- The 1st term matches the slope at $x = a$
- The 2nd term matches the curvature at $x = a$
- Each higher-order term captures finer details of the function's shape
The more terms you include, the better the approximation becomes, and for many functions the infinite sum converges to the exact function value.
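The term-by-term picture can be checked numerically. The sketch below is only an illustration: the function $\cos x$, the center $a = 1$, the evaluation point $x = 1.3$, and the helper names `cos_derivative` and `taylor_partial_sums` are assumptions chosen for the example.

```python
import math


def cos_derivative(n, a):
    """n-th derivative of cos evaluated at a (the derivatives repeat with period 4)."""
    cycle = [math.cos(a), -math.sin(a), -math.cos(a), math.sin(a)]
    return cycle[n % 4]


def taylor_partial_sums(x, a, max_order):
    """Return [P_0(x), P_1(x), ..., P_max_order(x)] for cos expanded about a."""
    sums, total = [], 0.0
    for n in range(max_order + 1):
        total += cos_derivative(n, a) / math.factorial(n) * (x - a) ** n
        sums.append(total)
    return sums


a, x = 1.0, 1.3  # illustrative center and evaluation point (assumptions)
for n, p in enumerate(taylor_partial_sums(x, a, 4)):
    print(f"order {n}: P_n(x) = {p:.6f}, error = {abs(math.cos(x) - p):.2e}")
```

For this choice of function and points, each added term (value, slope, curvature, and so on) reduces the error.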
Maclaurin Series
A Maclaurin series is simply a Taylor series centered at $a = 0$. It is the most commonly used form because derivatives at zero are often easier to compute.
Maclaurin Series
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}x^n = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \cdots$$
Here,
- $f^{(n)}(0)$ = The nth derivative of f evaluated at 0
- $n!$ = Factorial of n
- $x^n$ = The nth power of x (no shift needed since a=0)
Deriving the Maclaurin Series for $e^x$
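Problem: Find the Maclaurin series for $f(x) = e^x$.
Solution:
- In general, $f^{(n)}(x) = e^x$, so $f^{(n)}(0) = 1$ for all $n$
Substituting into the Maclaurin formula:
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
This series converges for all $x \in \mathbb{R}$.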
Common Taylor Expansions
Memorizing these standard expansions is essential: they appear constantly in calculus, differential equations, and numerical methods.
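| Function | Maclaurin Series | Convergence Domain |
|---|---|---|
| $e^x$ | $\sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \cdots$ | All $x$ |
| $\sin x$ | $\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots$ | All $x$ |
| $\cos x$ | $\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots$ | All $x$ |
| $\sinh x$ | $\sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!} = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots$ | All $x$ |
| $\cosh x$ | $\sum_{n=0}^{\infty} \frac{x^{2n}}{(2n)!} = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots$ | All $x$ |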
Pattern Recognition
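- Odd functions ($\sin x$, $\sinh x$, $\arctan x$) have only odd powers
- Even functions ($\cos x$, $\cosh x$) have only even powers
- The series for $\ln(1+x)$ and $\arctan x$ have no factorial in the denominator
- The geometric series $\frac{1}{1-x} = \sum_{n=0}^{\infty} x^n$ is the foundation for many other expansions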
Convergence
Not every Taylor series converges to its original function. Understanding convergence is critical.
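Theorem: Radius of Convergence
For a power series $\sum_{n=0}^{\infty} c_n (x-a)^n$, there exists a radius $R$ (with $0 \le R \le \infty$) such that:
- The series converges absolutely for $|x-a| < R$
- The series diverges for $|x-a| > R$
- At the boundary $|x-a| = R$, convergence must be checked separately
The radius can be computed using the Ratio Test (when the limit exists):
$$R = \lim_{n \to \infty} \left| \frac{c_n}{c_{n+1}} \right|$$
or equivalently:
$$\frac{1}{R} = \lim_{n \to \infty} \left| \frac{c_{n+1}}{c_n} \right|$$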
Finding the Radius of Convergence
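Problem: Find the radius of convergence for $\sum_{n=0}^{\infty} \frac{x^n}{n!}$.
Solution: Let $c_n = \frac{1}{n!}$. Then:
$$R = \lim_{n \to \infty} \left| \frac{c_n}{c_{n+1}} \right| = \lim_{n \to \infty} \frac{(n+1)!}{n!} = \lim_{n \to \infty} (n+1) = \infty$$
Therefore $R = \infty$, meaning the series converges for all $x$. This is why $e^x$ can be evaluated for any real number.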
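The ratio $|c_n / c_{n+1}|$ can also be inspected at finite $n$ to get a feel for the radius. The sketch below is an illustration under assumptions: the helper names `exp_coeff`, `log_coeff`, and `ratio_estimate` are hypothetical, and the second series, $\ln(1+x)$ with $c_n = (-1)^{n+1}/n$, is included only for contrast.

```python
import math
from fractions import Fraction


def exp_coeff(n):
    """Coefficient of x^n in the Maclaurin series of e^x: 1/n!."""
    return Fraction(1, math.factorial(n))


def log_coeff(n):
    """Coefficient of x^n (n >= 1) in the Maclaurin series of ln(1+x): (-1)^(n+1)/n."""
    return Fraction((-1) ** (n + 1), n)


def ratio_estimate(coeff, n):
    """Finite-n estimate |c_n / c_{n+1}| of the radius of convergence."""
    return abs(coeff(n) / coeff(n + 1))


for n in [10, 100, 1000]:
    # For e^x the ratio is n + 1 and grows without bound (R is infinite);
    # for ln(1+x) the ratio is (n + 1)/n and tends to 1 (R = 1).
    print(n, float(ratio_estimate(exp_coeff, n)), float(ratio_estimate(log_coeff, n)))
```

Exact rational arithmetic (`Fraction`) is used here so that the large factorials do not overflow floating point.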
Convergence Caveats
- A Taylor series may converge but not to the original function (e.g., $f(x) = e^{-1/x^2}$ for $x \neq 0$, $f(0) = 0$)
- This happens when the function is not analytic: the remainder $R_n(x)$ does not tend to zero
- For most elementary functions encountered in practice, the Taylor series does converge to the function within its radius of convergence
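The standard counterexample $f(x) = e^{-1/x^2}$ (with $f(0) = 0$) can be explored numerically. This is a sketch rather than a proof: the helper name `flat` and the sample points are assumptions made for the illustration.

```python
import math


def flat(x):
    """f(x) = exp(-1/x^2) for x != 0, and f(0) = 0."""
    return 0.0 if x == 0 else math.exp(-1.0 / x**2)


# Every Maclaurin coefficient of this function is 0, so every partial sum is 0,
# yet the function itself is positive away from the origin.
for x in [0.5, 1.0, 2.0]:
    print(f"x = {x}: f(x) = {flat(x):.6f}, Maclaurin series value = 0")

# Near 0 the function vanishes faster than any power of x,
# which is why all of its derivatives at 0 are zero.
x = 0.05
for n in [1, 5, 10]:
    print(f"f(x) / x^{n} at x = {x}: {flat(x) / x**n:.3e}")
```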
Taylor's Theorem with Remainder
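Theorem: Taylor's Theorem (Lagrange Remainder)
If $f$ is $(n+1)$-times differentiable on an interval containing $a$ and $x$, then:
$$f(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k + R_n(x)$$
where the Lagrange form of the remainder is:
$$R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}$$
for some $c$ between $a$ and $x$.
Lagrange Remainder (Error Bound)
$$|R_n(x)| \le \frac{M}{(n+1)!}|x-a|^{n+1}$$
Here,
- $R_n(x)$ = The remainder (error) after the terms up to degree n
- $c$ = Some unknown point between a and x
- $M$ = An upper bound: $M \geq \max_{t \in [a,x]} |f^{(n+1)}(t)|$
- $a$ = The center of the expansion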
Bounding the Error
Problem: Approximate using the first 4 terms of the Maclaurin series for . Bound the error.
Solution: The first 4 terms give:
For the error bound, note that , so on :
Therefore:
The true value is , so the actual error is about , well within our bound.
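A minimal sketch of this kind of error bound, under stated assumptions: the target $e^{0.5}$ and the choice of 4 terms are illustrative values picked for the example, and the helper names `exp_taylor` and `lagrange_bound_exp` are hypothetical. With 4 terms (degree 3), the remainder involves the 4th derivative, which for $e^x$ is $e^t \le e^x$ on $[0, x]$.

```python
import math


def exp_taylor(x, n_terms):
    """Sum of the first n_terms Maclaurin terms of e^x."""
    return sum(x**k / math.factorial(k) for k in range(n_terms))


def lagrange_bound_exp(x, n_terms):
    """Lagrange bound on |e^x - exp_taylor(x, n_terms)| for x >= 0.

    The relevant derivative is e^t <= e^x on [0, x]. We use M = 3**x >= e**x
    so the bound does not require already knowing e^x.
    """
    M = 3.0**x
    return M * x**n_terms / math.factorial(n_terms)


x, n_terms = 0.5, 4  # illustrative values (assumptions)
approx = exp_taylor(x, n_terms)
print(f"approximation: {approx:.7f}")
print(f"actual error:  {abs(math.exp(x) - approx):.7f}")
print(f"error bound:   {lagrange_bound_exp(x, n_terms):.7f}")
```

The bound is deliberately pessimistic: it only guarantees that the actual error is no larger.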
Approximation Applications
Taylor series are used to compute function values efficiently in practice.
Computing $\sin(0.5)$ by Hand
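Problem: Use the Maclaurin series for $\sin x$ to compute $\sin(0.5)$ to 4 decimal places.
Solution: The series is
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
With $x = 0.5$:
- Term 1: $0.5$
- Term 2: $-\frac{0.5^3}{6} \approx -0.0208333$
- Term 3: $\frac{0.5^5}{120} \approx 0.0002604$
- Term 4: $-\frac{0.5^7}{5040} \approx -0.0000016$
Sum: $0.5 - 0.0208333 + 0.0002604 - 0.0000016 = 0.4794255$
Exact value: $\sin(0.5) = 0.4794255\ldots$, so 4 terms give accuracy to 6 decimal places.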
Computing $\pi$ Using $\arctan$
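Problem: How many terms of the series are needed to compute $\pi$ to 10 decimal places using $\arctan(1) = \frac{\pi}{4}$?
Solution: Using $\pi = 4\arctan(1) = 4\sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}$, the error after $N$ terms is bounded by $\frac{4}{2N+1}$.
For 10 decimal places: $\frac{4}{2N+1} < 10^{-10}$, so roughly $N > 2 \times 10^{10}$.
This is impractical! Better to use the identity:
$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}$$
which converges much faster.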
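The difference in convergence speed is easy to see numerically. The sketch below compares the $\arctan(1)$ series with Machin's formula $\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239}$, taken here as one well-known example of a faster identity (an assumption about which identity to illustrate). The function names are hypothetical.

```python
import math


def arctan_series(x, n_terms):
    """Partial sum of arctan(x) = x - x^3/3 + x^5/5 - ... (valid for |x| <= 1)."""
    return sum((-1) ** n * x ** (2 * n + 1) / (2 * n + 1) for n in range(n_terms))


def pi_leibniz(n_terms):
    """pi from 4 * arctan(1): converges very slowly."""
    return 4 * arctan_series(1.0, n_terms)


def pi_machin(n_terms):
    """pi from Machin's formula: small arguments make the terms shrink quickly."""
    return 4 * (4 * arctan_series(1 / 5, n_terms) - arctan_series(1 / 239, n_terms))


for n_terms in [5, 10]:
    print(f"{n_terms} terms: Leibniz error = {abs(pi_leibniz(n_terms) - math.pi):.2e}, "
          f"Machin error = {abs(pi_machin(n_terms) - math.pi):.2e}")
```

The speed-up comes from evaluating the same series at arguments well inside the radius of convergence instead of on its boundary.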
Polynomial Approximation
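Definition: Taylor Polynomial
The $n$th-degree Taylor polynomial of $f$ about $a$ is the partial sum:
$$P_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x-a)^k$$
This is the best polynomial approximation of degree $n$ near $a$ in the sense that it matches $f$ and its first $n$ derivatives at $a$.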
Choosing the Right Number of Terms
The number of terms needed depends on:
- How far $x$ is from the center $a$ (larger $|x - a|$ needs more terms)
- How accurate you need the result (more terms = smaller error)
- The behavior of higher derivatives (if they grow slowly, fewer terms suffice)
Rule of thumb: For $x$ near $0$, 10 terms give about 7 digits of accuracy for $e^x$ (at $x = 1$ the error is about $3 \times 10^{-7}$). For $\sin x$ and $\cos x$, 5 terms often suffice for $|x| \le 1$.
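How the required number of terms grows with distance from the center can be checked directly. This is a rough sketch under assumptions: the helper name `terms_needed_exp` is hypothetical, the tolerance $10^{-6}$ is an arbitrary illustrative choice, and `math.exp` is used as the reference value.

```python
import math


def terms_needed_exp(x, tol, max_terms=200):
    """Smallest number of Maclaurin terms whose partial sum is within tol of math.exp(x)."""
    target = math.exp(x)
    total, term = 0.0, 1.0  # term holds x^k / k!
    for k in range(max_terms):
        total += term
        if abs(target - total) <= tol:
            return k + 1  # k + 1 terms have been summed so far
        term *= x / (k + 1)
    raise ValueError("tolerance not reached within max_terms")


for x in [0.1, 1.0, 3.0]:
    print(f"x = {x}: {terms_needed_exp(x, 1e-6)} terms for an error below 1e-6")
```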
Error in Polynomial Approximation
$$|f(x) - P_n(x)| \le \frac{M}{(n+1)!}|x-a|^{n+1}$$
Here,
- $P_n(x)$ = The nth-degree Taylor polynomial
- $M$ = Maximum of $|f^{(n+1)}|$ on the interval between a and x
- $n$ = Degree of the polynomial
Python Implementation
Basic Taylor Approximation
import math

import numpy as np


def taylor_exp(x, n_terms=10):
    """Approximate e^x using the first n_terms terms of the Maclaurin series."""
    result = 0.0
    for k in range(n_terms):
        # k-th term of the series: x^k / k!
        result += x**k / math.factorial(k)
    return result


# Compare with numpy
x = 1.0
print(f"Exact: {np.exp(x):.12f}")
print(f"Taylor: {taylor_exp(x, n_terms=10):.12f}")
# Output:
# Exact: 2.718281828459
# Taylor: 2.718281525573
Vectorized Taylor Approximation
import math

import numpy as np


def taylor_sin(x, n_terms=8):
    """Vectorized Maclaurin series for sin(x)."""
    result = np.zeros_like(x, dtype=float)
    for k in range(n_terms):
        sign = (-1)**k
        power = 2*k + 1  # sin(x) only has odd powers
        result += sign * x**power / math.factorial(power)
    return result


x = np.array([0.1, 0.5, 1.0, 2.0])
print(f"Exact: {np.sin(x)}")
print(f"Taylor: {taylor_sin(x)}")
print(f"Error: {np.abs(np.sin(x) - taylor_sin(x))}")
SymPy Symbolic Taylor Series
from sympy import symbols, series, sin, cos, exp, log, factorial

x = symbols('x')

# Taylor series expansions (n is the order of the O(x**n) truncation term)
print("e^x:", series(exp(x), x, 0, n=6))
print("sin(x):", series(sin(x), x, 0, n=8))
print("cos(x):", series(cos(x), x, 0, n=8))
print("ln(1+x):", series(log(1+x), x, 0, n=6))


# Compute specific Taylor coefficients
def taylor_coeff(f, x, a, n):
    """Get the nth Taylor coefficient at point a."""
    return f.diff(x, n).subs(x, a) / factorial(n)


print(f"coefficient of x^3 in e^x: {taylor_coeff(exp(x), x, 0, 3)}")
Error Analysis
import math

import matplotlib.pyplot as plt
import numpy as np


def taylor_error_analysis():
    """Compare Taylor approximations of different orders."""
    x = np.linspace(-3, 3, 200)
    exact = np.exp(x)
    errors = {}
    for n in [3, 5, 7, 10]:
        # Degree-n Maclaurin polynomial of e^x
        approx = np.zeros_like(x)
        for k in range(n + 1):
            approx += x**k / math.factorial(k)
        errors[n] = np.abs(exact - approx)

    # Plot the absolute error of each polynomial on a log scale
    plt.figure(figsize=(10, 6))
    for n, err in errors.items():
        plt.semilogy(x, err, label=f'n={n}')
    plt.xlabel('x')
    plt.ylabel('Absolute Error')
    plt.title('Taylor Series Error for e^x')
    plt.legend()
    plt.grid(True)
    plt.show()


taylor_error_analysis()
Applications in AI/ML
Linearization of Neural Networks
Neural Tangent Kernel (NTK)
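The Neural Tangent Kernel describes how neural networks behave near initialization. For a network $f(x; \theta)$ with parameters $\theta$:
$$f(x; \theta) \approx f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta - \theta_0)$$
This is a first-order Taylor expansion around the initial parameters $\theta_0$. The NTK is the kernel:
$$\Theta(x, x') = \nabla_\theta f(x; \theta_0)^\top \nabla_\theta f(x'; \theta_0)$$
In the infinite-width limit, this linearization becomes exact during training, and the network behaves like a linear model in function space.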
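The linearization can be illustrated on a toy model. The sketch below is not a real neural network library: the two-parameter model $f(x; \theta) = w_2 \tanh(w_1 x)$, its hand-written gradient, the parameter values, and all function names are assumptions chosen so the example stays self-contained.

```python
import numpy as np


def f(x, theta):
    """Toy 'network' with parameters theta = (w1, w2)."""
    w1, w2 = theta
    return w2 * np.tanh(w1 * x)


def grad_f(x, theta):
    """Gradient of f with respect to (w1, w2), derived by hand."""
    w1, w2 = theta
    t = np.tanh(w1 * x)
    return np.array([w2 * (1 - t**2) * x, t])


def f_linearized(x, theta, theta0):
    """First-order Taylor expansion of f in the parameters around theta0."""
    return f(x, theta0) + grad_f(x, theta0) @ (theta - theta0)


def empirical_ntk(x, x_prime, theta0):
    """Inner product of parameter gradients at two inputs."""
    return grad_f(x, theta0) @ grad_f(x_prime, theta0)


theta0 = np.array([0.5, -1.2])                 # illustrative "initial" parameters
theta = theta0 + np.array([0.01, -0.02])       # small parameter update
x = 0.7
print("exact:     ", f(x, theta))
print("linearized:", f_linearized(x, theta, theta0))
print("kernel:    ", empirical_ntk(x, 0.3, theta0))
```

For a small parameter change the two outputs agree closely; the gap is second order in $\theta - \theta_0$.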
Gradient Descent Analysis
Taylor series explain why gradient descent works:
- The gradient gives the linear term of the first-order Taylor approximation of the loss, so a small step against it decreases the loss
- Newton's method uses the second-order Taylor expansion for faster convergence
- Momentum methods can be viewed as modifying the Taylor approximation landscape
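The difference between first-order and second-order steps can be seen on a one-dimensional example. This is a sketch under assumptions: the loss $f(w) = e^w - 2w$ (minimized at $w = \ln 2$), the learning rate, the step count, and the function names are all illustrative choices.

```python
import math


def grad(w):
    """First derivative of the example loss f(w) = e^w - 2w."""
    return math.exp(w) - 2.0


def hess(w):
    """Second derivative of the same loss."""
    return math.exp(w)


def gradient_descent(w, lr=0.1, steps=5):
    """Repeatedly step along the negative gradient (first-order information only)."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w


def newton(w, steps=5):
    """Repeatedly jump to the minimizer of the local second-order Taylor model."""
    for _ in range(steps):
        w -= grad(w) / hess(w)
    return w


w_star = math.log(2.0)  # exact minimizer, where e^w = 2
print(f"gradient descent error after 5 steps: {abs(gradient_descent(0.0) - w_star):.2e}")
print(f"Newton error after 5 steps:           {abs(newton(0.0) - w_star):.2e}")
```

On this convex example Newton's method converges much faster, at the cost of needing the second derivative.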
Taylor Series in Loss Functions
Many loss functions are analyzed using Taylor expansions:
- Cross-entropy loss when the predicted probability of the true class is near zero (gradient explosion)
- Huber loss as a smooth approximation of MAE
- Softmax stability analysis (subtracting the max multiplies the numerator and denominator by the same constant factor, so the output is unchanged)
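The softmax point can be made concrete. The sketch below shows the usual max-subtraction trick; the function name `softmax_stable` and the sample logits are assumptions for illustration. Shifting every logit by the same constant $m$ multiplies the numerator and the denominator by $e^{-m}$, so the result is unchanged while the exponentials stay bounded by 1.

```python
import numpy as np


def softmax_stable(z):
    """Softmax computed after subtracting the maximum logit."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z)          # largest entry becomes 0, so exp() cannot overflow
    exp_shifted = np.exp(shifted)
    return exp_shifted / exp_shifted.sum()


large_logits = [1000.0, 1001.0, 1002.0]  # np.exp(1000.0) alone would overflow to inf
print(softmax_stable(large_logits))
print(softmax_stable([0.0, 1.0, 2.0]))   # same probabilities: only differences matter
```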
Automatic Differentiation Connection
Automatic differentiation (AD) computes derivatives that are exact up to floating-point rounding, the same derivatives needed for Taylor series. Modern ML frameworks like PyTorch and JAX use AD to:
- Compute gradients for backpropagation
- Enable higher-order derivatives for Taylor-based optimization
- Implement Taylor-mode AD for efficient computation of Taylor coefficients
Common Mistakes
| Mistake | Why It's Wrong | Correct Approach |
|---|---|---|
| Using Taylor series outside the radius of convergence | The series diverges and gives meaningless results | Always check $\lvert x - a \rvert < R$ before using the series |
| Forgetting the factorial in the denominator | Coefficients are $\frac{f^{(n)}(a)}{n!}$, not just $f^{(n)}(a)$ | The $n!$ comes from repeated differentiation of $(x-a)^n$ |
| Assuming every Taylor series converges to its function | $e^{-1/x^2}$ (extended by 0 at origin) has all derivatives = 0 at 0 | Check if $R_n(x) \to 0$ as $n \to \infty$ |
| Using the wrong center point | A Taylor series about $a$ cannot approximate $f$ well near an $x$ far from $a$ | Choose $a$ close to the values of interest |
| Mixing up $\sin$ and $\cos$ series | $\sin x$ has odd powers starting with $x$; $\cos x$ has even powers starting with 1 | Remember: $\sin(0) = 0$ (odd function), $\cos(0) = 1$ (even function) |
| Ignoring alternating signs | Forgetting $(-1)^n$ in $\sin x$, $\cos x$, $\arctan x$ leads to completely wrong values | Alternating series have special convergence properties; pay attention to signs |
| Not checking error bounds | Using an approximation without knowing its accuracy | Always bound $\lvert R_n(x) \rvert$ using the Lagrange remainder |
Interview Questions
Question 1: Why does $e^x$ have the simplest Maclaurin series?
Answer: Because all derivatives of $e^x$ are $e^x$ itself, and $e^0 = 1$. So $f^{(n)}(0) = 1$ for all $n$, giving the clean series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$. No alternating signs, no zeros: every coefficient is simply $\frac{1}{n!}$.
Question 2: How would you approximate without a calculator?
Answer: Use the binomial series with and :
The exact value is , so 3 terms give 5 digits of accuracy.
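A binomial-series approximation of this kind can be sketched in a few lines. Everything specific here is an assumption for illustration: the target $\sqrt{1.1} = (1 + 0.1)^{1/2}$, the use of 3 terms, and the helper name `binomial_series`. The series used is $(1+x)^\alpha = \sum_{k \ge 0} \binom{\alpha}{k} x^k$ for $|x| < 1$.

```python
def binomial_series(alpha, x, n_terms):
    """Partial sum of (1 + x)^alpha = sum_k C(alpha, k) x^k, valid for |x| < 1."""
    total, coeff = 0.0, 1.0  # coeff holds the generalized binomial coefficient C(alpha, k)
    for k in range(n_terms):
        total += coeff * x**k
        coeff *= (alpha - k) / (k + 1)  # C(alpha, k+1) = C(alpha, k) * (alpha - k) / (k + 1)
    return total


# Illustrative target (an assumption): sqrt(1.1) with alpha = 1/2 and x = 0.1
approx = binomial_series(0.5, 0.1, 3)   # 1 + 0.05 - 0.00125
print(f"3-term approximation: {approx:.6f}")
print(f"reference value:      {1.1 ** 0.5:.6f}")
```

The smaller $|x|$ is, the fewer terms are needed, which is why rewriting the target as $(1 + x)^\alpha$ with a small $x$ is the key step when working by hand.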
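Question 3: Why is the Taylor series for $\ln(1+x)$ restricted to $-1 < x \le 1$?
Answer: The coefficients satisfy $|c_n| = \frac{1}{n}$, giving $R = \lim_{n \to \infty} \left| \frac{c_n}{c_{n+1}} \right| = 1$. At $x = 1$, the series converges (alternating harmonic series), but at $x = -1$ it becomes $-\sum_{n=1}^{\infty} \frac{1}{n}$, which diverges (harmonic series).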
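The behavior at the two endpoints can be observed numerically. This is a sketch with a hypothetical helper name `log1p_partial`; partial sums cannot prove convergence or divergence, they only illustrate it.

```python
import math


def log1p_partial(x, n_terms):
    """Partial sum of ln(1+x) = x - x^2/2 + x^3/3 - ... with n_terms terms."""
    return sum((-1) ** (n + 1) * x**n / n for n in range(1, n_terms + 1))


for n_terms in [10, 100, 1000]:
    at_plus_one = log1p_partial(1.0, n_terms)     # slowly approaches ln 2
    at_minus_one = log1p_partial(-1.0, n_terms)   # minus the harmonic partial sum
    print(f"{n_terms:5d} terms: x=1 -> {at_plus_one:.6f} (ln 2 = {math.log(2):.6f}), "
          f"x=-1 -> {at_minus_one:.3f}")
```

At $x = 1$ the partial sums settle down, although slowly; at $x = -1$ they keep decreasing without bound.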
Question 4: What's the relationship between Taylor series and Fourier series?
Answer: Both decompose functions into simpler basis functions:
- Taylor series use a polynomial basis: best for local approximation
- Fourier series use a trigonometric basis: best for periodic functions
Taylor series require differentiability; Fourier series work for discontinuous functions. In higher dimensions, Taylor series extend to multivariate polynomials, while for non-periodic functions Fourier series generalize to the Fourier transform.
Question 5: How do Taylor series relate to the concept of analyticity?
Answer: A function is analytic at a point if its Taylor series converges to the function in some neighborhood. All polynomial, exponential, trigonometric, and rational functions (away from poles) are analytic. However, $f(x) = e^{-1/x^2}$ (with $f(0) = 0$) is smooth but not analytic: its Taylor series at 0 is identically zero, yet the function is nonzero for $x \neq 0$.
Question 6: Can you use Taylor series to solve differential equations?
Answer: Yes. The power series method assumes $y = \sum_{n=0}^{\infty} a_n x^n$, substitutes into the ODE, and solves for the coefficients recursively. For example, solving $y' = y$ with $y(0) = 1$ gives $a_n = \frac{1}{n!}$, recovering $y = e^x$. This method works for linear ODEs with polynomial coefficients and is the basis for special functions like Bessel functions and Legendre polynomials.
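For the example $y' = y$, $y(0) = 1$, matching powers of $x$ on both sides gives the recurrence $(n+1)\,a_{n+1} = a_n$ with $a_0 = 1$. The sketch below simply iterates that recurrence; the function name `power_series_coeffs` is hypothetical.

```python
import math
from fractions import Fraction


def power_series_coeffs(n_coeffs, y0=1):
    """Coefficients a_0..a_{n_coeffs-1} of the power series solution of y' = y, y(0) = y0."""
    coeffs = [Fraction(y0)]
    for n in range(n_coeffs - 1):
        # From y' = y: (n + 1) * a_{n+1} = a_n
        coeffs.append(coeffs[n] / (n + 1))
    return coeffs


coeffs = power_series_coeffs(8)
print([str(c) for c in coeffs])           # 1, 1, 1/2, 1/6, ... (that is, 1/n!)
print(float(sum(coeffs)), math.e)         # the series evaluated at x = 1 approaches e
```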
Practice Problems
Problem 1: Derive the Series
Problem: Derive the Maclaurin Series
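Find the Maclaurin series for $f(x) = \frac{1}{1+x^2}$ and determine its radius of convergence.
Solution
Method 1 (Substitution): Since $\frac{1}{1-u} = \sum_{n=0}^{\infty} u^n$ for $|u| < 1$, substitute $u = -x^2$:
$$\frac{1}{1+x^2} = \sum_{n=0}^{\infty} (-1)^n x^{2n} = 1 - x^2 + x^4 - x^6 + \cdots$$
Method 2 (Derivatives):
$$f(0) = 1, \quad f'(0) = 0, \quad f''(0) = -2, \quad f'''(0) = 0, \quad f^{(4)}(0) = 24$$
The pattern: even derivatives alternate in sign, $f^{(2n)}(0) = (-1)^n (2n)!$; odd derivatives are 0 at $x = 0$.
Radius of convergence: $R = 1$ (from $|u| < 1$, we need $|-x^2| < 1$, so $|x| < 1$).
Note: This series is the basis for the Leibniz formula for $\pi$:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$$
obtained by integrating from 0 to 1.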
Problem 2: Error Estimation
Problem: How Many Terms for 6-Digit Accuracy?
How many terms of the Maclaurin series for are needed to ensure the result is accurate to 6 decimal places?
Solution
The Maclaurin series for is an alternating series: .
For alternating series, the error is bounded by the first omitted term:
With , we need .
Checking:
- : (not enough)
- : β
Answer: 4 terms () are sufficient.
Verification:
Exact: β
Problem 3: Non-Standard Expansion
Problem: Expansion About a Non-Zero Center
Find the first four nonzero terms of the Taylor series for about .
Solution
Since for all , we have .
The Taylor series about is:
The first four nonzero terms:
At :
Exact: , so 4 terms give about 2% error. Adding the term () reduces this dramatically.
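An expansion about a nonzero center can be sketched as follows. The specific values are assumptions chosen for illustration: $f(x) = e^x$ expanded about $a = 1$ and evaluated at $x = 2$. Since every derivative of $e^x$ is $e^x$, the series about $a$ is $e^a \sum_k (x-a)^k / k!$. The helper name `exp_taylor_about` is hypothetical.

```python
import math


def exp_taylor_about(a, x, n_terms):
    """First n_terms terms of the Taylor series of e^x about a: e^a * sum (x-a)^k / k!."""
    return math.exp(a) * sum((x - a) ** k / math.factorial(k) for k in range(n_terms))


a, x = 1.0, 2.0  # illustrative center and evaluation point (assumptions)
exact = math.exp(x)
for n_terms in [4, 5]:
    approx = exp_taylor_about(a, x, n_terms)
    rel_err = abs(exact - approx) / exact
    print(f"{n_terms} terms: {approx:.5f} (relative error {rel_err:.2%})")
```

With these values, four terms leave an error of roughly 2%, and the next term cuts it to well under 1%.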
Problem 4: Real-World Application
Problem: Small Angle Approximation
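A pendulum has length $L$ (in m). For small angles, the period is $T_0 = 2\pi\sqrt{L/g}$. Use Taylor series to find a correction term for the period when the amplitude is $\theta_0 = 30^\circ$.
Solution
The exact period involves an elliptic integral, but we can use its Taylor expansion in the amplitude $\theta_0$.
The corrected period is approximately:
$$T \approx T_0\left(1 + \frac{\theta_0^2}{16}\right)$$
With $\theta_0 = \pi/6 \approx 0.5236$ radians:
$$T \approx T_0\left(1 + \frac{0.2742}{16}\right) \approx 1.017\,T_0$$
So the period is about 1.7% longer than the small-angle prediction. For $\theta_0 = 45^\circ$, the correction is about 3.9%, showing the approximation breaks down for large angles.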
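The quality of the correction $T \approx T_0\left(1 + \theta_0^2/16\right)$ can be checked against the exact period. The sketch below relies on an outside fact that is an assumption relative to this text: the elliptic integral can be evaluated with the arithmetic-geometric mean, giving $T = T_0 / \mathrm{agm}(1, \cos(\theta_0/2))$. The function names and the sample amplitudes are also illustrative.

```python
import math


def agm(a, b, iterations=20):
    """Arithmetic-geometric mean of a and b (converges quadratically)."""
    for _ in range(iterations):
        a, b = (a + b) / 2, math.sqrt(a * b)
    return a


def period_ratio_exact(theta0):
    """T / T0 for amplitude theta0 in radians, via T = T0 / agm(1, cos(theta0 / 2))."""
    return 1.0 / agm(1.0, math.cos(theta0 / 2))


def period_ratio_taylor(theta0):
    """Leading Taylor correction: T / T0 ~ 1 + theta0^2 / 16."""
    return 1.0 + theta0**2 / 16


for degrees in [10, 30, 45, 90]:
    theta0 = math.radians(degrees)
    print(f"{degrees:3d} deg: exact {period_ratio_exact(theta0):.5f}, "
          f"Taylor {period_ratio_taylor(theta0):.5f}")
```

The one-term correction tracks the exact ratio closely for moderate amplitudes and falls behind as the amplitude grows, where higher-order terms matter.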
Problem 5: Series Manipulation
Problem: Deriving New Series from Known Ones
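Use known Maclaurin series to find the series for $\sinh x$ (which is $\frac{e^x - e^{-x}}{2}$).
Solution
We know:
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots, \qquad e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots$$
Subtracting:
$$e^x - e^{-x} = 2x + \frac{2x^3}{3!} + \frac{2x^5}{5!} + \cdots$$
Therefore:
$$\sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots = \sum_{n=0}^{\infty} \frac{x^{2n+1}}{(2n+1)!}$$
This matches the known series for $\sinh x$; note the similarity to $\sin x$ but without alternating signs.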
Quick Reference
| Concept | Formula/Key Point |
|---|---|
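| Taylor Series | $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x-a)^n$ |
| Maclaurin Series | Same with $a = 0$: $f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}x^n$ |
| $e^x$ | $\sum_{n=0}^{\infty} \frac{x^n}{n!}$, all $x$ |
| $\sin x$ | $\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n+1)!}$, all $x$ |
| $\cos x$ | $\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!}$, all $x$ |
| $\ln(1+x)$ | $\sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^n}{n}$, $-1 < x \le 1$ |
| $\frac{1}{1-x}$ | $\sum_{n=0}^{\infty} x^n$, $\lvert x \rvert < 1$ |
| $\arctan x$ | $\sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{2n+1}$, $\lvert x \rvert \le 1$ |
| $(1+x)^\alpha$ | $\sum_{n=0}^{\infty} \binom{\alpha}{n} x^n$, $\lvert x \rvert < 1$ |
| Lagrange Remainder | $R_n(x) = \frac{f^{(n+1)}(c)}{(n+1)!}(x-a)^{n+1}$ for some $c$ between $a$ and $x$ |
| Radius of Convergence | $R = \lim_{n \to \infty} \left\lvert \frac{c_n}{c_{n+1}} \right\rvert$ |
| Alternating Series Error | $\lvert \text{error} \rvert \le \lvert \text{first omitted term} \rvert$ |
| Pattern: Odd Functions | $\sin x$, $\sinh x$, $\arctan x$ have only odd powers |
| Pattern: Even Functions | $\cos x$, $\cosh x$ have only even powers |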
Cross-References
- Sequences and Series: foundations for understanding convergence
- Power Series: the general framework for Taylor series
- Maclaurin Series: the special case at $a = 0$
- L'Hôpital's Rule: can be proved using Taylor expansions
- Newton's Method: uses first-order Taylor approximation for root finding
- Numerical Integration: Simpson's rule uses quadratic interpolation, and its error analysis relies on Taylor expansions
- Fourier Series: alternative decomposition using trigonometric basis
- Differential Equations: power series method for solving ODEs
- Optimization: second-order methods use the Hessian (second Taylor term)
- Automatic Differentiation: computes exact Taylor coefficients via the chain rule
- Neural Tangent Kernel: linearization of neural networks via first-order Taylor expansion