Eigenvalues & Eigenvectors

The Ultimate Data Science Tutorial — Deep Theory, Real-World Applications & Step-by-Step Solutions

Table of Contents

Part I — Foundation & Intuition
  1. The Big Idea (Intuition)
  2. Formal Definition
  3. Geometric Interpretation
  4. Types of Eigenvalues & What They Mean
Part II — Why, When & What Problems
  5. WHY Use Eigenvalues & Eigenvectors?
  6. WHEN Do We Use Them?
  7. Real-World Problems (20+ Applications)
Part III — How to Compute
  8. How to Find Eigenvalues — Step by Step
  9. How to Find Eigenvectors — Step by Step
Part IV — Worked Examples & Solutions
  10. Worked Example 1 — 2×2 Matrix
  11. Worked Example 2 — 3×3 Matrix
  12. Worked Example 3 — PCA Application
  13. Worked Example 4 — Markov Chain Steady State
  14. Worked Example 5 — Differential Equations
Part V — Deep Theory & Related Concepts
  15. Eigendecomposition (Diagonalization)
  16. Singular Value Decomposition (SVD) vs Eigen
  17. The Spectral Theorem
  18. Related Concepts Map
  19. All Key Properties & Theorems
Part VI — Practice & Reference
  20. Practice Problems (with Solutions)
  21. Common Mistakes to Avoid
  22. Quick Reference Recipe
  23. Python / NumPy Cheat Sheet
Part I — Foundation & Intuition

1. The Big Idea (Intuition)

Imagine you have a rubber sheet with a grid drawn on it. You apply a transformation — stretch it, shear it, rotate it. Most arrows drawn on the sheet change direction. But some special arrows only get longer or shorter along their original line. They refuse to rotate.

Those stubborn arrows are eigenvectors. The factor by which they stretch (or shrink) is the eigenvalue.

Analogy for Real Life

Think of an earthquake shaking a building. The building vibrates in certain natural modes — some floors sway left-right, others twist. Each mode is an eigenvector (the shape of vibration) and the frequency of that mode corresponds to an eigenvalue. The building "wants" to vibrate in these special directions.

The word "eigen" comes from German meaning "own" or "characteristic." So eigenvectors are the characteristic directions of a transformation — the directions that belong to that matrix.

2. Formal Definition

Given a square matrix \(\mathbf{A}\) of size \(n \times n\), a scalar \(\lambda\) is an eigenvalue of \(\mathbf{A}\), with corresponding eigenvector \(\mathbf{v} \neq \mathbf{0}\), if they satisfy:

$$\boxed{\mathbf{A}\mathbf{v} = \lambda\,\mathbf{v}}$$

where \(\mathbf{A}\) is the \(n \times n\) transformation matrix, \(\mathbf{v} \neq \mathbf{0}\) is an eigenvector of \(\mathbf{A}\), and \(\lambda\) is its eigenvalue: the factor by which \(\mathbf{v}\) is scaled.

Plain English Translation
"Multiplying matrix \(\mathbf{A}\) by vector \(\mathbf{v}\) gives the same result as multiplying \(\mathbf{v}\) by the number \(\lambda\). The matrix doesn't change the direction of \(\mathbf{v}\) — it only scales it."
Important Note

If \(\mathbf{v}\) is an eigenvector, then any scalar multiple \(c\mathbf{v}\) (where \(c \neq 0\)) is also an eigenvector with the same eigenvalue. That's why we often normalize eigenvectors to have length 1, or just pick a convenient representative.
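The defining property is easy to check numerically. A minimal NumPy sketch, using a small matrix chosen purely for illustration:

```python
import numpy as np

# A small example matrix (chosen only for illustration).
B = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns eigenvalues and unit-length eigenvectors (as COLUMNS).
w, V = np.linalg.eig(B)

# The defining property A v = lambda v, for every eigenpair.
for lam, v in zip(w, V.T):
    assert np.allclose(B @ v, lam * v)

# Any nonzero scalar multiple of an eigenvector is still an eigenvector.
v = V[:, 0]
assert np.allclose(B @ (7 * v), w[0] * (7 * v))
```

Note that NumPy returns eigenvectors as the columns of `V`, normalized to length 1: one convenient representative from each eigenspace.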

3. Geometric Interpretation

Understanding what different eigenvalues look like geometrically is crucial for intuition:

| Eigenvalue \(\lambda\) | What Happens to \(\mathbf{v}\) |
|---|---|
| \(\lambda > 1\) | Stretched (gets longer) |
| \(\lambda = 1\) | Unchanged (fixed direction AND length) |
| \(0 < \lambda < 1\) | Shrunk (gets shorter) |
| \(\lambda = 0\) | Collapsed to zero (projected out) |
| \(\lambda < 0\) | Flipped (reversed direction) and scaled |
| \(\lambda = a + bi\) (complex) | Rotation + scaling (spiral) |
Key Geometric Insight

A 2×2 matrix has (at most) 2 independent eigenvector directions. Think of them as the two axes along which the transformation acts purely as stretching. Every other vector is a mix of these two directions and will appear to rotate, because its two components stretch by different amounts.

4. Types of Eigenvalues & What They Mean

4.1 Real vs Complex Eigenvalues

| Type | Meaning | Where You See It |
|---|---|---|
| Real, distinct | Clear, separate scaling directions | PCA, covariance matrices |
| Real, repeated | Uniform scaling in a subspace | Scalar multiples of identity |
| Complex conjugate pairs | Rotation + spiral behavior | Oscillating systems, control theory |
| Pure imaginary | Pure rotation, no growth/decay | Undamped oscillations |

4.2 Special Eigenvalue Cases

\(\lambda = 0\) — Singular Matrix

If any eigenvalue is zero, the matrix is singular (non-invertible). The eigenvector for \(\lambda=0\) lies in the null space of \(\mathbf{A}\). This means \(\mathbf{A}\) collapses some dimension to zero — information is lost.

$$\mathbf{A}\mathbf{v} = 0 \cdot \mathbf{v} = \mathbf{0}$$

\(\lambda = 1\) — Fixed Points

Vectors with eigenvalue 1 are unchanged by the transformation. In Markov chains, the steady-state vector has \(\lambda = 1\). In projections, the subspace being projected onto has \(\lambda = 1\).

\(|\lambda| > 1\) vs \(|\lambda| < 1\) — Stability

In dynamical systems: eigenvalues with \(|\lambda| > 1\) cause exponential growth (unstable), while \(|\lambda| < 1\) cause exponential decay (stable). This is the foundation of stability analysis.

4.3 Algebraic vs Geometric Multiplicity

Definition

Algebraic multiplicity = how many times \(\lambda\) appears as a root of the characteristic polynomial.

Geometric multiplicity = number of linearly independent eigenvectors for that \(\lambda\) = dimension of eigenspace = \(\dim\ker(\mathbf{A} - \lambda\mathbf{I})\).

Always: \(1 \leq \text{geometric mult.} \leq \text{algebraic mult.}\)
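The gap between the two multiplicities is easy to see numerically. A sketch using the classic Jordan-block example (a standard illustration, not a matrix from this text):

```python
import numpy as np

# Jordan block: lambda = 2 has algebraic multiplicity 2 ...
J = np.array([[2.0, 1.0],
              [0.0, 2.0]])

w, _ = np.linalg.eig(J)
print(w)  # both eigenvalues equal 2

# ... but geometric multiplicity 1: dim ker(J - 2I) = n - rank(J - 2I).
geom_mult = 2 - np.linalg.matrix_rank(J - 2 * np.eye(2))
print(geom_mult)  # 1, so J has only ONE independent eigenvector
```

Because geometric < algebraic here, this matrix is not diagonalizable (see Section 15).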

Part II — Why, When & What Problems

5. WHY Use Eigenvalues & Eigenvectors?

This is perhaps the most important section. Eigenvalues and eigenvectors are not just an abstract math concept — they solve fundamental problems that appear across every scientific and engineering discipline.

5.1 They Reveal Hidden Structure

A matrix can represent a complicated transformation — shearing, stretching, rotating all at once. Eigenvalues and eigenvectors decompose this mess into simple, independent stretching motions along specific axes. It's like taking a complex sound wave and decomposing it into individual pure frequencies (Fourier transform is deeply connected to eigenvalues!).

$$\text{Complex transformation} \xrightarrow{\text{eigen-analysis}} \text{Simple scalings along independent directions}$$

5.2 They Enable Dimensionality Reduction

In data science, you often have hundreds or thousands of features. Eigenvalues tell you which directions matter most. If 3 eigenvalues are huge and 997 are tiny, your data effectively lives in a 3D subspace. PCA uses exactly this idea to reduce dimensions while preserving maximum information.

$$\text{Variance explained by component } i = \frac{\lambda_i}{\sum_j \lambda_j} \times 100\%$$

5.3 They Simplify Matrix Powers & Exponentials

Computing \(\mathbf{A}^{100}\) directly requires 99 matrix multiplications. But if you know the eigendecomposition \(\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}\), then:

$$\mathbf{A}^{100} = \mathbf{P}\mathbf{D}^{100}\mathbf{P}^{-1}$$

And \(\mathbf{D}^{100}\) is trivial — just raise each diagonal eigenvalue to the 100th power. This is how we solve Markov chains, recurrence relations, and differential equations efficiently.

5.4 They Determine System Stability

In engineering and physics, the eigenvalues of a system matrix tell you whether the system is stable, unstable, or oscillatory. For a continuous-time system \(\dot{\mathbf{x}} = \mathbf{A}\mathbf{x}\):

  1. All eigenvalues have negative real part → perturbations die out (stable)
  2. Any eigenvalue has positive real part → perturbations grow (unstable)
  3. Purely imaginary eigenvalues → sustained oscillation

This is the foundation of control theory — designing systems (autopilot, robotics, etc.) that behave reliably.

5.5 They Solve Differential Equations

Systems of linear differential equations \(\frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}\) have solutions of the form:

$$\mathbf{x}(t) = c_1 e^{\lambda_1 t}\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2 + \cdots$$

Each eigenvector \(\mathbf{v}_i\) is an independent mode of behavior, and each eigenvalue \(\lambda_i\) determines whether that mode grows, decays, or oscillates.

5.6 They Power Ranking Algorithms

Google's PageRank models the web as a huge matrix where entry \((i,j)\) represents the probability of clicking from page \(j\) to page \(i\). The dominant eigenvector (eigenvalue = 1) of this matrix gives the importance ranking of every web page. This eigenvector is the steady-state of a random web surfer.
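The idea can be sketched with power iteration on a toy three-page web (the link probabilities below are invented for illustration, and details like damping are omitted):

```python
import numpy as np

# Column-stochastic link matrix: column j = where a surfer on page j goes next.
M = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])

# Power iteration: repeated clicks converge to the lambda = 1 eigenvector.
r = np.ones(3) / 3
for _ in range(100):
    r = M @ r

print(r)  # long-run probability of being on each page = the ranking
assert np.allclose(M @ r, r)  # r is the eigenvector for lambda = 1
```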

Summary — Why Eigenvalues Matter

They let you: (1) understand the geometry of transformations, (2) compress data, (3) compute matrix powers efficiently, (4) determine if systems are stable, (5) solve differential equations, and (6) rank things. Almost every area of applied math reduces to an eigenvalue problem at some point.

6. WHEN Do We Use Them?

Here's a practical decision guide — when should you reach for eigenvalue/eigenvector analysis?

| Situation | Signal to Use Eigen-Analysis | Technique |
|---|---|---|
| Too many features in your dataset | Need to reduce dimensions | PCA (eigenvectors of covariance matrix) |
| Understanding correlations in data | Covariance matrix is symmetric → guaranteed real eigenvalues | Eigendecomposition of \(\mathbf{\Sigma}\) |
| Random walker on a graph/network | Need steady-state probability | Eigenvector with \(\lambda=1\) of transition matrix |
| Grouping similar items with graph structure | Data has network/graph connections | Spectral clustering (eigenvectors of Laplacian) |
| Will this system blow up over time? | Studying a dynamical system | Check eigenvalues of system matrix |
| Solving \(\frac{d\mathbf{x}}{dt} = \mathbf{A}\mathbf{x}\) | System of linear ODEs | General solution via eigenvalues/vectors |
| Computing \(\mathbf{A}^n\) for large \(n\) | Recurrence relations, Markov chains | Diagonalization: \(\mathbf{A}^n = \mathbf{P}\mathbf{D}^n\mathbf{P}^{-1}\) |
| Recommending products to users | User-item matrix is huge and sparse | SVD / low-rank approximation |
| Image compression | Want to store less data with minimal loss | SVD (closely related to eigendecomposition) |
| NLP: understanding word relationships | Co-occurrence matrix is large | SVD for word embeddings (like LSA) |
| Vibration / structural analysis | Finding natural frequencies | Generalized eigenvalue problem |
| Quantum mechanics | Measuring observable quantities | Eigenvalues of Hermitian operators |
Rule of Thumb

Whenever you see a square matrix that represents relationships, transitions, transformations, or correlations — eigenvalue analysis will likely reveal something important about it. If you're asking "what are the most important directions/modes/patterns?" — you want eigenvectors.

7. Real-World Problems Solved by Eigenvalues & Eigenvectors

7.1 — Data Science & Machine Learning

Principal Component Analysis (PCA)


Problem: You have a dataset with 500 features. Training is slow and there's noise. Which features matter most?

How eigen helps: Compute the covariance matrix \(\mathbf{\Sigma}\) of your data. Its eigenvectors point in the directions of maximum variance (principal components). The eigenvalues tell you how much variance each direction explains. Keep only the top \(k\) eigenvectors to reduce 500 features to, say, 20 — losing almost no information.

$$\mathbf{\Sigma} = \frac{1}{n-1}\mathbf{X}^T\mathbf{X} \;\;(\mathbf{X}\ \text{column-centered}), \quad \mathbf{\Sigma}\mathbf{v}_i = \lambda_i\mathbf{v}_i$$

Recommendation Systems (Netflix, Spotify)


Problem: Millions of users, millions of items, very sparse rating matrix. How to predict what a user will like?

How eigen helps: SVD (built on eigendecomposition) factors the user-item matrix into latent factors. The top eigenvalues/singular values capture the dominant "taste dimensions" — maybe 50 dimensions can represent all the important patterns in millions of ratings.

$$\mathbf{R} \approx \mathbf{U}_k \mathbf{\Sigma}_k \mathbf{V}_k^T$$

Spectral Clustering


Problem: You have data where clusters have irregular shapes — k-means fails badly. But you know which points are "similar."

How eigen helps: Build a similarity graph. Compute the Laplacian matrix \(\mathbf{L} = \mathbf{D} - \mathbf{W}\). The smallest eigenvalues (near zero) of \(\mathbf{L}\) reveal cluster structure — the number of zero eigenvalues equals the number of connected components. The corresponding eigenvectors embed your data into a space where k-means works perfectly.
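The "count the zero eigenvalues" fact can be demonstrated on a toy graph. A sketch, assuming a graph made of two disconnected triangles:

```python
import numpy as np

# Adjacency matrix of two disconnected triangles (6 nodes).
W = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]:
    W[i, j] = W[j, i] = 1.0

D = np.diag(W.sum(axis=1))  # degree matrix
L = D - W                   # graph Laplacian

# L is symmetric, so eigh gives real eigenvalues in ascending order.
w, V = np.linalg.eigh(L)
print(np.round(w, 6))  # exactly two (near-)zero eigenvalues

n_components = int(np.sum(np.abs(w) < 1e-8))
print(n_components)  # 2 connected components, as expected
```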

Natural Language Processing (LSA / LSI)


Problem: A term-document matrix is huge and sparse. Words with similar meanings should be grouped together.

How eigen helps: Latent Semantic Analysis uses SVD to find the hidden (latent) semantic structure. The top singular vectors capture topics — grouping synonyms and separating polysemy automatically.

Face Recognition (Eigenfaces)


Problem: Each face image is a 10,000-pixel vector. How to recognize faces efficiently?

How eigen helps: Apply PCA to a training set of faces. The eigenvectors of the covariance matrix form "eigenfaces" — ghostly base patterns. Any face can be approximated as a weighted sum of ~100 eigenfaces. Recognition becomes comparing 100 weights instead of 10,000 pixels.

7.2 — Physics & Engineering

Structural Engineering — Vibration Analysis


Problem: Will this bridge resonate with wind? At what frequencies might this building collapse during an earthquake?

How eigen helps: The generalized eigenvalue problem \(\mathbf{K}\mathbf{v} = \omega^2 \mathbf{M}\mathbf{v}\) (stiffness and mass matrices) gives the natural frequencies (\(\omega\)) and mode shapes (eigenvectors) of the structure. Engineers design so that no natural frequency matches expected vibration sources.

$$\mathbf{K}\mathbf{v} = \omega^2 \mathbf{M}\mathbf{v}$$

Quantum Mechanics


Problem: What energy levels can an electron have in an atom? What states can a quantum system be in?

How eigen helps: Observable quantities (energy, momentum, spin) are represented by Hermitian operators. The eigenvalues are the possible measurement outcomes, and the eigenvectors are the corresponding quantum states. The famous Schrödinger equation is an eigenvalue problem:

$$\hat{H}\psi = E\psi$$

where \(\hat{H}\) is the Hamiltonian operator, \(E\) is the energy eigenvalue, and \(\psi\) is the wave function (eigenvector).

Control Systems — Autopilot, Robotics


Problem: Design a controller for a drone so it stays stable in wind.

How eigen helps: Model the drone dynamics as \(\dot{\mathbf{x}} = \mathbf{A}\mathbf{x} + \mathbf{B}\mathbf{u}\). The eigenvalues of \(\mathbf{A}\) (or of the closed-loop matrix \(\mathbf{A}-\mathbf{B}\mathbf{K}\)) determine stability. Engineers place eigenvalues in the left half of the complex plane to ensure the system is stable and responsive.

Electrical Circuits — Power Systems


Problem: Analyzing oscillation modes in a large power grid to prevent blackouts.

How eigen helps: The state matrix of the power system has eigenvalues that correspond to different oscillation modes. If any eigenvalue crosses into the right half-plane, the grid becomes unstable. Operators monitor eigenvalues in real-time.

7.3 — Google, Social Networks & Graphs

Google PageRank


Problem: Rank billions of web pages by importance.

How eigen helps: Model the web as a directed graph. The transition matrix \(\mathbf{M}\) describes a random surfer clicking links. The dominant eigenvector (for \(\lambda = 1\)) gives the long-run probability of being on each page — this IS the PageRank.

$$\mathbf{M}\mathbf{r} = \mathbf{r} \quad (\text{i.e., } \lambda = 1)$$

Community Detection in Social Networks


Problem: Find communities (friend groups, interest clusters) in a social network of millions of users.

How eigen helps: The eigenvectors of the modularity matrix reveal community structure. Nodes that share similar eigenvector components belong to the same community. This is how Facebook and Twitter detect user groups at scale.

7.4 — Finance & Economics

Portfolio Optimization & Risk Analysis


Problem: You hold 100 stocks. How correlated are they? What hidden risk factors drive your portfolio?

How eigen helps: Eigen-analysis of the correlation matrix reveals the principal risk factors. The largest eigenvalue typically corresponds to the "market factor" (all stocks move together). Smaller eigenvalues reveal sector-specific risks. This is the foundation of factor models (like Fama-French).

Economic Input-Output Models (Leontief)


Problem: How does a change in one industry's output affect the entire economy?

How eigen helps: Leontief's input-output model uses eigenvalues of the technology matrix to determine if an economy can sustain itself. The dominant eigenvalue (called the Perron-Frobenius eigenvalue) must be less than 1 for the economy to be productive.

7.5 — Biology & Medicine

Population Dynamics (Leslie Matrix)


Problem: Will this animal species grow, decline, or stabilize over time?

How eigen helps: The Leslie matrix encodes birth rates and survival rates for each age group. Its dominant eigenvalue \(\lambda_1\) determines long-term behavior: if \(\lambda_1 > 1\), population grows; if \(\lambda_1 < 1\), it declines; if \(\lambda_1 = 1\), it stabilizes. The eigenvector gives the stable age distribution.
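A minimal sketch with an invented three-age-class Leslie matrix (the fecundities and survival rates below are illustrative, not real data):

```python
import numpy as np

# Row 0: fecundities per age class; sub-diagonal: survival probabilities.
L = np.array([[0.0, 1.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.0, 0.4, 0.0]])

w, V = np.linalg.eig(L)
dom = np.argmax(np.abs(w))
lam = w[dom].real           # dominant (Perron) eigenvalue is real and positive
print(lam)                  # < 1 here, so this population slowly declines

# Stable age distribution = dominant eigenvector, scaled into proportions.
stable = np.abs(V[:, dom].real)
stable /= stable.sum()
print(stable)               # long-run fraction in each age class
```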

Genomics & Bioinformatics (PCA on Gene Expression)


Problem: Gene expression data has 20,000+ genes. Which genes distinguish cancer subtypes?

How eigen helps: PCA on gene expression matrices reveals the dominant patterns. The first few principal components (eigenvectors) often correspond to biological processes like cell cycle, immune response, or tissue type. This is used for cancer subtype classification and drug response prediction.

Epidemic Modeling


Problem: Will a disease become an epidemic? How fast will it spread?

How eigen helps: The basic reproduction number \(R_0\) is the dominant eigenvalue of the next-generation matrix. If \(R_0 > 1\), the disease spreads; if \(R_0 < 1\), it dies out. This was crucial for COVID-19 modeling.

7.6 — Image Processing & Computer Graphics

Image Compression (SVD)


Problem: A 1000×1000 grayscale image has 1,000,000 pixel values. Store it with much less data.

How eigen helps: SVD decomposes the image matrix into singular values. Keep only the top \(k\) singular values (and corresponding vectors). With \(k=50\), you reduce storage by ~90% with barely noticeable quality loss. Each singular value tells you how much "information" that component adds.
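A sketch of the idea on a synthetic "image" (a real photo would need more components; this matrix is built to have rank 2 so a tiny k suffices):

```python
import numpy as np

# Synthetic 100x100 "image": a gradient plus one smooth pattern (rank 2).
x = np.linspace(0, 1, 100)
img = np.outer(x, x) + 0.5 * np.outer(np.sin(4 * x), np.cos(3 * x))

U, s, Vt = np.linalg.svd(img, full_matrices=False)

# Best rank-k approximation (Eckart-Young): keep the top k singular values.
k = 2
img_k = U[:, :k] * s[:k] @ Vt[:k, :]

# Storage: 2*100*k + k numbers instead of 100*100 = 10,000.
err = np.linalg.norm(img - img_k) / np.linalg.norm(img)
print(err)  # essentially zero, since this synthetic image is exactly rank 2
```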

3D Graphics — Rotation & Deformation


Problem: Decompose a complex 3D deformation into rotation + stretching.

How eigen helps: The polar decomposition (which uses eigenvalues) separates any transformation into a pure rotation and a pure stretch. This is essential for physics-based animation, mesh deformation, and motion capture processing.

The Unifying Theme

Every application above reduces to the same core idea: find the most important directions (eigenvectors) and their importance (eigenvalues) in a system described by a matrix. The matrix might represent data correlations, network connections, physical forces, transition probabilities, or quantum states — but the math is the same.

Part III — How to Compute

8. How to Find Eigenvalues — Step by Step

Starting from \(\mathbf{A}\mathbf{v} = \lambda\,\mathbf{v}\), we derive the method:

The Derivation

$$\mathbf{A}\mathbf{v} = \lambda\,\mathbf{v}$$
$$\mathbf{A}\mathbf{v} - \lambda\,\mathbf{v} = \mathbf{0}$$
$$\mathbf{A}\mathbf{v} - \lambda\,\mathbf{I}\mathbf{v} = \mathbf{0}$$
$$(\mathbf{A} - \lambda\,\mathbf{I})\,\mathbf{v} = \mathbf{0}$$

For \(\mathbf{v} \neq \mathbf{0}\) to exist, the matrix \((\mathbf{A} - \lambda\mathbf{I})\) must be singular:

Characteristic Equation
$$\det(\mathbf{A} - \lambda\,\mathbf{I}) = 0$$

This is a polynomial of degree \(n\) in \(\lambda\). Its roots are the eigenvalues. This polynomial is called the characteristic polynomial.

Shortcut for 2×2 Matrices

For \(\mathbf{A} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\), the characteristic equation is always:

$$\lambda^2 - (a+d)\lambda + (ad - bc) = 0$$

That is: \(\lambda^2 - \text{trace}\cdot\lambda + \det = 0\). You can use the quadratic formula directly!
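That shortcut is a few lines of code. A sketch (the helper name `eig_2x2` is ours, not a NumPy function):

```python
import numpy as np

def eig_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]] via lambda^2 - tr*lambda + det = 0."""
    tr, det = a + d, a * d - b * c
    disc = np.sqrt(complex(tr * tr - 4 * det))  # complex-safe square root
    return (tr + disc) / 2, (tr - disc) / 2

print(eig_2x2(4, 1, 2, 3))   # eigenvalues 5 and 2 (see Worked Example 1)
print(eig_2x2(0, -1, 1, 0))  # rotation matrix: complex pair i and -i
```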

9. How to Find Eigenvectors — Step by Step

For each eigenvalue \(\lambda_i\), solve:

$$(\mathbf{A} - \lambda_i\,\mathbf{I})\,\mathbf{v} = \mathbf{0}$$

This is a homogeneous system. Use Gaussian elimination (row reduction) on the matrix \((\mathbf{A} - \lambda_i\mathbf{I})\) to find the free variables, then express the solution in terms of those free variables.

The Eigenspace

The set of ALL eigenvectors for a given \(\lambda\) (plus the zero vector) forms a subspace called the eigenspace \(E_\lambda\). Its dimension equals the geometric multiplicity of \(\lambda\).

$$E_\lambda = \ker(\mathbf{A} - \lambda\mathbf{I}) = \{\mathbf{v} : (\mathbf{A} - \lambda\mathbf{I})\mathbf{v} = \mathbf{0}\}$$
Part IV — Worked Examples & Solutions

10. Worked Example 1 — 2×2 Matrix

Problem: Find the eigenvalues and eigenvectors of:

$$\mathbf{A} = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}$$

Step 1: Characteristic Equation

$$\mathbf{A} - \lambda\mathbf{I} = \begin{pmatrix} 4-\lambda & 1 \\ 2 & 3-\lambda \end{pmatrix}$$
$$\det(\mathbf{A} - \lambda\mathbf{I}) = (4-\lambda)(3-\lambda) - (1)(2) = \lambda^2 - 7\lambda + 10$$

Step 2: Solve

$$\lambda^2 - 7\lambda + 10 = 0 \implies (\lambda - 5)(\lambda - 2) = 0$$
Eigenvalues
$$\lambda_1 = 5, \qquad \lambda_2 = 2$$

Quick check: \(\lambda_1 + \lambda_2 = 7 = 4+3 = \text{trace}(\mathbf{A})\) ✓ and \(\lambda_1 \cdot \lambda_2 = 10 = 4\cdot3 - 1\cdot2 = \det(\mathbf{A})\) ✓

Step 3: Eigenvector for \(\lambda_1 = 5\)

$$(\mathbf{A} - 5\mathbf{I})\mathbf{v} = \begin{pmatrix} -1 & 1 \\ 2 & -2 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

Row 1: \(-v_1 + v_2 = 0 \implies v_2 = v_1\). Let \(v_1 = 1\):

$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

Step 4: Eigenvector for \(\lambda_2 = 2\)

$$(\mathbf{A} - 2\mathbf{I})\mathbf{v} = \begin{pmatrix} 2 & 1 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

Row 1: \(2v_1 + v_2 = 0 \implies v_2 = -2v_1\). Let \(v_1 = 1\):

$$\mathbf{v}_2 = \begin{pmatrix} 1 \\ -2 \end{pmatrix}$$

Step 5: Verify

$$\mathbf{A}\mathbf{v}_1 = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 \\ 5 \end{pmatrix} = 5\begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \checkmark$$
$$\mathbf{A}\mathbf{v}_2 = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 2 \\ -4 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -2 \end{pmatrix} \quad \checkmark$$

11. Worked Example 2 — 3×3 Matrix

Problem: Find the eigenvalues and eigenvectors of:

$$\mathbf{A} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 1 \\ 0 & 1 & 3 \end{pmatrix}$$

Step 1: Characteristic Equation

Expanding along the first row:

$$\det(\mathbf{A} - \lambda\mathbf{I}) = (2-\lambda)\bigl[(3-\lambda)^2 - 1\bigr] = (2-\lambda)(\lambda-4)(\lambda-2)$$
Eigenvalues
$$\lambda_1 = 2 \;\text{(multiplicity 2)}, \qquad \lambda_2 = 4$$

Step 2: Eigenvector for \(\lambda = 4\)

$$\mathbf{A} - 4\mathbf{I} = \begin{pmatrix} -2 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 1 & -1 \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}$$

\(v_1 = 0\), \(v_2 = v_3\). Choose \(v_3 = 1\):

$$\mathbf{v} = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$$

Step 3: Eigenvectors for \(\lambda = 2\) (multiplicity 2)

$$\mathbf{A} - 2\mathbf{I} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix} \xrightarrow{\text{RREF}} \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

\(v_2 = -v_3\); both \(v_1\) and \(v_3\) are free. Two free variables → two independent eigenvectors:

$$\mathbf{v}_a = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{v}_b = \begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix}$$

Geometric multiplicity = 2 = algebraic multiplicity. ✓ (Matrix is diagonalizable!)

12. Worked Example 3 — PCA Application

Problem: You measured height and weight of 100 students. The covariance matrix is:

$$\mathbf{C} = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}$$

Find the principal components and determine how much you can compress the data.

Step 1: Find Eigenvalues

$$\lambda^2 - (4+3)\lambda + (4\cdot3 - 2\cdot2) = 0 \implies \lambda^2 - 7\lambda + 8 = 0$$
$$\lambda = \frac{7 \pm \sqrt{49-32}}{2} = \frac{7 \pm \sqrt{17}}{2}$$
$$\lambda_1 \approx 5.56, \qquad \lambda_2 \approx 1.44$$

Step 2: Find Eigenvectors

For \(\lambda_1 \approx 5.56\): solving \((\mathbf{C} - 5.56\mathbf{I})\mathbf{v} = \mathbf{0}\):

$$\begin{pmatrix} -1.56 & 2 \\ 2 & -2.56 \end{pmatrix}\mathbf{v} = \mathbf{0} \implies v_2 = 0.78\,v_1$$

Normalized: \(\mathbf{v}_1 \approx \begin{pmatrix} 0.79 \\ 0.62 \end{pmatrix}\) — this points roughly 38° from the height axis.

For \(\lambda_2 \approx 1.44\): the eigenvector is perpendicular (symmetric matrix!):

\(\mathbf{v}_2 \approx \begin{pmatrix} -0.62 \\ 0.79 \end{pmatrix}\)

Step 3: Interpretation

$$\text{PC1 variance explained} = \frac{5.56}{5.56 + 1.44} = \frac{5.56}{7} = \mathbf{79.4\%}$$
$$\text{PC2 variance explained} = \frac{1.44}{7} = \mathbf{20.6\%}$$

PC1 (eigenvector 1): A combined "body size" factor — when height increases, weight tends to increase proportionally along this direction.

PC2 (eigenvector 2): The "body shape" factor — variation perpendicular to the main trend (tall-thin vs short-heavy deviation).

Conclusion: By projecting onto PC1 alone, you capture 79.4% of the variance — reducing 2D data to 1D while losing only 20.6% of information.
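The hand calculation can be cross-checked with NumPy. A short sketch; `eigh` is the right routine for symmetric matrices:

```python
import numpy as np

C = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# eigh: real eigenvalues in ASCENDING order, orthonormal eigenvectors.
w, V = np.linalg.eigh(C)
w, V = w[::-1], V[:, ::-1]      # reorder so PC1 comes first

explained = w / w.sum()
print(np.round(w, 2))           # [5.56 1.44]
print(explained[0])             # ~0.794..., PC1 explains ~79% of the variance

# Eigenvectors of a symmetric matrix form an orthogonal basis.
assert np.allclose(V.T @ V, np.eye(2))
```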

13. Worked Example 4 — Markov Chain Steady State

Problem: A customer is either Happy (H) or Unhappy (U). Each month: 80% of happy customers stay happy, 20% become unhappy. 60% of unhappy customers become happy, 40% stay unhappy. What's the long-run distribution?

Step 1: Transition Matrix

$$\mathbf{P} = \begin{pmatrix} 0.8 & 0.6 \\ 0.2 & 0.4 \end{pmatrix}$$

Columns sum to 1 (it's a stochastic matrix). Each column represents "where do people in this state go next?"

Step 2: Find Eigenvalues

$$\lambda^2 - 1.2\lambda + 0.2 = 0 \implies (\lambda - 1)(\lambda - 0.2) = 0$$

\(\lambda_1 = 1\) (guaranteed for stochastic matrices!) and \(\lambda_2 = 0.2\).

Step 3: Eigenvector for \(\lambda = 1\) (Steady State!)

$$(\mathbf{P} - \mathbf{I})\mathbf{v} = \begin{pmatrix} -0.2 & 0.6 \\ 0.2 & -0.6 \end{pmatrix}\mathbf{v} = \mathbf{0}$$

\(-0.2v_1 + 0.6v_2 = 0 \implies v_1 = 3v_2\). Choose \(v_2 = 1\): \(\mathbf{v} = \begin{pmatrix} 3 \\ 1 \end{pmatrix}\)

Normalize to sum to 1 (probability): \(\boldsymbol{\pi} = \begin{pmatrix} 3/4 \\ 1/4 \end{pmatrix} = \begin{pmatrix} 0.75 \\ 0.25 \end{pmatrix}\)

Interpretation

In the long run, 75% of customers are happy and 25% are unhappy, regardless of the initial distribution. The eigenvalue \(\lambda_2 = 0.2\) tells us the system converges to this steady state — the rate \(0.2^n \to 0\) means convergence is fast.
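Both the steady state and the convergence claim can be verified numerically. A short sketch:

```python
import numpy as np

P = np.array([[0.8, 0.6],
              [0.2, 0.4]])

w, V = np.linalg.eig(P)

# Steady state: eigenvector for the eigenvalue closest to 1,
# rescaled so its entries sum to 1 (a probability distribution).
i = np.argmin(np.abs(w - 1.0))
pi = V[:, i].real
pi = pi / pi.sum()
print(pi)  # [0.75 0.25]

# "Regardless of the initial distribution": iterate P from any start.
x = np.array([0.1, 0.9])
for _ in range(50):
    x = P @ x
assert np.allclose(x, pi)  # converged, at rate 0.2^n
```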

14. Worked Example 5 — Differential Equations

Problem: Solve the system of differential equations:

$$\frac{dx}{dt} = 4x + y, \qquad \frac{dy}{dt} = 2x + 3y$$

Step 1: Write in Matrix Form

$$\frac{d}{dt}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 4 & 1 \\ 2 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix}$$

This is \(\dot{\mathbf{x}} = \mathbf{A}\mathbf{x}\), the same matrix from Example 1!

Step 2: Use Our Eigenvalues & Eigenvectors

From Example 1: \(\lambda_1 = 5, \;\mathbf{v}_1 = \begin{pmatrix}1\\1\end{pmatrix}\) and \(\lambda_2 = 2, \;\mathbf{v}_2 = \begin{pmatrix}1\\-2\end{pmatrix}\)

Step 3: General Solution

$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = c_1 e^{5t}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{2t}\begin{pmatrix} 1 \\ -2 \end{pmatrix}$$

Written out:

$$x(t) = c_1 e^{5t} + c_2 e^{2t}$$
$$y(t) = c_1 e^{5t} - 2c_2 e^{2t}$$

Step 4: Physical Interpretation

Both eigenvalues are positive (\(\lambda = 5, 2\)), so this is an unstable node — both modes grow exponentially. The \(e^{5t}\) mode dominates for large \(t\), so the solution eventually aligns with eigenvector \(\begin{pmatrix}1\\1\end{pmatrix}\) — both \(x\) and \(y\) grow at equal rates.

If the eigenvalues were negative, the system would decay to zero (stable). If one were positive and one negative, we'd get a saddle point.
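The eigen-based solution can be checked against a brute-force numerical integration. A sketch (forward Euler with a small step, purely as an independent check):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
w, V = np.linalg.eig(A)

def x_of_t(x0, t):
    """x(t) = V e^{Dt} V^{-1} x0: each eigen-mode grows as e^{lambda t}."""
    c = np.linalg.solve(V, x0)       # coordinates of x0 in the eigenbasis
    return V @ (c * np.exp(w * t))

x0, t = np.array([1.0, 0.0]), 0.5
x_t = x_of_t(x0, t)

# Independent check: crude forward-Euler integration of x' = A x.
x, dt = x0.copy(), 1e-5
for _ in range(int(t / dt)):
    x = x + dt * (A @ x)
assert np.allclose(x, x_t, rtol=1e-3)
```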

Part V — Deep Theory & Related Concepts

15. Eigendecomposition (Diagonalization)

If an \(n \times n\) matrix \(\mathbf{A}\) has \(n\) linearly independent eigenvectors, we can write:

Eigendecomposition
$$\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}$$

where \(\mathbf{P}\) has eigenvectors as columns and \(\mathbf{D}\) is diagonal with eigenvalues:

$$\mathbf{P} = \begin{pmatrix} | & | & & | \\ \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \\ | & | & & | \end{pmatrix}, \quad \mathbf{D} = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix}$$

Why is this powerful?

Matrix Powers Become Trivial

$$\mathbf{A}^k = \mathbf{P}\mathbf{D}^k\mathbf{P}^{-1} = \mathbf{P}\begin{pmatrix} \lambda_1^k & & \\ & \lambda_2^k & \\ & & \ddots \end{pmatrix}\mathbf{P}^{-1}$$

Computing \(\mathbf{A}^{1000}\) reduces to computing \(\lambda_i^{1000}\) — scalar exponentiation!
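A sketch of the trick in NumPy, checked against NumPy's own `matrix_power` on a small matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
w, V = np.linalg.eig(A)

def matrix_power_eig(k):
    """A^k = V D^k V^{-1}: scalar powers instead of repeated matrix products."""
    return (V * w**k) @ np.linalg.inv(V)   # V * w**k scales column i by w_i^k

assert np.allclose(matrix_power_eig(10), np.linalg.matrix_power(A, 10))
# Bonus check: trace(A^10) = sum of lambda_i^10 = 5^10 + 2^10.
assert np.isclose(np.trace(matrix_power_eig(10)), 5**10 + 2**10)
```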

Matrix Exponential (for Differential Equations)

$$e^{\mathbf{A}t} = \mathbf{P}\begin{pmatrix} e^{\lambda_1 t} & & \\ & e^{\lambda_2 t} & \\ & & \ddots \end{pmatrix}\mathbf{P}^{-1}$$
When Diagonalization Fails

Not every matrix can be diagonalized! If the geometric multiplicity is less than the algebraic multiplicity for some eigenvalue, we need the Jordan Normal Form instead. However, symmetric matrices are always diagonalizable — and most matrices in data science are symmetric (covariance, correlation, Laplacian).

16. SVD vs Eigendecomposition

SVD (Singular Value Decomposition) and eigendecomposition are closely related but have important differences:

| Feature | Eigendecomposition | SVD |
|---|---|---|
| Formula | \(\mathbf{A} = \mathbf{P}\mathbf{D}\mathbf{P}^{-1}\) | \(\mathbf{A} = \mathbf{U}\mathbf{\Sigma}\mathbf{V}^T\) |
| Matrix shape | Square only (\(n \times n\)) | Any shape (\(m \times n\)) |
| Always exists? | Not always (need \(n\) independent eigenvectors) | Always exists for any matrix |
| Values | Eigenvalues (can be negative, complex) | Singular values (always non-negative real) |
| Relationship | Eigenvalues of \(\mathbf{A}\) | Singular values \(= \sqrt{\text{eigenvalues of } \mathbf{A}^T\mathbf{A}}\) |
| Key use in DS | PCA, Markov chains, spectral methods | Recommendation systems, NLP, image compression |
Connection

For a symmetric matrix \(\mathbf{A}\): eigendecomposition and SVD are the same thing (up to sign). The singular values equal the absolute values of the eigenvalues.

For a general matrix \(\mathbf{A}\): the left singular vectors are eigenvectors of \(\mathbf{A}\mathbf{A}^T\), and the right singular vectors are eigenvectors of \(\mathbf{A}^T\mathbf{A}\).
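Both relationships are easy to confirm numerically. A sketch on a random rectangular matrix (the matrix itself is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))          # a general rectangular matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# A^T A is symmetric PSD: its eigenvalues are the squared singular values.
w, V_eig = np.linalg.eigh(A.T @ A)
w, V_eig = w[::-1], V_eig[:, ::-1]       # eigh sorts ascending, so reverse

assert np.allclose(w, s**2)

# Right singular vectors = eigenvectors of A^T A, up to sign:
# |V_eig^T V| should be the identity matrix.
assert np.allclose(np.abs(V_eig.T @ Vt.T), np.eye(3), atol=1e-6)
```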

17. The Spectral Theorem

One of the most important theorems in linear algebra, and the theoretical backbone of PCA:

Spectral Theorem (Real Symmetric Matrices)

If \(\mathbf{A}\) is a real symmetric matrix (\(\mathbf{A} = \mathbf{A}^T\)), then:

  1. All eigenvalues are real
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal
  3. \(\mathbf{A}\) can be decomposed as \(\mathbf{A} = \mathbf{Q}\mathbf{D}\mathbf{Q}^T\) where \(\mathbf{Q}\) is orthogonal (\(\mathbf{Q}^{-1} = \mathbf{Q}^T\))

This is why PCA works so cleanly — covariance matrices are symmetric, so their eigenvectors form a perfect orthogonal coordinate system.

Spectral Decomposition (Outer Product Form)

A symmetric matrix can be written as a sum of rank-1 matrices:

$$\mathbf{A} = \lambda_1 \mathbf{v}_1\mathbf{v}_1^T + \lambda_2 \mathbf{v}_2\mathbf{v}_2^T + \cdots + \lambda_n \mathbf{v}_n\mathbf{v}_n^T$$

Each term \(\lambda_i \mathbf{v}_i\mathbf{v}_i^T\) is a projection onto one eigenvector, weighted by its eigenvalue. This form is directly used in PCA: keep the terms with the largest \(\lambda_i\) and discard the rest.
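The outer-product form can be verified directly. A sketch, reusing the symmetric covariance matrix from Worked Example 3:

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])     # symmetric (Worked Example 3's covariance)

w, V = np.linalg.eigh(A)

# Rebuild A as a weighted sum of rank-1 projections v_i v_i^T.
A_rebuilt = sum(w[i] * np.outer(V[:, i], V[:, i]) for i in range(2))
assert np.allclose(A, A_rebuilt)

# PCA's core move: keep only the largest-eigenvalue term.
top = np.argmax(w)
A_rank1 = w[top] * np.outer(V[:, top], V[:, top])
print(np.linalg.norm(A - A_rank1, 2))  # spectral-norm error = discarded lambda
```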

Eigenvalues and eigenvectors are connected to many other concepts. Here's how they all fit together:

Concept | Connection to Eigenvalues/Eigenvectors
Determinant | \(\det(\mathbf{A}) = \prod \lambda_i\). If any \(\lambda_i = 0\), the determinant is 0 and the matrix is singular.
Trace | \(\text{tr}(\mathbf{A}) = \sum \lambda_i\): the trace equals the sum of eigenvalues.
Rank | For diagonalizable matrices, rank = number of non-zero eigenvalues; a rank-deficient matrix has 0 as an eigenvalue.
Inverse | Eigenvalues of \(\mathbf{A}^{-1}\) are \(1/\lambda_i\) (same eigenvectors). The inverse exists only if all \(\lambda_i \neq 0\).
Null Space | The null space is the eigenspace for \(\lambda = 0\).
Positive Definite | A symmetric matrix is positive definite iff ALL eigenvalues > 0. (Covariance matrices are positive semi-definite: all \(\lambda_i \geq 0\).)
Condition Number | For symmetric matrices, \(\kappa(\mathbf{A}) = |\lambda_{\max}|/|\lambda_{\min}|\) (in general, \(\sigma_{\max}/\sigma_{\min}\)). A large condition number means numerical instability.
Matrix Norm | The spectral norm \(\|\mathbf{A}\|_2 = \sigma_{\max}\) (largest singular value). For symmetric \(\mathbf{A}\) it equals \(|\lambda_{\max}|\).
Fourier Transform | Complex exponentials are eigenvectors of every circulant matrix, so Fourier analysis IS eigenvalue analysis of circulant matrices.
Cayley-Hamilton | Every matrix satisfies its own characteristic equation: if \(p(\lambda) = \det(\mathbf{A} - \lambda\mathbf{I})\) is the characteristic polynomial, then \(p(\mathbf{A}) = \mathbf{0}\).

19. All Key Properties & Theorems
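As a starting point, the identities from the related-concepts table (determinant as the product of eigenvalues, trace as their sum, reciprocal eigenvalues of the inverse, and Cayley-Hamilton) can all be verified numerically. A minimal sketch on an arbitrary 2x2 example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])            # eigenvalues: 5 and 2
lam = np.linalg.eigvals(A)

print(np.isclose(lam.prod(), np.linalg.det(A)))    # det(A) = product of eigenvalues
print(np.isclose(lam.sum(), np.trace(A)))          # tr(A)  = sum of eigenvalues
print(np.allclose(np.sort(np.linalg.eigvals(np.linalg.inv(A))),
                  np.sort(1 / lam)))               # eigenvalues of A^-1 are 1/lambda
# Cayley-Hamilton for a 2x2: A^2 - tr(A) A + det(A) I = 0
print(np.allclose(A @ A - np.trace(A) * A + np.linalg.det(A) * np.eye(2), 0))
```

Every line should print True; a False on any of them indicates either a numerically ill-conditioned matrix or a bug upstream.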

Part VI — Practice & Reference

20. Practice Problems (with Solutions)

Problem 1 (Easy) — Diagonal Matrix

Find eigenvalues and eigenvectors of: \(\mathbf{A} = \begin{pmatrix} 3 & 0 \\ 0 & 7 \end{pmatrix}\)

Solution

For diagonal matrices, eigenvalues are just the diagonal entries!

$$\lambda_1 = 3,\; \mathbf{v}_1 = \begin{pmatrix}1\\0\end{pmatrix} \qquad \lambda_2 = 7,\; \mathbf{v}_2 = \begin{pmatrix}0\\1\end{pmatrix}$$

Problem 2 (Medium) — Symmetric Matrix

Find eigenvalues and eigenvectors of: \(\mathbf{B} = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}\)

Solution

Characteristic equation: \(\lambda^2 - 2\lambda - 3 = 0 \implies (\lambda-3)(\lambda+1)=0\)

$$\lambda_1 = 3,\; \mathbf{v}_1 = \begin{pmatrix}1\\1\end{pmatrix} \qquad \lambda_2 = -1,\; \mathbf{v}_2 = \begin{pmatrix}1\\-1\end{pmatrix}$$

Note the eigenvectors are orthogonal (dot product = 0), as guaranteed by the Spectral Theorem for symmetric matrices!

Problem 3 (Medium) — Markov Chain

A particle moves between states A and B. From A: 70% stay, 30% go to B. From B: 50% stay, 50% go to A. Find the steady-state distribution.

Solution

Transition matrix: \(\mathbf{P} = \begin{pmatrix}0.7 & 0.5 \\ 0.3 & 0.5\end{pmatrix}\)

For \(\lambda = 1\): \((\mathbf{P}-\mathbf{I})\mathbf{v}=\mathbf{0}\) gives \(-0.3v_1 + 0.5v_2 = 0 \implies v_1 = \frac{5}{3}v_2\).

Normalizing: \(\boldsymbol{\pi} = \begin{pmatrix}5/8 \\ 3/8\end{pmatrix} = \begin{pmatrix}0.625 \\ 0.375\end{pmatrix}\)

Long run: 62.5% in state A, 37.5% in state B.
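The same steady state drops out of the eigendecomposition: take the eigenvector for \(\lambda = 1\) and rescale it to sum to 1. A minimal sketch:

```python
import numpy as np

P = np.array([[0.7, 0.5],
              [0.3, 0.5]])            # column-stochastic transition matrix from above

evals, evecs = np.linalg.eig(P)
k = np.argmin(np.abs(evals - 1))      # locate the eigenvalue closest to 1
pi = np.real(evecs[:, k])
pi = pi / pi.sum()                    # normalize into a probability vector
print(pi)                             # [0.625 0.375]
```

Dividing by the sum also fixes the arbitrary sign that `eig` may attach to the eigenvector.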

Problem 4 (Challenge) — Complex Eigenvalues

Find eigenvalues of: \(\mathbf{C} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}\)

Solution

\(\lambda^2 + 1 = 0 \implies \lambda = \pm i\) (pure imaginary!)

This is a 90° rotation matrix. No real vector remains on its line after rotation, which is why eigenvalues are complex. Pure imaginary eigenvalues = pure rotation (no growth or decay) = undamped oscillation.
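A quick numerical check (note that `np.linalg.eig` returns complex values here):

```python
import numpy as np

C = np.array([[0.0, -1.0],
              [1.0,  0.0]])           # 90-degree rotation matrix

evals, _ = np.linalg.eig(C)
print(evals)                          # [0.+1.j 0.-1.j] (order may vary)
# Both eigenvalues sit on the unit circle: rotation neither grows nor decays
print(np.allclose(np.abs(evals), 1.0))   # True
print(np.allclose(np.real(evals), 0.0))  # True: purely imaginary
```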

Problem 5 (Challenge) — Prove the Properties

Given \(\mathbf{A}\mathbf{v} = \lambda\mathbf{v}\), prove that \(\mathbf{A}^2\mathbf{v} = \lambda^2\mathbf{v}\).

Solution
$$\mathbf{A}^2\mathbf{v} = \mathbf{A}(\mathbf{A}\mathbf{v}) = \mathbf{A}(\lambda\mathbf{v}) = \lambda(\mathbf{A}\mathbf{v}) = \lambda(\lambda\mathbf{v}) = \lambda^2\mathbf{v} \quad \blacksquare$$

By induction, this generalizes to \(\mathbf{A}^k\mathbf{v} = \lambda^k\mathbf{v}\) for all positive integers \(k\).
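The generalization is easy to confirm numerically with `np.linalg.matrix_power`; a minimal sketch on an arbitrary 2x2 example:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
evals, evecs = np.linalg.eig(A)

k = 5
Ak = np.linalg.matrix_power(A, k)     # A^5
for i in range(len(evals)):
    v = evecs[:, i]
    # A^k v should equal lambda^k v for every eigenpair
    print(np.allclose(Ak @ v, evals[i] ** k * v))   # True
```

This identity is exactly why diagonalization makes matrix powers cheap.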

Problem 6 (Hard) — 3×3 with Repeated Eigenvalue

Find eigenvalues and eigenvectors of: \(\mathbf{A} = \begin{pmatrix} 5 & 4 & 2 \\ 4 & 5 & 2 \\ 2 & 2 & 2 \end{pmatrix}\)

Solution

The characteristic polynomial is \(-\lambda^3 + 12\lambda^2 - 21\lambda + 10 = 0\), which factors as \(-(\lambda-10)(\lambda-1)^2 = 0\).

Eigenvalues: \(\lambda_1 = 10\), \(\lambda_2 = 1\) (multiplicity 2).

For \(\lambda = 10\): \(\mathbf{v} = \begin{pmatrix}2\\2\\1\end{pmatrix}\)

For \(\lambda = 1\): the eigenspace is the plane \(2v_1 + 2v_2 + v_3 = 0\). Two independent eigenvectors: \(\mathbf{v}_a = \begin{pmatrix}1\\-1\\0\end{pmatrix}\), \(\mathbf{v}_b = \begin{pmatrix}1\\0\\-2\end{pmatrix}\) (any two independent vectors in this 2D eigenspace work).

Since geometric multiplicity equals algebraic multiplicity for both eigenvalues, the matrix is diagonalizable. And since \(\mathbf{A}\) is symmetric, eigenvectors belonging to distinct eigenvalues are automatically orthogonal; within the repeated \(\lambda = 1\) eigenspace you can always choose an orthogonal pair (e.g. via Gram-Schmidt).
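For symmetric matrices like this one, `np.linalg.eigh` hands back an already-orthonormal eigenvector set, including an orthogonal pair inside the repeated \(\lambda = 1\) eigenspace. A quick check:

```python
import numpy as np

A = np.array([[5.0, 4.0, 2.0],
              [4.0, 5.0, 2.0],
              [2.0, 2.0, 2.0]])

evals, Q = np.linalg.eigh(A)          # sorted ascending: 1, 1, 10
print(np.round(evals, 6))             # [ 1.  1. 10.]
print(np.allclose(Q.T @ Q, np.eye(3)))            # True: columns are orthonormal
print(np.allclose(Q @ np.diag(evals) @ Q.T, A))   # True: A = Q D Q^T
```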

21. Common Mistakes to Avoid

Mistake 1: Forgetting \(\mathbf{v} \neq \mathbf{0}\)

The zero vector \(\mathbf{v} = \mathbf{0}\) always satisfies \(\mathbf{A}\mathbf{v} = \lambda\mathbf{v}\) for any \(\lambda\). But it's NOT an eigenvector! Eigenvectors must be non-zero.

Mistake 2: Confusing Eigenvalue 0 with "No Eigenvalue"

\(\lambda = 0\) IS a valid eigenvalue. It means the eigenvector gets mapped to the zero vector: \(\mathbf{A}\mathbf{v} = \mathbf{0}\). This happens when \(\mathbf{A}\) is singular.

Mistake 3: Wrong Determinant Setup

It's \(\det(\mathbf{A} - \lambda\mathbf{I}) = 0\), NOT \(\det(\mathbf{A}) - \lambda = 0\). The \(\lambda\) goes on the diagonal through the identity matrix, then you take the determinant of the whole thing.

Mistake 4: Assuming All Matrices Are Diagonalizable

The matrix \(\begin{pmatrix}1 & 1\\0 & 1\end{pmatrix}\) has \(\lambda = 1\) (double), but only one eigenvector \(\begin{pmatrix}1\\0\end{pmatrix}\). It's NOT diagonalizable. You need Jordan form.

Mistake 5: Non-Square Matrices

Only square matrices have eigenvalues and eigenvectors. For non-square matrices, use SVD instead.

Mistake 6: Thinking Eigenvectors Are Unique

Any scalar multiple of an eigenvector is also an eigenvector. The direction matters, not the length. That's why we often normalize to unit length.

22. Quick Reference — The Recipe

  1. Write out \(\mathbf{A} - \lambda\,\mathbf{I}\)
  2. Compute \(\det(\mathbf{A} - \lambda\,\mathbf{I}) = 0\) → characteristic equation
  3. Solve for \(\lambda\) → these are your eigenvalues
  4. For each \(\lambda\), solve \((\mathbf{A} - \lambda\,\mathbf{I})\,\mathbf{v} = \mathbf{0}\) via row reduction → eigenvectors
  5. Verify by checking \(\mathbf{A}\mathbf{v} = \lambda\,\mathbf{v}\)
  6. Quick sanity check: do eigenvalues sum to trace? Multiply to determinant?
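The recipe can also be run programmatically for small matrices: `np.poly` returns the characteristic-polynomial coefficients of a square matrix and `np.roots` solves the characteristic equation. A sketch (for hand-checking; in practice call `np.linalg.eig` directly):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Steps 1-3: characteristic polynomial det(A - lambda I), then its roots
coeffs = np.poly(A)                   # for a 2x2: [1, -tr(A), det(A)]
lambdas = np.roots(coeffs)            # the eigenvalues
print(np.sort(lambdas))               # [2. 5.]

# Step 6: sanity checks against trace and determinant
print(np.isclose(lambdas.sum(), np.trace(A)))          # True
print(np.isclose(lambdas.prod(), np.linalg.det(A)))    # True
```

Step 4 (row-reducing \((\mathbf{A} - \lambda\mathbf{I})\mathbf{v} = \mathbf{0}\)) is still done by hand or left to `np.linalg.eig`.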

23. Python / NumPy Cheat Sheet

# === Basic Eigenvalue Computation ===
import numpy as np

A = np.array([[4, 1],
              [2, 3]])

# Get eigenvalues and eigenvectors
eigenvalues, eigenvectors = np.linalg.eig(A)

print("Eigenvalues:", eigenvalues)
# Output: [5. 2.]  (order is not guaranteed)

print("Eigenvectors (columns):")
print(eigenvectors)

# === For Symmetric Matrices (Faster & More Stable) ===
C = np.array([[4, 2],
              [2, 3]])

eigenvalues, eigenvectors = np.linalg.eigh(C)  # 'h' = Hermitian/symmetric
# Returns sorted eigenvalues and orthonormal eigenvectors

# === PCA from Scratch ===
X = np.random.randn(100, 5)  # 100 samples, 5 features
X_centered = X - X.mean(axis=0)
cov_matrix = np.cov(X_centered.T)

evals, evecs = np.linalg.eigh(cov_matrix)
# Sort by descending eigenvalue
idx = np.argsort(evals)[::-1]
evals = evals[idx]
evecs = evecs[:, idx]

# Variance explained
variance_ratio = evals / evals.sum()
print("Variance explained:", variance_ratio)

# Project onto top 2 components
X_pca = X_centered @ evecs[:, :2]

# === Verify: A @ v = lambda * v ===
eigenvalues, eigenvectors = np.linalg.eig(A)  # recompute for A (the variables were reused above)
for i in range(len(eigenvalues)):
    lhs = A @ eigenvectors[:, i]
    rhs = eigenvalues[i] * eigenvectors[:, i]
    print(f"Check λ={eigenvalues[i]:.2f}:",
          np.allclose(lhs, rhs))  # Should print True

# === SVD ===
U, S, Vt = np.linalg.svd(A)
# S contains singular values
# For symmetric A: singular values = |eigenvalues|

# === Useful: Condition Number ===
cond = np.linalg.cond(A)
# = σ_max / σ_min in general (for symmetric A: |λ_max| / |λ_min|).
# Large = numerically unstable


"The theory of eigenvalues is one of the great achievements of mathematics. Virtually every branch of science and engineering uses it."

Master eigenvalues and eigenvectors, and PCA, spectral methods, differential equations, and dynamical systems will feel natural.

© 2026 Sim Vattanac. All rights reserved.