Lecture 18

[[lecture-data]]

2024-10-09

Readings

4. Chapter 4

Interlacing II / Inclusion Principle

Suppose $A \in M_n$ is Hermitian and $B$ is an $r \times r$ principal submatrix of $A$. Then for all $k$,

$$\lambda_k(A) \le \lambda_k(B) \le \lambda_{k+n-r}(A)$$
Proof (via Courant-Fischer)

Say $B$ comes from $A$ by deleting rows and columns $i_1, i_2, \dots, i_{n-r}$.

$$\begin{aligned}
\lambda_k(A) &= \max_{y_1, y_2, \dots, y_{k-1} \in \mathbb{C}^n} \;\; \min_{\substack{x \in \mathbb{C}^n \setminus \{0\} \\ x \perp y_i}} \frac{x^* A x}{x^* x} \\
&\le \max_{y_1, \dots, y_{k-1}} \;\; \min_{\substack{x \perp y_i \\ x \perp e_{i_1}, \dots, e_{i_{n-r}}}} \frac{x^* A x}{x^* x} \\
&\overset{(*)}{=} \max_{v_1, v_2, \dots, v_{k-1} \in \mathbb{C}^r} \;\; \min_{\substack{z \in \mathbb{C}^r \setminus \{0\} \\ z \perp v_1, \dots, v_{k-1}}} \frac{z^* B z}{z^* z} = \lambda_k(B) \quad \text{by Courant-Fischer}
\end{aligned}$$
  • Note that the $e_{i_j}$ are the standard basis vectors with the $1$ in position $i_j$
  • The $z$'s are the $x$'s with the $i_j$ components removed (those components are zero, since $x$ is orthogonal to the corresponding standard basis vectors)
  • We can then "perform surgery" on the $y$'s as well, dropping those coordinates to get $v_1, \dots, v_{k-1} \in \mathbb{C}^r$

(see inclusion principle)
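The interlacing bounds above are easy to check numerically. A minimal sketch: the random Hermitian matrix, the seed, and the deleted indices are arbitrary choices, and `eigvalsh` returns eigenvalues in the increasing order used in this lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 6, 4

# Random Hermitian matrix: (G + G*) / 2
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (G + G.conj().T) / 2

# Principal submatrix B: delete rows/columns i_1, ..., i_{n-r}
# (here indices 1 and 4), i.e. keep the rest
keep = [0, 2, 3, 5]
B = A[np.ix_(keep, keep)]

lam_A = np.linalg.eigvalsh(A)  # increasing order
lam_B = np.linalg.eigvalsh(B)

# lambda_k(A) <= lambda_k(B) <= lambda_{k+n-r}(A) for k = 1, ..., r
# (0-indexed below; small tolerance for floating point)
interlaces = all(
    lam_A[k] <= lam_B[k] + 1e-9 and lam_B[k] <= lam_A[k + n - r] + 1e-9
    for k in range(r)
)
print(interlaces)  # → True
```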

Corollary

Suppose $A \in M_n$ is Hermitian and $\hat{A}$ is an $(n-1) \times (n-1)$ principal submatrix. Then

$$\lambda_1(A) \le \lambda_1(\hat{A}) \le \lambda_2(A) \le \lambda_2(\hat{A}) \le \lambda_3(A) \le \cdots \le \lambda_{n-1}(\hat{A}) \le \lambda_n(A)$$

(see the eigenvalues of a hermitian matrix and a principal submatrix one smaller alternate in magnitude)

Corollary

For any Hermitian $A \in M_n$ and every $i$, we have $\lambda_1(A) \le a_{ii} \le \lambda_n(A)$.

This follows immediately from the inclusion principle, since each diagonal entry is a $1 \times 1$ principal submatrix.

(see the diagonal entries of a hermitian matrix are bounded by the eigenvalues)
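This bound can also be confirmed numerically; a quick sketch, again with an arbitrary random Hermitian matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (G + G.conj().T) / 2  # Hermitian, so its diagonal is real

lam = np.linalg.eigvalsh(A)  # increasing: lam[0] = lambda_1, lam[-1] = lambda_n
d = np.real(np.diag(A))

# lambda_1(A) <= a_ii <= lambda_n(A) for every i
bounded = bool(np.all((lam[0] - 1e-9 <= d) & (d <= lam[-1] + 1e-9)))
print(bounded)  # → True
```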

Majorization

Let $x, y \in \mathbb{R}^n$. Say we can order the components of $x$ so that $x_1 \le \cdots \le x_n$, and likewise order the components of $y$ so that $y_{k_1} \le \cdots \le y_{k_n}$. We say $x$ majorizes $y$ precisely when

  • For all $m$, $\sum_{i=1}^m x_i \ge \sum_{i=1}^m y_{k_i}$, and
  • equality holds when $m = n$

(see majorization)
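A small helper makes the definition concrete. This is a sketch: the function name `majorizes` is my own, and components are sorted in the increasing order used above.

```python
import numpy as np

def majorizes(x, y, tol=1e-9):
    """Check whether x majorizes y per the lecture's definition:
    sort both increasingly; partial sums of x dominate those of y,
    with equality for the full sums."""
    xs, ys = np.sort(x), np.sort(y)        # increasing order
    px, py = np.cumsum(xs), np.cumsum(ys)  # partial sums
    return bool(np.all(px >= py - tol) and abs(px[-1] - py[-1]) < tol)

print(majorizes([1, 2, 3], [0, 2, 4]))  # → True  (1>=0, 3>=2, 6=6)
print(majorizes([0, 2, 4], [1, 2, 3]))  # → False (0 < 1)
```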

Theorem

Let $A \in M_n$ be Hermitian. Then the vector of diagonal elements of $A$, call it $\operatorname{diag} A$, majorizes the vector of ordered eigenvalues of $A$, call it $\Lambda(A)$.

Proof via induction on $n$

The case $n = 1$ is trivially true. Assume the theorem holds for all matrices of size up to some fixed $k - 1$. Consider the case $n = k$.

Let $B$ be the submatrix of $A$ obtained by deleting row and column $k$, where the diagonal entries of $A$ are ordered $a_{11} \le a_{22} \le \cdots \le a_{kk}$.

For all $m = 1, \dots, k-1$, we have $\sum_{i=1}^m \lambda_i(A) \le \sum_{i=1}^m \lambda_i(B)$ by interlacing 2 (since $\lambda_i(A) \le \lambda_i(B)$ for each $i$). Then by the induction hypothesis, we have that $\sum_{i=1}^m \lambda_i(B) \le \sum_{i=1}^m a_{ii}$. But for the case $m = k$, we have that

$$\sum_{i=1}^k \lambda_i(A) = \operatorname{Tr}(A) = \sum_{i=1}^k a_{ii}$$

Thus the theorem holds

(see diagonal elements of a hermitian matrix majorize its eigenvalues)
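The theorem can be sanity-checked numerically. A sketch with an arbitrary random Hermitian matrix, where "majorizes" follows the lecture's increasing-order convention: partial sums of the sorted diagonal dominate those of the sorted eigenvalues, with equality for the full sums.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (G + G.conj().T) / 2

diag = np.sort(np.real(np.diag(A)))  # diagonal of A, increasing order
lam = np.linalg.eigvalsh(A)          # eigenvalues, increasing order

p_diag, p_lam = np.cumsum(diag), np.cumsum(lam)
diag_majorizes_lam = bool(
    np.all(p_diag >= p_lam - 1e-9)           # partial sums dominate
    and abs(p_diag[-1] - p_lam[-1]) < 1e-9   # full sums agree (both = Tr A)
)
print(diag_majorizes_lam)  # → True
```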

Corollary

Let $A \in M_n$ be Hermitian and let $r \in \mathbb{Z}$ with $1 \le r \le n$. Then

$$\sum_{k=1}^r \lambda_k(A) = \min_{\substack{U \in M_{n,r} \\ \text{orthonormal cols}}} \operatorname{Tr}(U^* A U)$$

and also

$$\sum_{k=0}^{r-1} \lambda_{n-k}(A) = \max_{\substack{U \in M_{n,r} \\ \text{orthonormal cols}}} \operatorname{Tr}(U^* A U)$$

And claim that this implies the previous result: just take the columns of $U$ to be columns of the identity, so that $\operatorname{Tr}(U^* A U)$ is a sum of $r$ diagonal entries of $A$.

Proof / Intuition

Given any $U \in M_{n,r}$ with orthonormal columns, extend via Gram-Schmidt to get $V = [\,U \mid \cdots\,] \in M_n$ unitary. Then when we do the multiplication, the upper-left $r \times r$ block is

$$[V^* A V]_{r \times r} = U^* A U$$

So by interlacing 2, we get that

$$\lambda_k(A) = \lambda_k(V^* A V) \le \lambda_k(U^* A U)$$

So sum over $k = 1, 2, \dots, r$ to get

$$\sum_{k=1}^r \lambda_k(A) \le \operatorname{Tr}(U^* A U)$$

So we have the desired lower bound, and need to show we can achieve equality.

Let $W \in M_{n,r}$ be the matrix whose columns are orthonormalized eigenvectors associated with $\lambda_1(A), \dots, \lambda_r(A)$.

Taking $U = W$:
$$\operatorname{Tr}(U^* A U) = \operatorname{Tr}(W^* A W) = \operatorname{Tr} D_r, \qquad D_r = \operatorname{diag}(\lambda_1(A), \dots, \lambda_r(A))$$

(the sum of the first $r$ eigenvalues)

(see the sum of the first least eigenvalues is the minimum of the trace of orthonormal multiplications)
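Both trace characterizations can be sanity-checked numerically. A sketch: the random matrix and number of trials are arbitrary choices, and `eigh` returns increasing eigenvalues alongside orthonormal eigenvector columns.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 6, 3
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (G + G.conj().T) / 2

lam, W = np.linalg.eigh(A)  # increasing eigenvalues, orthonormal eigenvectors
lower = lam[:r].sum()       # sum of the r smallest eigenvalues (the min)
upper = lam[-r:].sum()      # sum of the r largest eigenvalues (the max)

def random_U():
    """Random n x r matrix with orthonormal columns, via reduced QR."""
    M = rng.standard_normal((n, r)) + 1j * rng.standard_normal((n, r))
    Q, _ = np.linalg.qr(M)
    return Q

# Every Tr(U* A U) lands between the two extremal sums
traces = [np.real(np.trace(U.conj().T @ A @ U)) for U in (random_U() for _ in range(50))]
bounds_hold = all(lower - 1e-9 <= t <= upper + 1e-9 for t in traces)

# The min is achieved by the eigenvectors of the r smallest eigenvalues
Umin = W[:, :r]
achieved = bool(np.isclose(np.real(np.trace(Umin.conj().T @ A @ Umin)), lower))
print(bounds_hold and achieved)  # → True
```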