Chapter 5. Inner Product Spaces, Orthogonal Projection, Least Squares, and Singular Value Decomposition
Tải bản đầy đủ - 0trang
5-2
Handbook of Linear Algebra
A Hermitian matrix A is positive definite if x∗ Ax > 0 for all nonzero x ∈ Cn . (See Chapter 8 for more
information on positive definite matrices.)
Facts:
All the following facts except those with a specific reference can be found in [Rom92, pp. 157–164].
1. The vector space Rn is an inner product space under the standard inner product, or dot product,
defined by
n
u, v = uT v =
ui v i .
i =1
This inner product space is often called n–dimensional Euclidean space.
2. The vector space Cn is an inner product space under the standard inner product, defined by
n
u, v = v∗ u =
ui v¯ i .
i =1
This inner product space is often called n-dimensional unitary space.
3. [HJ85, p. 410] In Rn , a function ·, · : Rn × Rn → R is an inner product if and only if there exists
a real symmetric positive definite matrix G such that u, v = uT G v, for all u, v ∈ Rn .
4. [HJ85, p. 410] In Cn , a function ·, · : Cn × Cn → C is an inner product if and only if there exists
a Hermitian positive definite matrix H such that u, v = v∗ Hu, for all u, v ∈ Cn .
5. Let l 2 be the vector space of all infinite complex sequences v = (v n ) with the property that
∞
|v n |2 < ∞. Then l 2 is an inner product space under the inner product
n=1
∞
un v¯ n .
u, v =
n=1
6. The vector space C [a, b] of all continuous real-valued functions on the closed interval [a, b] is an
inner product space under the inner product
f, g =
b
f (x)g (x)d x.
a
7. If V is an inner product space and u, w = v, w for all w ∈ V , then u = v.
8. The inner product on an inner product space V , when restricted to vectors in a subspace S of V ,
is an inner product on S.
9. Let V be an inner product space. Then the norm function · on V has the following basic
properties for all u, v ∈ V :
r
v ≥ 0 and v = 0 if and only if v = 0.
r
av = |a| v , for all a ∈ F .
r (The triangle inequality) u + v
a ∈ F.
≤ u + v with equality if and only if v = au, for some
r (The Cauchy–Schwarz inequality) | u, v | ≤ u
a ∈ F.
r | u − v |≤ u−v .
v with equality if and only if v = au, for some
5-3
Inner Product Spaces, Orthogonal Projection, Least Squares
r (The parallelogram law) u + v 2 + u − v 2 = 2 u 2 + 2 v 2 .
r (Polarization identities)
4 u, v =
⎧
⎨ u+v
2
− u − v 2 , if F = R.
⎩ u+v
2
− u−v
2
+ i u + iv
2
− i u − i v 2 , if F = C.
Examples:
1. Let R4 be the Euclidean space with the inner product u, v = uT v. Let x = [1, 2, 3, 4]T ∈ R4 and
y = [3, −1, 0, 2]T ∈ R4 be two vectors. Then
√
√
r x, y = 9, x = 30, and y = 14.
√
r The distance between x and y is d(x, y) = x − y = 26.
9
r The angle between x and y is θ = arccos √ 9√
= arccos √
≈ 1.116 radians.
30 14
2 105
2. u, v = u1 v 1 + 2u1 v 2 + 2u2 v 1 + 6u2 v 2 = uT
G=
1
2
2
v is an inner product on R2 , as the matrix
6
1 2
is symmetric positive definite.
2 6
3. Let C [−1, 1] be the vector space with the inner product f, g =
and g (x) = x 2 be two functions in C [−1, 1]. Then f, g =
and g , g =
1
−1
1
−1
1
−1
f (x)g (x)d x and let f (x) = 1
x 2 d x = 2/3, f, f =
1
−1
1d x = 2,
√
x 4 d x = 2/5. The angle between f and g is arccos( 5/3) ≈ 0.730 radians.
4. [Mey00, p. 286] A, B = tr(AB ∗ ) is an inner product on Cm×n .
5.2
Orthogonality
Definitions:
Let V be an inner product space. Two vectors u, v ∈ V are orthogonal if u, v = 0, and this is denoted
by u ⊥ v.
A subset S of an inner product space V is an orthogonal set if u ⊥ v, for all u, v ∈ S such that u = v.
A subset S of an inner product space V is an orthonormal set if S is an orthogonal set and each v ∈ S
is a unit vector.
Two subsets S and W of an inner product space V are orthogonal if u ⊥ v, for all u ∈ S and v ∈ W,
and this is denoted by S ⊥ W.
The orthogonal complement of a subset S of an inner product space V is S ⊥ = {w ∈ V | w, v =
0 for all v ∈ S}.
A complete orthonormal set M in an inner product space V is an orthonormal set of vectors in V such
that for v ∈ V , v ⊥ M implies that v = 0.
An orthogonal basis for an inner product space V is an orthogonal set that is also a basis for V .
An orthonormal basis for V is an orthonormal set that is also a basis for V .
A matrix U is unitary if U ∗ U = I .
A real matrix Q is orthogonal if Q T Q = I .
5-4
Handbook of Linear Algebra
Facts:
1. [Mey00, p. 298] An orthogonal set of nonzero vectors is linearly independent. An orthonormal set
of vectors is linearly independent.
2. [Rom92, p. 164] If S is a subset of an inner product space V , then S ⊥ is a subspace of V . Moreover,
if S is a subspace of V , then S ∩ S ⊥ = {0}.
3. [Mey00, p. 409] In an inner product space V , {0}⊥ = V and V ⊥ = {0}.
4. [Rom92, p. 168] If S is a finite dimensional subspace of an inner product space V , then for any
v ∈ V,
r There are unique vectors s ∈ S and t ∈ S ⊥ such that v = s + t. This implies V = S ⊕ S ⊥ .
r There is a unique linear operator P such that P (v) = s.
5. [Mey00, p. 404] If S is a subspace of an n−dimensional inner product space V , then
r (S ⊥ )⊥ = S.
r dim(S ⊥ ) = n − dim(S).
6. [Rom92, p. 174] If S is a subspace of an infinite dimensional inner product space, then S ⊆ (S ⊥ )⊥ ,
but the two sets need not be equal.
7. [Rom92, p. 166] An orthonormal basis is a complete orthonormal set.
8. [Rom92, p. 166] In a finite-dimensional inner product space, a complete orthonormal set is a basis.
9. [Rom92, p. 165] In an infinite-dimensional inner product space, a complete orthonormal set may
not be a basis.
10. [Rom92, p. 166] Every finite-dimensional inner product space has an orthonormal basis.
11. [Mey00, p. 299] Let B = {u1 , u2 , . . . , un } be an orthonormal basis for V . Every vector v ∈ V can
be uniquely expressed as
n
v, ui ui .
v=
i =1
The expression on the right is called the Fourier expansion of v with respect to B and the scalars
v, ui are called the Fourier coefficients.
12. [Mey00, p. 305] (Pythagorean Theorem) If {vi }ik=1 is an orthogonal set of vectors in V , then
k
2
= ik=1 vi 2 .
i =1 vi
13. [Rom92, p. 167] (Bessel’s Inequality) If {ui }ik=1 is an orthonormal set of vectors in V , then
v 2 ≥ ik=1 | v, ui |2 .
14. [Mey00, p. 305] (Parseval’s Identity) Let B = {u1 , u2 , . . . , un } be an orthonormal basis for V . Then
n
for each v ∈ V , v
2
=
| v, ui |2 .
i =1
m×n
15. [Mey00, p. 405] Let A ∈ F
, where F = R or C. Then
r ker(A)⊥ = range(A∗ ), range(A)⊥ = ker(A∗ ).
r F m = range(A) ⊕ range(A)⊥ = range(A) ⊕ ker(A∗ ).
r F n = ker(A) ⊕ ker(A)⊥ = ker(A) ⊕ range(A∗ ).
16. [Mey00, p. 321] (See also Section 7.1.) The following statements for a real matrix Q ∈ Rn×n are
equivalent:
r Q is orthogonal.
r Q has orthonormal columns.
r Q has orthonormal rows.
r Q Q T = I , where I is the identity matrix of order n.
r For all v ∈ Rn , Qv = v .
5-5
Inner Product Spaces, Orthogonal Projection, Least Squares
17. [Mey00, p. 321] (See also Section 7.1.) The following statements for a complex matrix U ∈ Cn×n
are equivalent:
r U is unitary.
r U has orthonormal columns.
r U has orthonormal rows.
r U U ∗ = I , where I is the identity matrix of order n.
r For all v ∈ Cn , U v = v .
Examples:
1. Let C [−1, 1] be the vector space with the inner product f, g =
and g (x) = x be two functions in C [−1, 1]. Then f, g =
1
−1
1
−1
f (x)g (x)d x and let f (x) = 1
xd x = 0. Thus, f ⊥ g .
2. The standard basis {e1 , e2 , . . . , en } is an orthonormal basis for the unitary space Cn .
3. If {v1 , v2 , · · · , vn } is an orthogonal basis for Cn and S = span {v1 , v2 , · · · , vk } (1 ≤ k ≤ n − 1), then
S ⊥ = span {vk+1 , · · · , vn }.
They
4. The vectors v1 = [2, 2, 1]T , v2 = [1, −1, 0]T , and v3 = [−1, −1, 4]T are mutually
√ orthogonal.
√
T
T
=
v
/
v
=
[2/3,
2/3,
1/3]
,
u
=
v
/
v
=
[1/
2,
−1/
2,
0]
, and
can be normalized to u√
1
1 √1
2
2
2
√
u3 = v3 / v3 = [− 2/6, − 2/6, 2 2/3]T . The set B = {u1 , u2 , u3 } forms an orthonormal
basis for the Euclidean space R3 .
r If v = [v , v , v ]T ∈ R3 , then v = v, u u + v, u u + v, u u , that is,
1 2 3
1 1
2 2
3 3
v=
2v 1 + 2v 2 + v 3
v1 − v2
−v 1 − v 2 + 4v 3
√
u3 .
u1 + √ u2 +
3
2
3 2
r The matrix Q = [u , u , u ] ∈ R3×3 is an orthogonal matrix.
1 2 3
5. Let S be the subspace of C3 spanned by the vectors u = [i, 1, 1]T and v = [1, i, 1]T . Then the
orthogonal complement of S is
S ⊥ = {w|w = α[1, 1, −1 + i ]T , where α ∈ C}.
6. Consider the inner product space l 2 from Fact 5 in Section 5.1. Let E = {ei |i = 1, 2, . . .}, where
ei has a 1 on i th place and 0s elsewhere. It is clear that E is an orthonormal set. If v = (v n ) ⊥ E,
then for each n, v n = v, en = 0. This implies v = 0. Therefore, E is a complete orthonormal set.
However, E is not a basis for l 2 as S = span{E} = l 2 . Further, S ⊥ = {0}. Thus, (S ⊥ )⊥ = l 2 ⊆ S
and l 2 = S ⊕ S ⊥ .
5.3
Adjoints of Linear Operators on Inner Product Spaces
Let V be a finite dimensional (real or complex) inner product space and let T be a linear operator on V .
Definitions:
A linear operator T ∗ on V is called the adjoint of T if T (u), v = u, T ∗ (v) for all u, v ∈ V .
The linear operator T is self-adjoint, or Hermitian, if T = T ∗ ; T is unitary if T ∗ T = I V .
5-6
Handbook of Linear Algebra
Facts:
The following facts can be found in [HK71].
1. Let f be a linear functional on V . Then there exists a unique v ∈ V such that f (w) = w, v for
all w ∈ V .
2. The adjoint T ∗ of T exists and is unique.
3. Let B = (u1 , u2 , . . . , un ) be an ordered, orthonormal basis of V . Let A = [T ]B . Then
ai j = T (uj ), ui ,
i, j = 1, 2, . . . , n.
Moreover, [T ∗ ]B = A∗ , the Hermitian adjoint of A.
4. (Properties of the adjoint operator)
(a) (T ∗ )∗ = T for every linear operator T on V .
(b) (aT )∗ = a¯ T ∗ for every linear operator T on V and every a ∈ F .
(c) (T + T1 )∗ = T ∗ + T1 ∗ for every linear operators T, T1 on V .
(d) (T T1 )∗ = T1 ∗ T ∗ for every linear operators T, T1 on V .
5. Let B be an ordered orthonormal basis of V and let A = [T ]B . Then
(a) T is self-adjoint if and only if A is a Hermitian matrix.
(b) T is unitary if and only if A is a unitary matrix.
Examples:
1. Consider the space R3 equipped with the standard inner product and let f (w) = 3w 1 − 2w 3 .
Then with v = [3, 0, −2]T , f (w) = w, v .
⎡ ⎤
x
⎢ ⎥
2. Consider the space R3 equipped with the standard inner product. Let v = ⎣ y ⎦ and T (v) =
z
⎡
⎤
⎡
2x + y
2
⎢
⎥
⎢
⎣ y − 3z ⎦. Then [T ] = ⎣0
x+y+z
1
1
1
1
⎤
⎡
0
2
⎥
⎢
−3⎦ , so [T ]∗ = ⎣1
1
0
0
1
−3
⎤
⎡
⎤
1
2x + z
⎥
⎢
⎥
1⎦, and T ∗ (v) = ⎣ x + y + z ⎦.
1
−3y + z
3. Consider the space Cn×n equipped with the inner product in Example 4 of section 5.1. Let A, B ∈
Cn×n and let T be the linear operator on Cn×n defined by T (X) = AX + X B, X ∈ Cn×n . Then
T ∗ (X) = A∗ X + X B ∗ , X ∈ Cn×n .
4. Let V be an inner product space and let T be a linear operator on V . For a fixed u ∈ V , f (w) =
T (w), u is a linear functional. By Fact 1, there is a unique vector v such that f (w) = w, v . Then
T ∗ (u) = v.
5.4
Orthogonal Projection
Definitions:
Let S be a finite-dimensional subspace of an inner product space V . Then according to Fact 4 in Section 5.2,
each v ∈ V can be written uniquely as v = s + t, where s ∈ S and t ∈ S ⊥ . The vector s is called the
orthogonal projection of v onto S and is often written as Proj S v, where the linear operator Proj S is called
the orthogonal projection onto S along S ⊥ . When V = Cn or V = Rn with the standard inner product,
the linear operator Proj S is often identified with its standard matrix [Proj S ] and Proj S is used to denote
both the operator and the matrix.
Inner Product Spaces, Orthogonal Projection, Least Squares
5-7
Facts:
1. An orthogonal projection is a projection (as defined in Section 3.6).
2. [Mey00, p. 433] Suppose that P is a projection. The following statements are equivalent:
r P is an orthogonal projection.
r P∗ = P.
r range(P ) ⊥ ker(P ).
3. [Mey00, p. 430] If S is a subspace of a finite dimensional inner product space V , then
Proj S ⊥ = I − Proj S .
4. [Mey00, p. 430] Let S be a p–dimensional subspace of the standard inner product space Cn , and
let the columns of matrices M ∈ Cn× p and N ∈ Cn×(n− p) be bases for S and S ⊥ , respectively. Then
the orthogonal projections onto S and S ⊥ are
Proj S = M(M ∗ M)−1 M ∗
and Proj S ⊥ = N(N ∗ N)−1 N ∗ .
If M and N contain orthonormal bases for S and S ⊥ , then Proj S = M M ∗ and Proj S ⊥ = N N ∗ .
5. [Lay03, p. 399] If {u1 , . . . , u p } is an orthonormal basis for a subspace S of Cn , then for any v ∈ Cn ,
Proj S v = (u∗1 v)u1 + · · · + (u∗p v)u p .
6. [TB97, p. 46] Let v ∈ Cn be a nonzero vector. Then
∗
r Proj = vv is the orthogonal projection onto the line L = span{v}.
v
∗
v v
∗
r Proj = I − vv is the orthogonal projection onto L ⊥ .
⊥v
v∗ v
7. [Mey00, p. 435] (The Best Approximation Theorem) Let S be a finite dimensional subspace of an
inner product space V and let b be a vector in V . Then Proj S b is the unique vector in S that is
closest to b in the sense that
min b − s = b − Proj S b .
s∈S
The vector Proj S b is called the best approximation to b by the elements of S.
Examples:
1. Generally, an orthogonal projection P ∈ Cn×n is not a unitary matrix.
2. Let {v1 , v2 , · · · , vn } be an orthogonal basis for Rn and let S the subspace of Rn spanned by
{v1 , · · · , vk }, where 1 ≤ k ≤ n − 1. Then w = c 1 v1 + c 2 v2 + · · · + c n vn ∈ Rn can be written as w = s + t, where s = c 1 v1 + · · · + c k vk ∈ S and t = c k+1 vk+1 + · · · + c n vn ∈ S ⊥ .
3. Let u1 = [2/3, 2/3, 1/3]T , u2 = [1/3, −2/3, 2/3]T , and x = [2, 3, 5]T . Then {u1 , u2 } is an orthonormal basis for the subspace S = span {u1 , u2 } of R3 .
r The orthogonal projection of x onto S is
Proj S x = u1T xu1 + u2T x u2 = [4, 2, 3]T .
r The orthogonal projection of x onto S ⊥ is y = x − Proj x = [−2, 1, 2]T .
S
r The vector in S that is closest to x is Proj x = [4, 2, 3]T .
S
5-8
Handbook of Linear Algebra
r Let M = [u , u ]. Then the orthogonal projection onto S is
1 2
⎡
5
1⎢
Proj S = M M T = ⎣2
9
4
⎤
2
4
⎥
8 −2⎦ .
−2
5
r The orthogonal projection of any v ∈ R3 onto S can be computed by Proj v = M M T v. In
S
particular, M M T x = [4, 2, 3]T .
4. Let w1 = [1, 1, 0]T and w2 = [1, 0, 1]T . Consider the subspace W = span{w1 , w2 } of R3 . Define
⎡
⎤
1
⎢
the matrix M = [w1 , w2 ] = ⎣1
0
1
2
⎥
0⎦. Then M T M =
1
1
1
.
2
r The orthogonal projection onto W is Proj = M(M T M)−1 M T =
W
⎡
1
⎢
⎣1
0
⎤
1
⎥ 2
0⎦
1
1
1
2
−1
⎡
1
1
1
0
2
1⎢
0
= ⎣1
1
3
1
1
2
−1
⎤
1
⎥
−1⎦ .
2
r The orthogonal projection of any v ∈ R3 onto W can be computed by Proj v. For v = [1, 2, 3]T ,
W
ProjW v = ProjW [1, 2, 3]T = [7/3, 2/3, 5/3]T .
5.5
Gram−Schmidt Orthogonalization and QR Factorization
Definitions:
Let {a1 , a2 , . . . , an } be a basis for a subspace S of an inner product space V . An orthonormal basis
{u1 , u2 , . . . , un } for S can be constructed using the following Gram–Schmidt orthogonalization process:
u1 =
a1
a1
and uk =
ak −
ak −
k−1
i =1
k−1
i =1
ak , ui ui
ak , ui ui
,
for k = 2, . . . , n.
ˆ where Qˆ ∈ Cm×n has
A reduced QR factorization of A ∈ Cm×n (m ≥ n) is a factorization A = Qˆ R,
n×n
is an upper triangular matrix.
orthonormal columns and Rˆ ∈ C
A QR factorization of A ∈ Cm×n (m ≥ n) is a factorization A = Q R, where Q ∈ Cm×m is a unitary
matrix and R ∈ Cm×n is an upper triangular matrix with the last m − n rows of R being zero.
Facts:
1. [TB97, p. 51] Each A ∈ Cm×n (m ≥ n) has a full Q R factorization A = Q R. If A ∈ Rm×n , then
both Q and R may be taken to be real.
ˆ If A ∈ Rm×n ,
2. [TB97, p. 52] Each A ∈ Cm×n (m ≥ n) has a reduced Q R factorization A = Qˆ R.
then both Qˆ and Rˆ may be taken to be real.
ˆ
3. [TB97, p. 52] Each A ∈ Cm×n (m ≥ n) of full rank has a unique reduced Q R factorization A = Qˆ R,
where Qˆ ∈ Cm×n and Rˆ ∈ Cn×n with real r ii > 0.
4. [TB97, p. 48] The orthonormal basis {u1 , u2 , . . . , un } generated via the Gram–Schmidt orthogonalization process has the property
Span({u1 , u2 , . . . , uk }) = Span({a1 , a2 , . . . , ak }),
for k = 1, 2, . . . , n.
Inner Product Spaces, Orthogonal Projection, Least Squares
5-9
5. [TB97, p. 51]
Algorithm 1: Classical Gram–Schmidt Orthogonalization:
input: a basis {a1 , a2 , . . . , an } for a subspace S
output: an orthonormal basis {u1 , u2 , . . . , un } for S
for j = 1 : n
u j := a j
for i = 1 : j − 1
r i j := a j , ui
u j := u j − r i j ui
end
r j j := u j
u j := u j /r j j
end
6. [TB97, p. 58]
Algorithm 2: Modified Gram–Schmidt Orthogonalization
input: a basis {a1 , a2 , . . . , an } for a subspace S
output: an orthonormal basis {u1 , u2 , . . . , un } for S
wi := ai , i = 1 : n
for i = 1 : n
r ii := wi
ui := wi /r ii
for j = i + 1 : n
r i j := w j , ui
w j := w j − r i j ui
end
end
7. [Mey00, p. 315] If exact arithmetic is used, then Algorithms 1 and 2 generate the same orthonormal basis {u1 , u2 , . . . , un } and the same r i j , for j ≥ i .
8. [GV96, pp. 230–232] If A = [a1 , a2 , . . . , an ] ∈ Cm×n (m ≥ n) is of full rank n, then the classic
ˆ with Qˆ =
or modified Gram–Schmidt process leads to a reduced QR factorization A = Qˆ R,
[u1 , u2 , . . . , un ] and Rˆ i j = r i j , for j ≥ i , and Rˆ i j = 0, for j < i .
9. [GV96, p. 232] The costs of Algorithm 1 and Algorithm 2 are both 2mn2 flops when applied to
compute a reduced QR factorization of a matrix A ∈ Rm×n .
10. [Mey00, p. 317 and p. 349] For the QR factorization, Algorithm 1 and Algorithm 2 are not numerically stable. However, Algorithm 2 often yields better numerical results than Algorithm 1.
11. [Mey00, p. 349] Algorithm 2 is numerically stable when it is used to solve least squares problems.
12. (Numerically stable algorithms for computing the QR factorization using Householder reflections
and Givens rotations are given in Chapter 38.)
13. [TB97, p. 54] (See also Chapter 38.) If A = Q R is a QR factorization of the rank n matrix A ∈ Cn×n ,
then the linear system Ax = b can be solved as follows:
r Compute the factorization A = Q R.
r Compute the vector c = Q ∗ b.
r Solve Rx = c by performing back substitution.
5-10
Handbook of Linear Algebra
Examples:
⎡
⎤
1
⎢
1. Consider the matrix A = ⎣2
0
2
⎥
0 ⎦.
2
r A has a (full) QR factorization A = Q R:
⎡
1
2
⎢
⎢2
⎣
⎤
⎡
⎥
0⎥
⎦=
0
√1
5
⎢ 2
⎢√
⎣ 5
2
0
4
√
3 5
− 3√2 5
√
5
3
− 23
⎤ ⎡√
⎢
1⎥
⎥⎢
3⎦⎣
2
3
0
⎤
√2
5
⎥
√6 ⎥ .
5⎦
0
0
5
r A has a reduced QR factorization A = Q
ˆ R:
ˆ
⎡
1
2
0
2
⎢
⎢2
⎣
⎤
⎡
⎥
0⎥
⎦=
⎡
√1
⎢ 25
⎢√
⎣ 5
0
⎤
4
⎡√
√
3 5
5
⎥
2
⎣
− 3√5 ⎥
⎦
0
√
5
3
⎤
√2
5⎦
.
√6
5
⎤
3
1 −2
⎢
1⎥
⎢3 −4
⎥
2. Consider the matrix A = ⎢
⎥. Using the classic or modified Gram–Schmidt process
⎣3 −4 −1⎦
3
1
0
gives the following reduced QR factorization:
⎡
⎤
⎡1
2
3
1 −2
1
⎢
⎥ ⎢
⎢
3
−4
1
⎢
⎥ ⎢2
⎢
⎥= 1
⎣3 −4 −1⎦ ⎢
⎣2
3
1
0
1
2
5.6
1
2
− 12
− 12
1
2
⎤
− 12 ⎡
1⎥ 6
⎥
2⎥⎢
0
1⎥⎣
−2⎦ 0
1
2
⎤
−3 −1
⎥
5 −1⎦ .
0
2
Singular Value Decomposition
Definitions:
A singular value decomposition (SVD) of a matrix A ∈ Cm×n is a factorization
A = U V ∗,
= diag(σ1 , σ2 , . . . , σ p ) ∈ Rm×n , p = min{m, n},
where σ1 ≥ σ2 ≥ . . . ≥ σ p ≥ 0 and both U = [u1 , u2 , . . . , um ] ∈ Cm×m and V = [v1 , v2 , . . . , vn ] ∈ Cn×n
are unitary. The diagonal entries of are called the singular values of A. The columns of U are called left
singular vectors of A and the columns of V are called right singular vectors of A.
Let A ∈ Cm×n with rank r ≤ p = min{m, n}. A reduced singular value decomposition (reduced SVD)
of A is a factorization
A = Uˆ ˆ Vˆ ∗ , ˆ = diag(σ1 , σ2 , . . . , σr ) ∈ Rr ×r ,
where σ1 ≥ σ2 ≥ . . . ≥ σr > 0 and the columns of Uˆ = [u1 , u2 , . . . , ur ] ∈ Cm×r and the columns of
Vˆ = [v1 , v2 , . . . , vr ] ∈ Cn×r are both orthonormal.
(See §8.4 and §3.7 for more information on singular value decomposition.)
5-11
Inner Product Spaces, Orthogonal Projection, Least Squares
Facts:
All the following facts except those with a specific reference can be found in [TB97, pp. 25–37].
1. Every A ∈ Cm×n has a singular value decomposition A = U V ∗ . If A ∈ Rm×n , then U and V
may be taken to be real.
2. The singular values of a matrix are uniquely determined.
3. If A ∈ Cm×n has a singular value decomposition A = U V ∗ , then
A∗ u j = σ j v j ,
Av j = σ j u j ,
for j = 1, 2, . . . , p = min{m, n}.
4. If U V ∗ is a singular value decomposition of A, then V
of A∗ .
5. If A ∈ Cm×n has r nonzero singular values, then
u∗j Av j = σ j ,
T
U ∗ is a singular value decomposition
r rank(A) = r .
r A=
r
σ j u j v∗j .
j =1
r ker(A) = span{v , . . . , v }.
r +1
n
r range(A) = span{u , . . . , u }.
1
r
6. Any A ∈ Cm×n of rank r ≤ p = min{m, n} has a reduced singular value decomposition,
A = Uˆ ˆ Vˆ ∗ , ˆ = diag(σ1 , σ2 , . . . , σr ) ∈ Rr ×r ,
7.
8.
9.
10.
where σ1 ≥ σ2 ≥ · · · ≥ σr > 0 and the columns of Uˆ = [u1 , u2 , . . . , ur ] ∈ Cm×r and the columns
of Vˆ = [v1 , v2 , . . . , vr ] ∈ Cn×r are both orthonormal. If A ∈ Rm×n , then Uˆ and Vˆ may be taken
to be real.
If rank(A) = r , then A has r nonzero singular values.
The nonzero singular values of A are the square roots of the nonzero eigenvalues of A∗ A or AA∗ .
[HJ85, p. 414] If U V ∗ is a singular value decomposition of A, then the columns of V are
eigenvectors of A∗ A; the columns of U are eigenvectors of AA∗ .
[HJ85, p. 418] Let A ∈ Cm×n and p = min{m, n}. Define
G=
0
A∗
A
∈ C(m+n)×(m+n) .
0
If the singular values of A are σ1 , . . . , σ p , then the eigenvalues of G are σ1 , . . . , σ p , −σ1 , . . . , −σ p
and additional |n − m| zeros.
11. If A ∈ Cn×n is Hermitian with eigenvalues λ1 , λ2 , · · · , λn , then the singular values of A are
|λ1 |, |λ2 |, · · · , |λn |.
12. For A ∈ Cn×n , |det A| = σ1 σ2 · · · σn .
13. [Aut15; Sch07] (Eckart–Young Low Rank Approximation Theorem)
Let A = U V ∗ be an SVD of A ∈ Cm×n and r = rank(A). For k < r , define Ak = kj =1 σ j u j v∗j .
Then
r
r
A − Ak
2
=
A − Ak
F
=
min
A− B
min
A− B
rank(B)≤k
2
= σk+1 ;
r
rank(B)≤k
F
=
σ j2 ,
j =k+1
m
where M
2
= max ||Mx||2 and M
||x||2 =1
F
n
=
mi2j are the 2-norm and Frobenius norm of
i =1 j =1
matrix M, respectively. (See Chapter 37 for more information on matrix norms.)
5-12
Handbook of Linear Algebra
Examples:
⎡
⎤
1
⎢
Consider the matrices A = ⎣2
0
2
1
⎥
0⎦ and B = AT =
2
2
1. The eigenvalues of AT A =
5
2
⎡
3. u1 =
1
3
Av1 =
⎡
and u2 =
⎡
u1 , u2 , and e1 produces u3 =
0
.
2
2
are 9 and 4. So, the singular values of A are 3 and 2.
8
2. Normalized eigenvectors for AT A are v1 =
√ ⎤
5
⎢ 23 ⎥
⎢ √ ⎥
⎣3 5⎦
4
√
3 5
2
0
1
2
Av2 =
⎤
√1
5
√2
5
0
and v2 =
⎤
⎢ √2 ⎥
⎣ 5 ⎦.
− √15
√2
5
− √15
.
Application of the Gram–Schmidt process to
2
⎢ 31 ⎥
⎢− ⎥ .
⎣ 3⎦
− 23
4. A has the singular value decomposition A = U V T , where
⎡
5
0
1 ⎢
U = √ ⎣2
6
3 5 4 −3
√ ⎤
2√5
⎥
−√5⎦ ,
−2 5
⎡
3
⎢
= ⎣0
0
⎤
0
1 1
⎥
2⎦ , V = √
5 2
0
2
.
−1
5. A has the reduced singular value decomposition A = Uˆ ˆ Vˆ T , where
⎡
5
1 ⎢
Uˆ = √ ⎣2
3 5 4
⎤
0
3
⎥
6⎦ , ˆ =
0
−3
6. B has the singular value decomposition B = U B
1 1
U B = VA = √
5 2
2
,
−1
B
=
3
0
0
2
1 1
0
, Vˆ = √
2
5 2
T
B VB ,
2
.
−1
where
⎡
5
1 ⎢
0
, VB = U A = √ ⎣2
0
3 5 4
0
6
−3
√ ⎤
2√5
⎥
−√5⎦ .
−2 5
(U A = U and VA = V for A were given in Example 4.)
5.7
Pseudo-Inverse
Definitions:
A Moore–Penrose pseudo-inverse of a matrix A ∈ Cm×n is a matrix A† ∈ Cn×m that satisfies the following
four Penrose conditions:
AA† A = A; A† AA† = A† ; (AA† )∗ = AA† ; (A† A)∗ = A† A.
Facts:
All the following facts except those with a specific reference can be found in [Gra83, pp. 105–141].
1. Every A ∈ Cm×n has a unique pseudo-inverse A† . If A ∈ Rm×n , then A† is real.