Introduction 
Matrix Action 
Perpframes, Aligners and Hangers 
Stretchers 
Coordinates 
Projections 
SVD 
Matrix Subspaces  
Linear Systems, Pseudo-Inverse 
Condition Number 
Matrix Norm, Rank One 
Data Compression 
Noise Filtering 
Todd Will
UW-La Crosse
 Two-thirds Theorem 
Good Basis Theorem

SVD

SVD 
Exercises

Singular Value Decomposition

The singular value decomposition for a matrix A writes A as a product (hanger)(stretcher)(aligner). 
 

It's an amazing and useful fact that every m x n matrix has a singular value decomposition. 

The following theorem goes two-thirds of the way to proving this fact: 


Two-thirds Theorem

For an [Graphics:svdgr1.gif] matrix [Graphics:svdgr2.gif] and any orthonormal basis [Graphics:svdgr3.gif] of [Graphics:svdgr4.gif]

define [Graphics:svdgr5.gif] 

and 

[Graphics:svdgr6.gif]

Then [Graphics:svdgr7.gif]

Proof: Using first the row way and then the column way to multiply a matrix times a point, you see that the right hand side of the equation sends [Graphics:svdgr8.gif] to [Graphics:svdgr9.gif]

Thus the two sides of the equation agree on the basis [Graphics:svdgr10.gif] and so must be equal. 


The two-thirds theorem gets you two-thirds of the way to the SVD. 

It says that given any orthonormal basis [Graphics:svdgr11.gif] of [Graphics:svdgr12.gif] you can write 

[Graphics:svdgr13.gif] 
[Graphics:svdgr14.gif]
 

So you've got the stretcher and the aligner -- if [Graphics:svdgr15.gif] were a hanger matrix then this would be a Singular Value Decomposition for A

For [Graphics:svdgr16.gif] to be a hanger matrix requires that the columns [Graphics:svdgr17.gif] be pairwise perpendicular. 

So one challenge to finding an SVD for A is to find an orthonormal basis of [Graphics:svdgr18.gif][Graphics:svdgr19.gif] so that for all [Graphics:svdgr20.gif][Graphics:svdgr21.gif]


Theorem: If A is an m x n  matrix, then there is an orthonormal basis of [Graphics:svdgr22.gif][Graphics:svdgr23.gif] so that for all [Graphics:svdgr24.gif][Graphics:svdgr25.gif].

Proof 1: For [Graphics:svdgr26.gif] matrices

You can get 2D perpframes using [Graphics:svdgr27.gif] and specifying an angle [Graphics:svdgr28.gif]

You'd like to show that for any m x 2 matrix there's an angle t so that 
[Graphics:svdgr29.gif] is zero. 

How do you know there is always such a [Graphics:svdgr30.gif]

Answer:  Go with the example matrix [Graphics:svdgr31.gif] and look at the following plot that shows the perpframe [Graphics:svdgr32.gif] and [Graphics:svdgr33.gif] before and after the hit with A. 

Notice that when [Graphics:svdgr34.gif], the angle between [Graphics:svdgr35.gif]and [Graphics:svdgr36.gif] is greater than [Graphics:svdgr37.gif]

By the time [Graphics:svdgr38.gif], the angle is less than [Graphics:svdgr39.gif]

That's enough to guarantee that somewhere between [Graphics:svdgr40.gif] and [Graphics:svdgr41.gif] there's an angle where [Graphics:svdgr42.gif]and [Graphics:svdgr43.gif]are perpendicular. 

Most reasonable folks would accept the evidence given in the plots, but the doubters might want to look at the function: 

[Graphics:svdgr44.gif] 
 

Then [Graphics:svdgr45.gif] 

(1) The definition of f. 

(2) Linearity of matrix hits. 

(3) The definition of f. 
 

This equation says that [Graphics:svdgr46.gif]and [Graphics:svdgr47.gif]have opposite signs. This should convince even the doubters that there's a [Graphics:svdgr48.gif] between [Graphics:svdgr49.gif] and [Graphics:svdgr50.gif] that makes [Graphics:svdgr51.gif] zero. 

The same argument holds for any m by 2 matrix. 


Proof 2: General case

The proof above works well enough for m x 2 matrices, but this proof shows that for all m x n matrices there's is a perpendicular frame [Graphics:svdgr52.gif] of unit vectors so that [Graphics:svdgr53.gif] for [Graphics:svdgr54.gif]

Here's one way to get such a perpframe. 

Start with a m x n  matrix A

Step 1: Let [Graphics:svdgr55.gif] be a unit vector maximizing [Graphics:svdgr56.gif]

Step 2: Let [Graphics:svdgr57.gif] be the space perpendicular to [Graphics:svdgr58.gif] 
Let [Graphics:svdgr59.gif] be a unit vector in [Graphics:svdgr60.gif] maximizing [Graphics:svdgr61.gif]

Step 3: Let [Graphics:svdgr62.gif] be the space perpendicular to [Graphics:svdgr63.gif] 
Let [Graphics:svdgr64.gif] be a unit vector in [Graphics:svdgr65.gif] maximizing [Graphics:svdgr66.gif]
 . 
 . 
 . 
Step n: Let [Graphics:svdgr67.gif] be the space perpendicular to [Graphics:svdgr68.gif] 
Let [Graphics:svdgr69.gif] be a unit vector in [Graphics:svdgr70.gif] maximizing [Graphics:svdgr71.gif]
 
 

This gets you a perpendicular frame of unit vectors [Graphics:svdgr72.gif], but how do you know that [Graphics:svdgr73.gif] for [Graphics:svdgr74.gif]
 

Look at [Graphics:svdgr75.gif] and see that 
 

  • no matter what [Graphics:svdgr76.gif] you choose [Graphics:svdgr77.gif] is a unit vector in [Graphics:svdgr78.gif] 
  • [Graphics:svdgr79.gif]
 

Since [Graphics:svdgr80.gif] is a unit vector in [Graphics:svdgr81.gif] maximizing [Graphics:svdgr82.gif] and [Graphics:svdgr83.gif] is in [Graphics:svdgr84.gif] for all [Graphics:svdgr85.gif], you know that [Graphics:svdgr86.gif] has a maximum at [Graphics:svdgr87.gif]
 

This tells you [Graphics:svdgr88.gif]
 

Now compute: 
[Graphics:svdgr89.gif] 
 

[Graphics:svdgr90.gif] 
 

[Graphics:svdgr91.gif] 
 

[Graphics:svdgr92.gif] 
 
 

When you remember [Graphics:svdgr93.gif][Graphics:svdgr94.gif], and [Graphics:svdgr95.gif] are just numbers, you understand that its nothing more than tedious to compute: 

[Graphics:svdgr96.gif] 
 

Plugging in [Graphics:svdgr97.gif] gives you [Graphics:svdgr98.gif]
 

But you already know that [Graphics:svdgr99.gif], so after canceling the [Graphics:svdgr100.gif] you get [Graphics:svdgr101.gif] which is just what you wanted. 


Proof 3: Based on the spectral theorem

This proof is slick IF YOU'VE ALREADY SEEN THE SPECTRAL THEOREM. 

If you haven't seen the spectral theorem, then skip this proof. 

Given [Graphics:svdgr102.gif] and an orthonormal basis [Graphics:svdgr103.gif] of [Graphics:svdgr104.gif]

[Graphics:svdgr105.gif] 

iff [Graphics:svdgr106.gif] 

iff [Graphics:svdgr107.gif] 

iff [Graphics:svdgr108.gif] are all eigenvectors of [Graphics:svdgr109.gif]

Conclusion: The desired basis is guaranteed by spectral theorem since [Graphics:svdgr109.gif] is symmetric. 


Theorem: Every matrix has a singular value decomposition. 

The theorem above almost gives you the SVD for any matrix. 

The only problem is that although the columns of the "hanger" matrix are pairwise perpendicular, they might not form a basis for [Graphics:svdgr110.gif]

For example, suppose for a 5x4 matrix [Graphics:svdgr111.gif] the procedure outlined above gives you: 

[Graphics:svdgr112.gif]
 

To complete the decomposition, let [Graphics:svdgr113.gif] be an orthonormal basis for the three dimensional subspace of [Graphics:svdgr114.gif] perpendicular to [Graphics:svdgr115.gif]
 

Then write 

[Graphics:svdgr116.gif] 
[Graphics:svdgr117.gif] 
 
(1) The two sides agree on the basis [Graphics:svdgr118.gif]
 
 

This, finally, is a singular value decomposition for A.  



 

Comments 
 

  • The diagonal entries of the stretcher matrix are called the "singular values of A". 

  •  
  • An extra row of zeros has been added to the stretcher matrix to produce the dimensions required for the multiplication. If A is m x n with [Graphics:svdgr119.gif], then rows will be deleted. 
  • In either case, the dimensions of the stretcher matrix will always match the dimensions of A
     

  • The decomposition shows that the action of every matrix can be described as a rotation followed by a stretch followed by another rotation. 

  •  
  • The proofs above are meant to show that every matrix has an SVD.  You can compute SVD's for mx2 matrices by hand, but you should use a machine to compute SVD's for bigger matrices.
 


Exercises

1. Above, you saw that if A is a [Graphics:svdgr120.gif] matrix [Graphics:svdgr121.gif] then the SVD for A has the form 
[Graphics:svdgr122.gif]

Give the form for the SVD of a 4x5 matrix [Graphics:svdgr123.gif]

(Hint: The matrix has five singular values, but [Graphics:svdgr124.gif] does not appear in the decomposition.) 
 


2. Let [Graphics:svdgr125.gif]. Following the steps of Proof 1 above, define [Graphics:svdgr126.gif] and [Graphics:svdgr127.gif]

a. Expand [Graphics:svdgr128.gif] 

b. After applying trig identities it turns out that [Graphics:svdgr129.gif]

Use this fact to find a t so that [Graphics:svdgr130.gif] 

c. Using part (b) as a start, give an SVD for A
 


© 1999 Todd Will
Last Modified: 22-Feb-1999
Hit Counter