Thursday, January 14, 2010

Matrix multiplication

I'm probably going to be teaching an introductory linear algebra class this year, so I've spent some time trying to come up with an intuitive way to explain matrix multiplication. (Admittedly, half the reason I'm typing this up is to test out this awesome website for putting matrices into HTML!) Almost everyone with a high school education knows how to mechanically multiply two matrices together, but, for some reason, these mechanical instructions aren't always accompanied by an explanation of why you'd ever want to do this, or why it follows the weird rule that it does.

The cleanest, one-sentence explanation of matrix multiplication is that a matrix is just a special kind of function ("build a linear combination of the matrix's columns"), and multiplication of matrices is just the composition of functions: "build a linear combination of the first matrix's columns, and use this as the input to build a linear combination of the second matrix's columns."

Here is a simple example that may help illustrate this somewhat abstract concept. Suppose you are looking at some kind of iterative process. For example, coupled probabilities of something happening: let's say that, each year, there's a 25% chance of someone living in South Carolina (x) moving to California (y), but only a 1% chance of someone living in California moving to South Carolina. (For simplicity, assume these are the only two options.) In this scenario, you start out in one of the two states: if you're living in South Carolina, then x0 = 1 and y0 = 0. If you're living in California, x0 = 0 and y0 = 1. After 1 year passes, what are your chances of being in either state, x1 and y1? These are linear equations:

x1 = 0.75 x0 + 0.01 y0
y1 = 0.25 x0 + 0.99 y0

After 2 years:

x2 = 0.75 x1 + 0.01 y1 = 0.75 (0.75 x0 + 0.01 y0) + 0.01 (0.25 x0 + 0.99 y0)
y2 = 0.25 x1 + 0.99 y1 = 0.25 (0.75 x0 + 0.01 y0) + 0.99 (0.25 x0 + 0.99 y0)

etc. for as many years as you want. Clearly, after a few years, these equations are going to turn into a confusing mess. A way of cleaning this up is to write the coefficients as a matrix:


 
x1
y1
 
=
 
0.75 0.01
0.25 0.99
 
 
x0
y0
 

 
x2
y2
 
=
 
0.75 0.01
0.25 0.99
 
 
x1
y1
 
=
 
0.75 0.01
0.25 0.99
 
 
0.75 0.01
0.25 0.99
 
 
x0
y0
 
=
 
0.75 0.01
0.25 0.99
 
2
 
 
 
x0
y0
 

where side-by-side matrices indicate that they should be multiplied together following the standard rule of (element i,j of the resulting matrix) = (dot product of row i from the left matrix with column j from the right matrix). The matrix

 
0.75 0.01
0.25 0.99
 

is the "transition matrix" for this process. Matrix multiplication is an easy way of encoding an iterative, coupled process. This is one of the reasons the eigenvalue decomposition is so useful: even though it's easier to multiply together lots of these matrices then to work out the iterative process by hand, it's even easier if you can diagonalize the matrix! (The introductory linear algebra class will also cover eigenvalues. I may type up an explanation for that later, too...)

No comments:

Post a Comment