Friday, April 20, 2007

Gauss-Jordan Elimination

Gauss-Jordan Elimination is a method for solving a linear system of equations. The method is named after Carl Friedrich Gauss and Wilhelm Jordan.

The content in today's blog is taken from Linear Algebra with Applications by Gareth Williams.

Consider the following system of linear equations:

x1 - x2 + 2x3 = 3

2x1 - 2x2 + 5x3 = 4

x1 + 2x2 - x3 = -3

2x2 + 2x3 = 1

To use Gauss-Jordan Elimination, we start by representing the given system of linear equations as an augmented matrix.

Definition 1: Augmented Matrix Form

An augmented matrix is a matrix representation of a system of linear equations in which each row consists of the coefficients of one equation followed by that equation's constant term (its right-hand side).

Example 1:

The following system of linear equations:

x1 - x2 + 2x3 = 3

2x1 - 2x2 + 5x3 = 4

x1 + 2x2 - x3 = -3

2x2 + 2x3 = 1

would be represented as:

[ 1 -1  2 |  3 ]
[ 2 -2  5 |  4 ]
[ 1  2 -1 | -3 ]
[ 0  2  2 |  1 ]

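If you want to experiment with these ideas on a computer, here is a quick sketch (my own illustration, in Python with numpy) of how the augmented matrix from Example 1 might be built:

import numpy as np

# Coefficients of the four equations in Example 1.
A = np.array([[1, -1,  2],
              [2, -2,  5],
              [1,  2, -1],
              [0,  2,  2]], dtype=float)

# The results (right-hand sides) of the four equations.
b = np.array([3, 4, -3, 1], dtype=float)

# The augmented matrix is the coefficient matrix with the results appended as a final column.
augmented = np.hstack([A, b.reshape(-1, 1)])
print(augmented)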
The goal of Gauss-Jordan elimination is to transform the matrix into reduced echelon form.

Definition 2: Reduced Echelon Form

A matrix is said to be in reduced echelon form if:

(1) Any rows consisting entirely of zeros are grouped at the bottom of the matrix.

(2) The first nonzero element of any row is 1. This element is called a leading 1.

(3) The leading 1 of each row after the first is positioned to the right of the leading 1 of the previous row.

(4) If a column contains a leading 1, then all other elements in that column are 0.

Here are some examples of matrices that are not in reduced echelon form.

The matrix below violates (1). The rows consisting entirely of 0's are not grouped at the bottom.



The matrix below violates (2). In row 2, the first nonzero element is not a 1.



The matrix below violates (3). In row 3, the leading 1 is not to the right of the leading 1 in row 2.



The matrix below violates (4). The column associated with the leading 1 in row 2 contains a nonzero value.



Below are some examples of matrices that are in reduced echelon form:







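The four conditions above can also be checked mechanically. Here is a small sketch in Python with numpy (the function name is_reduced_echelon is my own) that tests each condition in turn:

import numpy as np

def is_reduced_echelon(M, tol=1e-12):
    """Return True if M satisfies conditions (1) through (4)."""
    M = np.asarray(M, dtype=float)
    last_lead = -1          # column of the previous row's leading 1
    seen_zero_row = False   # whether a zero row has already appeared
    for row in M:
        nonzero = np.nonzero(np.abs(row) > tol)[0]
        if len(nonzero) == 0:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False    # (1) a nonzero row sits below a zero row
        lead = nonzero[0]
        if abs(row[lead] - 1.0) > tol:
            return False    # (2) the first nonzero element is not 1
        if lead <= last_lead:
            return False    # (3) the leading 1 did not move to the right
        if np.sum(np.abs(M[:, lead]) > tol) != 1:
            return False    # (4) the leading 1's column has another nonzero element
        last_lead = lead
    return True

print(is_reduced_echelon([[1, 0, 2], [0, 1, -1], [0, 0, 0]]))   # True
print(is_reduced_echelon([[1, 0, 2], [0, 3, -1], [0, 0, 0]]))   # False: violates (2)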
Next, we need to use elementary row operations to transform the matrix into reduced echelon form. The elementary row operations are operations for which the resulting matrix represents the same system of linear equations as the original. In other words, the two matrices are row equivalent.

Definition 3: Row Equivalence (Equivalent Systems)

Two matrices are row equivalent if they represent the same system of linear equations. They are also called equivalent systems.

The following three elementary operations preserve row equivalence. That is, the resulting matrix is row equivalent to the matrix before the operation.

(1) Interchanging two rows
(2) Multiplying the elements of a row by a nonzero constant
(3) Adding the elements of one row to the corresponding elements of another row.

(1) is clear: changing the order of the linear equations does not change the values that solve them. (2) is equivalent to multiplying both sides of an equation by the same nonzero value. (3) reflects the fact that if two linear equations hold, then their sum also holds. In actual practice, we will combine (2) and (3) to get: (4) Add a multiple of the elements of one row to the corresponding elements of another row.

So, the following elementary operations result in a matrix that is row equivalent to the previous matrix:

Definition 4: Elementary Operations

(1) Row Exchange: The exchange of two rows.
(2) Row Scaling: Multiply the elements of a row by a nonzero constant.
(3) Row Replacement: Add a multiple of the elements of one row to the corresponding elements of another row.
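Each of these operations is a one-line update on the rows of a matrix. Here is a quick sketch in Python with numpy (my own illustration; note that numpy rows are indexed from 0, while the worked example later in this post labels them R1, R2, R3):

import numpy as np

# The augmented matrix from Example 1.
M = np.array([[1., -1.,  2.,  3.],
              [2., -2.,  5.,  4.],
              [1.,  2., -1., -3.],
              [0.,  2.,  2.,  1.]])

# (1) Row Exchange: swap the first two rows.
M[[0, 1]] = M[[1, 0]]

# (2) Row Scaling: multiply the first row by the nonzero constant 1/2.
M[0] = 0.5 * M[0]

# (3) Row Replacement: add 3 times the first row to the third row.
M[2] = M[2] + 3 * M[0]

print(M)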

Once the matrix is in reduced echelon form, we can convert the matrix back into a linear system of equations to see the solution.

Here's the full algorithm:

Definition 5: Gauss-Jordan Elimination

(1) Represent the linear system of equations as a matrix in augmented matrix form

(2) Use elementary row operations to derive a matrix in reduced echelon form

(3) Write the system of linear equations corresponding to the reduced echelon form.

Algorithm 1: Gauss-Jordan Elimination Algorithm

(1) Let A = an m x n matrix with m rows and n columns.

(2) Let k = 1, Ak = A, i=1

(3) If Ak is all zeros, then we are done and A is in reduced echelon form.

(4) Let Cv be the first column of Ak that is not all zeros and has v ≥ i.

(5) Let au,v be the first nonzero element in Cv such that u ≥ k.

(6) Let Ru be the row in A where au,v is found in Cv.

(7) Ru = Ru*(1/au,v) [After this, Ru has a leading 1, see proof below for explanation]

(8) If u ≠ k, then exchange Ru and Rk.

(9) For each row Rw in A, if w ≠ k, then Rw = Rw + (-aw,v)*Rk [After this, the only nonzero element in Cv is found at ak,v, see proof below for explanation if needed]

(10) If k = m, then we are done and A is in reduced echelon form.

(11) Otherwise, we designate a partition of Ak into its top row and a submatrix Ak+1. [Note: In other words, Ak+1 = all but the top row of Ak]

(12) Let k = k + 1; i = v; and go to step #3

End
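Here is a sketch of the algorithm above in Python with numpy. It follows the same outline (find the next usable column, scale to create a leading 1, exchange it into position, and clear the rest of the column), but it is written in the usual iterative style rather than with the explicit Ak partitions, so treat it as an illustration rather than a line-by-line transcription:

import numpy as np

def gauss_jordan(M, tol=1e-12):
    """Return a reduced echelon form of M using elementary row operations."""
    M = np.array(M, dtype=float)
    rows, cols = M.shape
    k = 0                                        # index of the next pivot row
    for v in range(cols):                        # scan the columns from left to right
        # find the rows at or below k with a nonzero entry in column v
        candidates = np.nonzero(np.abs(M[k:, v]) > tol)[0]
        if len(candidates) == 0:
            continue                             # column v is all zeros from row k down
        u = k + candidates[0]
        M[u] = M[u] / M[u, v]                    # Row Scaling: create a leading 1
        if u != k:
            M[[u, k]] = M[[k, u]]                # Row Exchange: move the leading 1 up to row k
        for w in range(rows):                    # Row Replacement: clear the rest of column v
            if w != k:
                M[w] = M[w] - M[w, v] * M[k]
        k += 1
        if k == rows:
            break                                # every row now has a leading 1
    return M

To solve a system, pass in its augmented matrix; the worked example below does exactly this by hand.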

Here's an example:

Example 2: Using Gauss-Jordan Elimination

(1) We start with a system of linear equations:

x1 - 2x2 + 4x3 = 12

2x1 - x2 + 5x3 = 18

-x1 + 3x2 - 3x3 = -8

(2) We represent them in augmented matrix form:

[  1 -2  4 | 12 ]
[  2 -1  5 | 18 ]
[ -1  3 -3 | -8 ]

(3) We use elementary row operations to put this matrix into reduced echelon form (I will use R1, R2, R3 for Row 1, Row 2, and Row 3):

(a) Let R2 = R2 + (-2)R1 [Operation #3]

[  1 -2  4 | 12 ]
[  0  3 -3 | -6 ]
[ -1  3 -3 | -8 ]

(b) Let R3 = R3 + R1 [Operation #3]

[  1 -2  4 | 12 ]
[  0  3 -3 | -6 ]
[  0  1  1 |  4 ]

(c) Let R2 = (1/3)R2 [Operation #2]

[  1 -2  4 | 12 ]
[  0  1 -1 | -2 ]
[  0  1  1 |  4 ]

(d) Let R1 = R1 + 2*R2 [Operation #3]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  1  1 |  4 ]

(e) Let R3 = R3 + (-1)*R2 [Operation #3]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  0  2 |  6 ]

(f) Let R3 = (1/2)R3 [Operation #2]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  0  1 |  3 ]

(g) Let R1 = R1 + (-2)*R3 [Operation #3]

[  1  0  0 |  2 ]
[  0  1 -1 | -2 ]
[  0  0  1 |  3 ]

(h) Let R2 = R2 + R3 [Operation #3]

[  1  0  0 |  2 ]
[  0  1  0 |  1 ]
[  0  0  1 |  3 ]

(4) We now write down the system of linear equations corresponding to the reduced echelon form:

x1 = 2

x2 = 1

x3 = 3
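As a sanity check, the row operations (a) through (h) can be replayed in Python with numpy (my own illustration) and compared against numpy's built-in solver:

import numpy as np

# The augmented matrix from step (2).
M = np.array([[ 1., -2.,  4., 12.],
              [ 2., -1.,  5., 18.],
              [-1.,  3., -3., -8.]])
A, b = M[:, :3].copy(), M[:, 3].copy()   # keep the original system for a cross-check

M[1] = M[1] - 2 * M[0]   # (a) R2 = R2 + (-2)R1
M[2] = M[2] + M[0]       # (b) R3 = R3 + R1
M[1] = M[1] / 3          # (c) R2 = (1/3)R2
M[0] = M[0] + 2 * M[1]   # (d) R1 = R1 + 2*R2
M[2] = M[2] - M[1]       # (e) R3 = R3 + (-1)*R2
M[2] = M[2] / 2          # (f) R3 = (1/2)R3
M[0] = M[0] - 2 * M[2]   # (g) R1 = R1 + (-2)*R3
M[1] = M[1] + M[2]       # (h) R2 = R2 + R3

print(M)                        # the identity matrix augmented with the solution 2, 1, 3
print(np.linalg.solve(A, b))    # numpy's solver agrees: [2. 1. 3.]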

Of course, we need to be sure that we can always get to reduced echelon form. Here is the theorem that guarantees this:

Theorem 1: Every matrix is row equivalent to a reduced echelon form matrix

Proof:

(1) The Gauss-Jordan Elimination Algorithm works for any 1 x n matrix.

(a) Let A = 1 x n matrix

(b) Assume that A is not all zeros (if it were, then A is already in reduced echelon form)

(c) Let a1,i be the first nonzero element in A.

(d) Let A = (1/a1,i)*A

(e) A now has a leading 1 and, since this is a one-row matrix, A is now in reduced echelon form.

(2) Assume that the Gauss-Jordan Elimination Algorithm works for any matrix up to k rows.

(3) We can now show that it will work for a matrix with k+1 rows.

(a) Let A = a nonzero matrix with k+1 rows [We can assume it is nonzero since if it were zero, it would already be in reduced echelon form]

(b) Assume that we've run the Gauss-Jordan Elimination Algorithm up to the kth row, and define B to be the submatrix consisting of the first k rows of A, so that A consists of B stacked on top of the single remaining row Ak+1.

(c) We can see that B is in reduced echelon form since

(i) B has k rows

(ii) We can assume by our assumption in step #2 that the algorithm works for any matrix of k rows or less

(iii) The algorithm leaves B unchanged if we run it again.

(d) We can also see that Ak+1 consists of a single row which is zero in every column where there is a leading 1 in B. [Since these entries are zeroed out by the algorithm above]

(e) If there is a nonzero column in Ak+1, it must therefore be to the right of all the leading 1s in B.

(f) Assume that there is a nonzero column in Ak+1 [If there were not, we would be done, since then Ak+1 would consist of all zeros and A would then be in reduced echelon form]

(g) In this case, none of the rows in B are all zeros. [In order for Ak+1 to be nonzero, each of the previous rows must have had at least one nonzero element. If one of them were all zeros, it would have been exchanged with Ak+1 by the algorithm, and then Ak+1 would be all zeros, which it is not]

(h) Let ak+1,x = the first nonzero element in Ak+1

(i) Let Ak+1 = (1/ak+1,x)Ak+1

(j) For all rows i, 1 through k, let Ri = Ri + (-ai,x)*Rk+1.

(k) Clearly, this zeroes out the element in column x of each of the other rows.

(l) A is now in reduced echelon form since:

(a) Any rows consisting entirely of zeros are grouped at the bottom of the matrix. [Note: there are none]

(b) The first nonzero element of any row is 1. [We assumed this was true of all rows in B and now we have shown it is also true of Rk+1]

(c) The leading 1 of each row after the first is positioned to the right of the leading 1 of the previous row. [From step #3e]

(d) If a column contains a leading 1, then all other elements in that column are 0. [From step #3j]

QED

References

Gareth Williams, Linear Algebra with Applications.

Tuesday, April 17, 2007

Properties of Matrix Multiplication

Multiplication can only occur between matrices A and B if the number of columns in A matches the number of rows in B. This leads to an important observation: the product AB may be perfectly well defined while the product BA is not defined at all.

Even when both AB and BA are defined, it does not follow that AB = BA. In other words, unlike multiplication of integers, matrix multiplication is not commutative.

Property 1: Associative Property of Multiplication

A(BC) = (AB)C

where A,B, and C are matrices of scalar values.

Proof:

(1) Let D = AB, G = BC
(2) Let F = (AB)C = DC
(3) Let H = A(BC) = AG
(4) Using Definition 1, we have for each of D, G, F, and H:

di,j = ∑k ai,k*bk,j

gi,j = ∑k bi,k*ck,j

fi,j = ∑k di,k*ck,j

hi,j = ∑k ai,k*gk,j

(5) So, expanding fi,j gives us:

fi,j = ∑k (∑l ai,l*bl,k)*ck,j

= ∑k ∑l ai,l*bl,k*ck,j

= ∑l ai,l*(∑k bl,k*ck,j)

= ∑l ai,l*gl,j = hi,j

QED
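The index manipulation above is easy to spot-check numerically. Here is a small sketch (my own, in Python with numpy) using randomly generated matrices whose shapes are chosen so that all the products are defined:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))

# A(BC) and (AB)C agree up to floating-point rounding.
print(np.allclose(A @ (B @ C), (A @ B) @ C))   # True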

Property 2: Distributive Property of Multiplication

A(B + C) = AB + AC
(A + B)C = AC + BC

where A,B,C are matrices of scalar values.

Proof:

(1) Let D = AB such that for each:

di,j = ∑k ai,k*bk,j

(2) Let E = AC such that for each:

ei,j = ∑k ai,k*ck,j

(3) Let F = D + E = AB + AC such that for each:

fi,j = ∑k [ai,k*bk,j + ai,k*ck,j] = ∑k ai,k*[bk,j + ck,j]

(4) Let G = B+C such that for each:

gi,j = bi,j + ci,j

(5) Let H = A(B+C) = AG such that for each:

hi,j = ∑k ai,k*gk,j

(6) Then we have AB + AC = A(B+C) since for each:

hi,j = ∑k ai,k*gk,j = ∑k ai,k*[bk,j + ck,j] = fi,j

(7) Let M = A + B such that for each:

mi,j = ai,j + bi,j

(8) Let N = (A+B)C = MC such that:

ni,j = ∑k mi,k*ck,j = ∑k (ai,k + bi,k)*ck,j

(9) Let O = BC such that:

oi,j = ∑k bi,k*ck,j

(10) Let P = AC + BC = E + O such that:

pi,j = ei,j + oi,j

= ∑k ai,k*ck,j + ∑k bi,k*ck,j

= ∑k [ai,k*ck,j + bi,k*ck,j]

= ∑k (ai,k + bi,k)*ck,j

which matches ni,j from step #8, so (A + B)C = AC + BC.

QED
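Both distributive laws can be spot-checked the same way; another small sketch in Python with numpy:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((4, 2))
D = rng.standard_normal((3, 4))

# A(B + C) = AB + AC (B and C must have the same dimensions).
print(np.allclose(A @ (B + C), A @ B + A @ C))    # True

# (A + D)B = AB + DB (A and D must have the same dimensions).
print(np.allclose((A + D) @ B, A @ B + D @ B))    # True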

Property 3: Scalar Multiplication

c(AB) = (cA)B = A(cB)

Proof:

(1) Let D = AB such that:

di,j = ∑k ai,k*bk,j

(2) Let E = c(AB) = cD such that for each:

ei,j = c*di,j = c*∑k ai,k*bk,j

(3) Let F = (cA)B such that:

fi,j = ∑k (c*ai,k)*bk,j = c*∑k ai,k*bk,j

(4) Let G = A(cB) such that:

gi,j = ∑k ai,k*(c*bk,j) = c*∑k ai,k*bk,j

QED
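And the same kind of spot check for the scalar property:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))
c = 4.5

print(np.allclose(c * (A @ B), (c * A) @ B))   # True
print(np.allclose(c * (A @ B), A @ (c * B)))   # True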

Property 4: Multiplication of Matrices is not Commutative

AB does not necessarily equal BA

Proof:

(1) Let A =



(2) Let B =



(3) AB =



(4) BA =



QED
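Almost any small example will demonstrate this. Here is one concrete pair, chosen for this sketch (not necessarily the matrices used in the book), where AB and BA differ:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)   # [[2 1]
               #  [4 3]]
print(B @ A)   # [[3 4]
               #  [1 2]]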

References

Monday, April 16, 2007

Matrix Addition

The content in today's blog is taken straight from Gareth Williams' Linear Algebra.

In terms of their properties, matrices present an interesting contrast to numbers. Whereas any two numbers can be added or subtracted, with matrices this is not always the case.

Indeed, one of the most important properties of matrices is that there are certain preconditions required before operations can be performed on two matrices. For example, addition can only occur between two matrices that have the same dimensions. It is not possible to add a 1 x 2 matrix with a 2 x 2 matrix.

Definition 1. Zero Matrix: Zm,n

A zero matrix is any matrix whose elements are all 0.

For example, below is Z2,3:

[ 0 0 0 ]
[ 0 0 0 ]

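In Python with numpy, a zero matrix of any dimensions can be produced directly; a one-line sketch:

import numpy as np

Z = np.zeros((2, 3))   # Z2,3: the 2 x 3 zero matrix
print(Z)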

Below are the basic properties of matrix addition.

Property 1: Commutative property of Addition

A + B = B + A

where A and B are matrices of the same dimension and consist of scalar values.

Proof:

(1) Let A = set of values ai,j where i is the row and j is the column.

(2) Let B = set of values bi,j

(3) A + B = the set of values ai,j + bi,j

(4) B + A = the set of values bi,j + ai,j

(5) Clearly, since all ai,j and bi,j are scalar, ai,j + bi,j = bi,j + ai,j

QED

Property 2: Associative property of Addition

A + (B + C) = (A + B) + C

where A, B, and C are matrices of the same dimension and consist of scalar values.

Proof:

(1) All A,B,C have the same dimensions since this is a prerequisite for addition.

(2) Let A = the set ai,j, B = the set bi,j, C = the set ci,j

(3) A + (B + C) = the set ai,j + (bi,j + ci,j)

(4) (A + B) + C = the set (ai,j + bi,j) + ci,j

(5) Since ai,j, bi,j, ci,j are all scalar, ai,j + (bi,j + ci,j) = (ai,j + bi,j) + ci,j

QED

Property 3: Addition by Zm,n

A + Zm,n = Zm,n + A = A

where A is an m x n matrix which consists of scalar values.

Proof:

(1) Let A be the set of ai,j

(2) Let Z be the set of zi,j where each zi,j=0 [See Definition 1 above]

(3) A + Zm,n = the set of ai,j + zi,j = the set of ai,j = A

(4) Zm,n + A = the set of zi,j + ai,j = the set of ai,j = A.

QED

Property 4: Distributive Property of Scalars with Addition of Matrices

c(A + B) = cA + cB

where A and B are matrices of the same dimension and consist of scalar values and c is a scalar value.

Proof:

(1) Let A = the set of ai,j, let B = the set of bi,j

(2) A + B = the set ai,j + bi,j

(3) c(A+B) = the set of c(ai,j + bi,j) = the set of cai,j + cbi,j.

(4) cA + cB = the set of cai,j + the set of cbi,j = the set of cai,j + cbi,j.

QED

Property 5: Distributive Property of Matrices with Addition of Scalars

(a + b)C = aC + bC

where a,b are scalar values and C is a matrix of scalar values.

Proof:

(1) Let C = the set of ci,j

(2) (a + b)C = the set of (a + b)ci,j = aci,j + bci,j

(3) aC + bC = the set of aci,j+ the set of bci,j = the set of aci,j + bci,j

QED

Property 6: Associative Property of Multiplication of Scalars with Matrices

(ab)C = a(bC)

where a,b are scalar values and C is a matrix of scalar values.

Proof:

(1) Let C be the set of ci,j

(2) (ab)C = the set of abci,j

(3) a(bC) = a*(the set of bci,j) = the set of abci,j

QED
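All six properties can be spot-checked numerically in a few lines. Here is a sketch (my own, in Python with numpy) using randomly generated matrices of matching dimensions:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((2, 3))
C = rng.standard_normal((2, 3))
Z = np.zeros((2, 3))
a, b = 2.0, -1.5

print(np.allclose(A + B, B + A))                  # Property 1: commutativity
print(np.allclose(A + (B + C), (A + B) + C))      # Property 2: associativity
print(np.allclose(A + Z, A))                      # Property 3: addition of the zero matrix
print(np.allclose(a * (A + B), a * A + a * B))    # Property 4: scalar distributes over matrix addition
print(np.allclose((a + b) * C, a * C + b * C))    # Property 5: matrix distributes over scalar addition
print(np.allclose((a * b) * C, a * (b * C)))      # Property 6: associativity of scalar multiplication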

References

Gareth Williams, Linear Algebra with Applications.