Friday, April 20, 2007

Gauss-Jordan Elimination

Gauss-Jordan Elimination is a method for solving a linear system of equations. The method is named after Carl Friedrich Gauss and Wilhelm Jordan.

The content in today's blog is taken from Linear Algebra with Applications by Gareth Williams.

Consider the following system of linear equations:

x1 - x2 + 2x3 = 3

2x1 - 2x2 + 5x3 = 4

x1 + 2x2 - x3 = -3

2x2 + 2x3 = 1

To use Gauss-Jordan Elimination, we start by representing the given system of linear equations as an augmented matrix.

Definition 1: Augmented Matrix Form

An augmented matrix is a matrix representation of a system of linear equations in which each row consists of the coefficients of one equation followed by that equation's constant term (its right-hand side).

Example 1:

The following system of linear equations:

x1 - x2 + 2x3 = 3

2x1 - 2x2 + 5x3 = 4

x1 + 2x2 - x3 = -3

2x2 + 2x3 = 1

would be represented as:

[ 1 -1  2 |  3 ]
[ 2 -2  5 |  4 ]
[ 1  2 -1 | -3 ]
[ 0  2  2 |  1 ]

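If you want to experiment with these ideas on a computer, here is a quick sketch (my own illustration, in Python with numpy) of how the augmented matrix from Example 1 might be built:

import numpy as np

# Coefficients of the four equations in Example 1.
A = np.array([[1, -1,  2],
              [2, -2,  5],
              [1,  2, -1],
              [0,  2,  2]], dtype=float)

# The results (right-hand sides) of the four equations.
b = np.array([3, 4, -3, 1], dtype=float)

# The augmented matrix is the coefficient matrix with the results appended as a final column.
augmented = np.hstack([A, b.reshape(-1, 1)])
print(augmented)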
The goal of Gauss-Jordan elimination is to transform the matrix into reduced echelon form.

Definition 2: Reduced Echelon Form

A matrix is said to be in reduced echelon form if:

(1) Any rows consisting entirely of zeros are grouped at the bottom of the matrix.

(2) The first nonzero element of any row is 1. This element is called a leading 1.

(3) The leading 1 of each row after the first is positioned to the right of the leading 1 of the previous row.

(4) If a column contains a leading 1, then all other elements in that column are 0.

Here are some examples of matrices that are not in reduced echelon form.

The matrix below violates (1). The rows consisting entirely of 0's are not grouped at the bottom.



The matrix below violates (2). In row 2, the first nonzero element is not a 1.



The matrix below violates (3). In row 3, the leading 1 is not to the right of the leading 1 in row 2.



The matrix below violates (4). The column associated with the leading 1 in row 2 contains a nonzero value.



Below are some examples of matrices that are in reduced echelon form:







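The four conditions above can also be checked mechanically. Here is a small sketch in Python with numpy (the function name is_reduced_echelon is my own) that tests each condition in turn:

import numpy as np

def is_reduced_echelon(M, tol=1e-12):
    """Return True if M satisfies conditions (1) through (4)."""
    M = np.asarray(M, dtype=float)
    last_lead = -1          # column of the previous row's leading 1
    seen_zero_row = False   # whether a zero row has already appeared
    for row in M:
        nonzero = np.nonzero(np.abs(row) > tol)[0]
        if len(nonzero) == 0:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False    # (1) a nonzero row sits below a zero row
        lead = nonzero[0]
        if abs(row[lead] - 1.0) > tol:
            return False    # (2) the first nonzero element is not 1
        if lead <= last_lead:
            return False    # (3) the leading 1 did not move to the right
        if np.sum(np.abs(M[:, lead]) > tol) != 1:
            return False    # (4) the leading 1's column has another nonzero element
        last_lead = lead
    return True

print(is_reduced_echelon([[1, 0, 2], [0, 1, -1], [0, 0, 0]]))   # True
print(is_reduced_echelon([[1, 0, 2], [0, 3, -1], [0, 0, 0]]))   # False: violates (2)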
Next, we need to use elementary row operations to transform the matrix into reduced echelon form. The elementary row operations are operations for which the resulting matrix represents the same system of linear equations as the original. In other words, the two matrices are row equivalent.

Definition 3: Row Equivalence (Equivalent Systems)

Two matrices are row equivalent if they represent the same system of linear equations. They are also called equivalent systems.

The following three elementary operations preserve row equivalence. That is, the resulting matrix is row equivalent to the matrix before the operation.

(1) Interchanging two rows
(2) Multiplying the elements of a row by a nonzero constant
(3) Adding the elements of one row to the corresponding elements of another row.

(1) is clear: changing the order of the linear equations does not change the values that solve them. (2) is equivalent to multiplying both sides of an equation by the same nonzero value. (3) reflects the fact that if two linear equations hold, then their sum also holds. In actual practice, we will combine (2) and (3) to get: (4) Add a multiple of the elements of one row to the corresponding elements of another row.

So, the following elementary operations result in a matrix that is row equivalent to the previous matrix:

Definition 4: Elementary Operations

(1) Row Exchange: The exchange of two rows.
(2) Row Scaling: Multiply the elements of a row by a nonzero constant.
(3) Row Replacement: Add a multiple of the elements of one row to the corresponding elements of another row.
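Each of these operations is a one-line update on the rows of a matrix. Here is a quick sketch in Python with numpy (my own illustration; note that numpy rows are indexed from 0, while the worked example later in this post labels them R1, R2, R3):

import numpy as np

# The augmented matrix from Example 1.
M = np.array([[1., -1.,  2.,  3.],
              [2., -2.,  5.,  4.],
              [1.,  2., -1., -3.],
              [0.,  2.,  2.,  1.]])

# (1) Row Exchange: swap the first two rows.
M[[0, 1]] = M[[1, 0]]

# (2) Row Scaling: multiply the first row by the nonzero constant 1/2.
M[0] = 0.5 * M[0]

# (3) Row Replacement: add 3 times the first row to the third row.
M[2] = M[2] + 3 * M[0]

print(M)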

Once the matrix is in reduced echelon form, we can convert the matrix back into a linear system of equations to see the solution.

Here's the full algorithm:

Definition 5: Gauss-Jordan Elimination

(1) Represent the linear system of equations as a matrix in augmented matrix form

(2) Use elementary row operations to derive a matrix in reduced echelon form

(3) Write the system of linear equations corresponding to the reduced echelon form.

Algorithm 1: Gauss-Jordan Elimination Algorithm

(1) Let A = an m x n matrix with m rows and n columns.

(2) Let k = 1, Ak = A, i=1

(3) If Ak is all zeros, then we are done and A is in reduced echelon form.

(4) Let Cv be the first column of Ak that is not all zeros and has v ≥ i.

(5) Let au,v be the first nonzero element in Cv such that u ≥ k.

(6) Let Ru be the row in A where au,v is found in Cv.

(7) Ru = Ru*(1/au,v) [After this, Ru has a leading 1, see proof below for explanation]

(8) If u ≠ k, then exchange Ru and Rk.

(9) For each row Rw in A, if w ≠ k, then Rw = Rw + (-aw,v)*Rk [After this, the only nonzero element in Cv is found at ak,v, see proof below for explanation if needed]

(10) If k = m, then we are done and A is in reduced echelon form.

(11) Otherwise, we designate a partition of Ak into its top row and a submatrix Ak+1. [Note: In other words, Ak+1 = all but the top row of Ak]

(12) Let k = k + 1; i = v; and go to step #3

End
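Here is a sketch of the algorithm above in Python with numpy. It follows the same outline (find the next usable column, scale to create a leading 1, exchange it into position, and clear the rest of the column), but it is written in the usual iterative style rather than with the explicit Ak partitions, so treat it as an illustration rather than a line-by-line transcription:

import numpy as np

def gauss_jordan(M, tol=1e-12):
    """Return a reduced echelon form of M using elementary row operations."""
    M = np.array(M, dtype=float)
    rows, cols = M.shape
    k = 0                                        # index of the next pivot row
    for v in range(cols):                        # scan the columns from left to right
        # find the rows at or below k with a nonzero entry in column v
        candidates = np.nonzero(np.abs(M[k:, v]) > tol)[0]
        if len(candidates) == 0:
            continue                             # column v is all zeros from row k down
        u = k + candidates[0]
        M[u] = M[u] / M[u, v]                    # Row Scaling: create a leading 1
        if u != k:
            M[[u, k]] = M[[k, u]]                # Row Exchange: move the leading 1 up to row k
        for w in range(rows):                    # Row Replacement: clear the rest of column v
            if w != k:
                M[w] = M[w] - M[w, v] * M[k]
        k += 1
        if k == rows:
            break                                # every row now has a leading 1
    return M

To solve a system, pass in its augmented matrix; the worked example below does exactly this by hand.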

Here's an example:

Example 2: Using Gauss-Jordan Elimination

(1) We start with a system of linear equations:

x1 - 2x2 + 4x3 = 12

2x1 - x2 + 5x3 = 18

-x1 + 3x2 - 3x3 = -8

(2) We represent them in augmented matrix form:

[  1 -2  4 | 12 ]
[  2 -1  5 | 18 ]
[ -1  3 -3 | -8 ]

(3) We use elementary row operations to put this matrix into reduced echelon form (I will use R1, R2, R3 for Row 1, Row 2, and Row 3):

(a) Let R2 = R2 + (-2)R1 [Operation #3]

[  1 -2  4 | 12 ]
[  0  3 -3 | -6 ]
[ -1  3 -3 | -8 ]

(b) Let R3 = R3 + R1 [Operation #3]

[  1 -2  4 | 12 ]
[  0  3 -3 | -6 ]
[  0  1  1 |  4 ]

(c) Let R2 = (1/3)R2 [Operation #2]

[  1 -2  4 | 12 ]
[  0  1 -1 | -2 ]
[  0  1  1 |  4 ]

(d) Let R1 = R1 + 2*R2 [Operation #3]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  1  1 |  4 ]

(e) Let R3 = R3 + (-1)*R2 [Operation #3]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  0  2 |  6 ]

(f) Let R3 = (1/2)R3 [Operation #2]

[  1  0  2 |  8 ]
[  0  1 -1 | -2 ]
[  0  0  1 |  3 ]

(g) Let R1 = R1 + (-2)*R3 [Operation #3]

[  1  0  0 |  2 ]
[  0  1 -1 | -2 ]
[  0  0  1 |  3 ]

(h) Let R2 = R2 + R3 [Operation #3]

[  1  0  0 |  2 ]
[  0  1  0 |  1 ]
[  0  0  1 |  3 ]

(4) We now write down the system of linear equations corresponding to the reduced echelon form:

x1 = 2

x2 = 1

x3 = 3
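As a sanity check, the row operations (a) through (h) can be replayed in Python with numpy (my own illustration) and compared against numpy's built-in solver:

import numpy as np

# The augmented matrix from step (2).
M = np.array([[ 1., -2.,  4., 12.],
              [ 2., -1.,  5., 18.],
              [-1.,  3., -3., -8.]])
A, b = M[:, :3].copy(), M[:, 3].copy()   # keep the original system for a cross-check

M[1] = M[1] - 2 * M[0]   # (a) R2 = R2 + (-2)R1
M[2] = M[2] + M[0]       # (b) R3 = R3 + R1
M[1] = M[1] / 3          # (c) R2 = (1/3)R2
M[0] = M[0] + 2 * M[1]   # (d) R1 = R1 + 2*R2
M[2] = M[2] - M[1]       # (e) R3 = R3 + (-1)*R2
M[2] = M[2] / 2          # (f) R3 = (1/2)R3
M[0] = M[0] - 2 * M[2]   # (g) R1 = R1 + (-2)*R3
M[1] = M[1] + M[2]       # (h) R2 = R2 + R3

print(M)                        # the identity matrix augmented with the solution 2, 1, 3
print(np.linalg.solve(A, b))    # numpy's solver agrees: [2. 1. 3.]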

Of course, we need to be sure that we can always get to reduced echelon form. Here is the theorem that guarantees this:

Theorem 1: Every matrix is row equivalent to a reduced echelon form matrix

Proof:

(1) The Gauss-Jordan Elimination Algorithm works for any 1 x n matrix.

(a) Let A = 1 x n matrix

(b) Assume that A is not all zeros (if it were, then A is already in reduced echelon form)

(c) Let a1,i be the first nonzero element in A.

(d) Let A = (1/a1,i)*A

(e) A now has a leading 1 and, since this is a one-row matrix, A is now in reduced echelon form.

(2) Assume that the Gauss-Jordan Elimination Algorithm works for any matrix up to k rows.

(3) We can now show that it will work for a matrix with k+1 rows.

(a) Let A = a nonzero matrix with k+1 rows [We can assume it is nonzero since if it were zero, it would already be in reduced echelon form]

(b) Assume that we've run the Gauss-Jordan Elimination Algorithm up to the kth row, and define B to be the submatrix consisting of the first k rows of A, so that A consists of B stacked on top of the single remaining row Ak+1.

(c) We can see that B is in reduced echelon form since

(i) B has k rows

(ii) We can assume by our assumption in step #2 that the algorithm works for any matrix of k rows or less

(iii) The algorithm leaves B unchanged if we run it again.

(d) We can also see that Ak+1 consists of a single row which is zero in every column where there is a leading 1 in B. [Since these entries are zeroed out by the algorithm above]

(e) If there is a nonzero column in Ak+1, it must therefore be to the right of all the leading 1s in B.

(f) Assume that there is a nonzero column in Ak+1 [If there were not, we would be done, since then Ak+1 would consist of all zeros and A would then be in reduced echelon form]

(g) In this case, none of the rows in B are all zeros. [In order for Ak+1 to be nonzero, each of the previous rows must have had at least one nonzero element. If one of them were all zeros, it would have been exchanged with Ak+1 by the algorithm, and then Ak+1 would be all zeros, which it is not]

(h) Let ak+1,x = the first nonzero element in Ak+1

(i) Let Ak+1 = (1/ak+1,x)Ak+1

(j) For all rows i, 1 through k, let Ri = Ri + (-ai,x)*Rk+1.

(k) Clearly, this zeroes out the element in column x of each of the other rows.

(l) A is now in reduced echelon form since:

(a) Any rows consisting entirely of zeros are grouped at the bottom of the matrix. [Note: there are none]

(b) The first nonzero element of any row is 1. [We assumed this was true of all rows in B and now we have shown it is also true of Rk+1]

(c) The leading 1 of each row after the first is positioned to the right of the leading 1 of the previous row. [From step #3e]

(d) If a column contains a leading 1, then all other elements in that column are 0. [From step #3j]

QED

References

Gareth Williams, Linear Algebra with Applications.

Tuesday, April 17, 2007

Properties of Matrix Multiplication

Multiplication can only occur between matrices A and B if the number of columns in A matches the number of rows in B. This leads to an important observation: the product AB may be perfectly well defined while the product BA is not defined at all.

Even when both AB and BA are defined, it does not follow that AB = BA. In other words, unlike multiplication of integers, matrix multiplication is not commutative.

Property 1: Associative Property of Multiplication

A(BC) = (AB)C

where A,B, and C are matrices of scalar values.

Proof:

(1) Let D = AB, G = BC
(2) Let F = (AB)C = DC
(3) Let H = A(BC) = AG
(4) Using Definition 1, we have for each of D, G, F, and H:

di,j = ∑k ai,k*bk,j

gi,j = ∑k bi,k*ck,j

fi,j = ∑k di,k*ck,j

hi,j = ∑k ai,k*gk,j

(5) So, expanding fi,j gives us:

fi,j = ∑k (∑l ai,l*bl,k)*ck,j

= ∑k ∑l ai,l*bl,k*ck,j

= ∑l ai,l*(∑k bl,k*ck,j)

= ∑l ai,l*gl,j = hi,j

QED
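The index manipulation above is easy to spot-check numerically. Here is a small sketch (my own, in Python with numpy) using randomly generated matrices whose shapes are chosen so that all the products are defined:

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 5))

# A(BC) and (AB)C agree up to floating-point rounding.
print(np.allclose(A @ (B @ C), (A @ B) @ C))   # True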

Property 2: Distributive Property of Multiplication

A(B + C) = AB + AC
(A + B)C = AC + BC

where A,B,C are matrices of scalar values.

Proof:

(1) Let D = AB such that for each:

di,j = ∑k ai,k*bk,j

(2) Let E = AC such that for each:

ei,j = ∑k ai,k*ck,j

(3) Let F = D + E = AB + AC such that for each:

fi,j = ∑k [ai,k*bk,j + ai,k*ck,j] = ∑k ai,k*[bk,j + ck,j]

(4) Let G = B+C such that for each:

gi,j = bi,j + ci,j

(5) Let H = A(B+C) = AG such that for each:

hi,j = ∑k ai,k*gk,j

(6) Then we have AB + AC = A(B+C) since for each:

hi,j = ∑k ai,k*gk,j = ∑k ai,k*[bk,j + ck,j] = fi,j

(7) Let M = A + B such that for each:

mi,j = ai,j + bi,j

(8) Let N = (A+B)C = MC such that:

ni,j = ∑k mi,k*ck,j = ∑k (ai,k + bi,k)*ck,j

(9) Let O = BC such that:

oi,j = ∑k bi,k*ck,j

(10) Let P = AC + BC = E + O such that:

pi,j = ei,j + oi,j

= ∑k ai,k*ck,j + ∑k bi,k*ck,j

= ∑k [ai,k*ck,j + bi,k*ck,j]

= ∑k (ai,k + bi,k)*ck,j

which matches ni,j from step #8, so (A + B)C = AC + BC.

QED
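Both distributive laws can be spot-checked the same way; another small sketch in Python with numpy:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))
C = rng.standard_normal((4, 2))
D = rng.standard_normal((3, 4))

# A(B + C) = AB + AC (B and C must have the same dimensions).
print(np.allclose(A @ (B + C), A @ B + A @ C))    # True

# (A + D)B = AB + DB (A and D must have the same dimensions).
print(np.allclose((A + D) @ B, A @ B + D @ B))    # True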

Property 3: Scalar Multiplication

c(AB) = (cA)B = A(cB)

Proof:

(1) Let D = AB such that:

di,j = ∑k ai,k*bk,j

(2) Let E = c(AB) = cD such that for each:

ei,j = c*di,j = c*∑k ai,k*bk,j

(3) Let F = (cA)B such that:

fi,j = ∑k (c*ai,k)*bk,j = c*∑k ai,k*bk,j

(4) Let G = A(cB) such that:

gi,j = ∑k ai,k*(c*bk,j) = c*∑k ai,k*bk,j

QED
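And the same kind of spot check for the scalar property:

import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 2))
c = 4.5

print(np.allclose(c * (A @ B), (c * A) @ B))   # True
print(np.allclose(c * (A @ B), A @ (c * B)))   # True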

Property 4: Multiplication of Matrices is not Commutative

AB does not necessarily equal BA

Proof:

(1) Let A =



(2) Let B =



(3) AB =



(4) BA =



QED
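Almost any small example will demonstrate this. Here is one concrete pair, chosen for this sketch (not necessarily the matrices used in the book), where AB and BA differ:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

print(A @ B)   # [[2 1]
               #  [4 3]]
print(B @ A)   # [[3 4]
               #  [1 2]]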

References

Monday, April 16, 2007

Matrix Addition

The content in today's blog is taken straight from Gareth Williams' Linear Algebra.

In terms of their properties, matrices present an interesting contrast to numbers. Whereas any two numbers can be added or subtracted, with matrices this is not always the case.

Indeed, one of the most important properties of matrices is that there are certain preconditions required before operations can be performed on two matrices. For example, addition can only occur between two matrices that have the same dimensions. It is not possible to add a 1 x 2 matrix with a 2 x 2 matrix.

Definition 1. Zero Matrix: Zm,n

A zero matrix is any matrix whose elements are all 0.

For example, below is Z2,3:

[ 0 0 0 ]
[ 0 0 0 ]

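In Python with numpy, a zero matrix of any dimensions can be produced directly; a one-line sketch:

import numpy as np

Z = np.zeros((2, 3))   # Z2,3: the 2 x 3 zero matrix
print(Z)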

Below are the basic properties of matrix addition.

Property 1: Commutative property of Addition

A + B = B + A

where A and B are matrices of the same dimension and consist of scalar values.

Proof:

(1) Let A = set of values ai,j where i is the row and j is the column.

(2) Let B = set of values bi,j

(3) A + B = the set of values ai,j + bi,j

(4) B + A = the set of values bi,j + ai,j

(5) Clearly, since all ai,j and bi,j are scalar, ai,j + bi,j = bi,j + ai,j

QED

Property 2: Associative property of Addition

A + (B + C) = (A + B) + C

where A, B, and C are matrices of the same dimension and consist of scalar values.

Proof:

(1) All A,B,C have the same dimensions since this is a prerequisite for addition.

(2) Let A = the set ai,j, B = the set bi,j, C = the set ci,j

(3) A + (B + C) = the set ai,j + (bi,j + ci,j)

(4) (A + B) + C = the set (ai,j + bi,j) + ci,j

(5) Since ai,j, bi,j, ci,j are all scalar, ai,j + (bi,j + ci,j) = (ai,j + bi,j) + ci,j

QED

Property 3: Addition by Zm,n

A + Zm,n = Zm,n + A = A

where A is an m x n matrix which consists of scalar values.

Proof:

(1) Let A be the set of ai,j

(2) Let Z be the set of zi,j where each zi,j=0 [See Definition 1 above]

(3) A + Zm,n = the set of ai,j + zi,j = the set of ai,j = A

(4) Zm,n + A = the set of zi,j + ai,j = the set of ai,j = A.

QED

Property 4: Distributive Property of Scalars with Addition of Matrices

c(A + B) = cA + cB

where A and B are matrices of the same dimension and consist of scalar values and c is a scalar value.

Proof:

(1) Let A = the set of ai,j, let B = the set of bi,j

(2) A + B = the set ai,j + bi,j

(3) c(A+B) = the set of c(ai,j + bi,j) = the set of cai,j + cbi,j.

(4) cA + cB = the set of cai,j + the set of cbi,j = the set of cai,j + cbi,j.

QED

Property 5: Distributive Property of Matrices with Addition of Scalars

(a + b)C = aC + bC

where a,b are scalar values and C is a matrix of scalar values.

Proof:

(1) Let C = the set of ci,j

(2) (a + b)C = the set of (a + b)ci,j = aci,j + bci,j

(3) aC + bC = the set of aci,j+ the set of bci,j = the set of aci,j + bci,j

QED

Property 6: Associative Property of Multiplication of Scalars with Matrices

(ab)C = a(bC)

where a,b are scalar values and C is a matrix of scalar values.

Proof:

(1) Let C be the set of ci,j

(2) (ab)C = the set of abci,j

(3) a(bC) = a*(the set of bci,j) = the set of abci,j

QED
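All six properties can be spot-checked numerically in a few lines. Here is a sketch (my own, in Python with numpy) using randomly generated matrices of matching dimensions:

import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((2, 3))
C = rng.standard_normal((2, 3))
Z = np.zeros((2, 3))
a, b = 2.0, -1.5

print(np.allclose(A + B, B + A))                  # Property 1: commutativity
print(np.allclose(A + (B + C), (A + B) + C))      # Property 2: associativity
print(np.allclose(A + Z, A))                      # Property 3: addition of the zero matrix
print(np.allclose(a * (A + B), a * A + a * B))    # Property 4: scalar distributes over matrix addition
print(np.allclose((a + b) * C, a * C + b * C))    # Property 5: matrix distributes over scalar addition
print(np.allclose((a * b) * C, a * (b * C)))      # Property 6: associativity of scalar multiplication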

References

Gareth Williams, Linear Algebra with Applications.