Saturday, December 08, 2007

Irreducible Polynomials

Definition 1: Irreducible Polynomials

A polynomial P ∈ F[X] is irreducible in F[X] if deg P is greater than 0 and P is not divisible by any polynomial Q ∈ F[X] such that 0 is less than deg Q is less than deg P.

NOTE: See Definition 1, here, for definition of polynomials and Definition 4, here, for definition of deg of a polynomial.

Lemma 1:

If a polynomial D divides an irreducible polynomial P,

Then, either D is a constant or deg D = deg P.

Proof:

(1) Let P be an irreducible polynomial

(2) Let D be a polynomial that divides P.

(3) Assume that D is not a constant and that deg D is less than deg P.

(4) Then D cannot divide P [by Definition 1 above]

(5) So we have a contradiction and we must reject our assumption at step #3.

QED

Corollary 1.1: Every polynomial of degree 1 is irreducible

(1) Assume that there exists a polynomial D of degree 1 that is not irreducible.

(2) Then there exists a polynomial P that divides D where deg P is less than deg D and deg P is greater than 0.

(3) But this is impossible since deg D = 1 (either deg P = 0 or deg P ≥ 1).

(4) So we reject our assumption at step #1.

QED

Theorem 2: Euclid's Lemma for Polynomials

For polynomials S,T,U if S divides TU and S is relatively prime to T, then S divides U.

Proof:

(1) Since GCD(S,T)=1, there exists polynomials V,W such that [See Corollary 3.1, here]:

SV + TW = 1

(2) Multiplying both sides by U gives us:

S(UV) + (TU)W = U

(3) This proves that S divides U since S divides TU.

QED

Corollary 2.1: Generalized Euclid's Lemma for Polynomials

If a polynomial S divides a product of r factors and is relatively prime to the first r-1 factors, then it divides the last one.

Proof:

(1) Let T = the first r-1 factors.

(2) Let U = the last factor

(3) Then S divides TU and GCD(S,T)=1 so using Theorem 2 above, we can conclude that S divides U.

QED

Corollary 2.2: If a polynomial is divisible by pairwise relatively prime polynomials, then it is divisible by their product

Proof:

(1) Let P1, ..., Pr be relatively prime polynomials which divide a polynomial P.

(2) For r=1, P1 = P so the corollary is true.

(3) Assume that it is true up to r-1.

(4) So, then we have:

P = P1*..*Pr-1*Q

(5) By Corollary 2.1 above, since Pr divides P, it follows that Pr must divide Q.

(6) So, then it follows that P1...*...*Pr must divide P.

QED

Theorem 3: Every non constant polynomial P ∈ F[X] is a finite product:

P = cP1P2...Pn

where c ∈ Fx and P1, P2, ..., Pn are monic irreducible polynomials where this factorization is unique. Order of these polynomials is not unique and there may include repeats of the same polynomial more than once.

Proof:

(1) If P is irreducible, then P = c*P1 where c is the leading coefficient of P and P1 = c-1P is irreducible and monic. [See Definition 1 above and Theorem 2, here]

(2) If P is reducible, then it can be written as a product of two polynomials of degree less than deg P.

(3) If the two polynomials less than P are irreducible, then we are done. If not, then we can continue to break each reducible polynomial into a product of reducible polynomials and then using step #1, we break this product into the desired result.

(4) The next step is to prove that this product independent of order is unique.

(5) Assume that:

P = cP1*...*Pn = dQ1*...*Qm

where c,d ∈ Fx and P1, ..., Pn, Q1, ..., Qm are irreducible polynomials.

(6) c = d since both equal the leading coefficient of P.

(7) So, we have:

P1*...*Pn = Q1*...*Qm

(8) Using Corollary 2.1 above, we know that P1 cannot be relatively prime to all Qi

(9) So, let's assume that there exists a Qi such that GCD(P1,Qi) ≠ 1. Let's label it Q1

(10) Let D be the unique monic GCD of P1,Q1. [See Theorem 2, here]

(11) Since P1 is irreducible, then P1 = cD for some c.

(12) Since P1 is monic, then c = 1 and P1 = D.

(13) Since Q1 is monic and irreducible, we can use the same argument as in steps #11 and steps #12 to establish that Q1 = D.

(14) Therefore, it follows that Q1 = P1

(15) We can make the same argument for all n elements of Pi so that we have P2 = Q2, P3 = Q3, ..., Pn = Qn

(16) Since there must be a matched element for each Pi, it follows that n ≤ m.

(17) But we can also make the same line of argument for Qj so that we also know that m ≤ n

(18) Therefore n = m.

QED

References

Friday, December 07, 2007

Greatest Common Divisor for Polynomials

Definition 1: Divisor for Polynomials

Let P1, P2 ∈ F[X].

We say that P2 divides P1 if there exists Q ∈ F[X] such that P1 = P2Q

Definition 2: GCD for Polynomials

A greatest common divisor (GCD) of P1, P2 is a polynomial D ∈ F[X] which has the following properties:

(a) D divides P1 and P2
(b) If S is a polynomial which divides P1 and P2, then S divides D.

Definition 3: Relatively prime polynomials

Two polynomials P1, P2, then P1, P2 are said to relatively prime polynomials if the only factors they have in common are of degree 0.

Definition 4: degree: deg

The deg of a polynomial P is the greatest integer n for which the coefficient Xn in the expression of P is not zero.

Theorem 1: Euclid's Algorithm for Greatest Common Divisor for Polynomials

For any two polynomials P1, P2, there exists a GCD

Proof:

(1) Let P1, P2 be any two polynomials such that deg P1 ≥ deg P2

(2) If P2 = 0, then P1 is the GCD of P1, P2 [See Definition 2 above]

(3) Otherwise, we divide P1 by P2 using the Euclidean Division Algorithm for Polynomials. [See Theorem, here]

(4) Then there exists two polynomials Q1, R1 such that:

P1 = Q1P2 + R1

and deg R1 is less than deg P2.

(5) If R1 = 0, then P2 is the GCD of P1, P2

(6) Next, we divide P2 by R1 to get:

P2 = Q2R1 + R2

and deg R2 is less than deg R1

(7) If R2 ≠ 0, then we can set up the following equations:

(a) R1 = Q3R2 + R3

(b) R2 = Q4R3 + R4

...

(c) Rn-2 = QnRn-1 + Rn

(d) Rn-1 = Qn+1Rn + Rn+1

(8) Since deg P2 is greater than deg R1 which is greater than deg R2 which is greater than ... deg Rn which is greater deg Rn+1, this sequence cannot extend indefinitely.

(9) Therefore, Rn+1 = 0 for some n.

(10) Rn divides P1, P2 since:

Because Rn+1=0, Rn divides Rn-1

Because Rn divides Rn-1, from equation 7c, Rn divides Rn+1

We can now proceed up each of these implied equations in the same way until we get to 7b.

Since we have shown that Rn divides R2 and R3 before it, it is clear from 7a, that Rn divides R1

It is clear from step #6 that Rn divides P2 and clear from step #4 that Rn divides P1

(11) Assume that P1 and P2 are both divisible by a polynomial S.

(12) Then by step #4, S must divide R1.

(13) By step #6, S must divide R2

(14) We can now use the same argument to go through the equations in step #7 to conclude that S must likewise divide Rn.

(15) This proves that Rn is the GCD for P1,P2 [See Definition 2 above]

QED

Definition 5: monic polynomial

A polynomial is monic if and only if its leading coefficient is 1.

Theorem 2:

Any two polynomials P1, P2 ∈ F[X] which are not both 0 have a unique monic greatest common divisor D1 and a polynomial D ∈ F[X] is a greatest common divisor of P1, P2 if and only if D=cD1 for some c ∈ Fx (= F - {0}).

Proof:

(1) Let P1 ∈ F[X] be polynomials such that deg P1 is greater than deg P2

(2) Using Theorem 1 above, we know that there exists Rn such that Rn is the GCD of P1, P2

(3) Dividing Rn by its leading coefficient gives us a monic GCD of P1,P2

(4) Assume that D, D' are GCD's of P1, P2

(5) Then D divides D' and D' divides D. [See Definition 2 above]

(6) Then there exists polynomials Q,Q' ∈ F[X] such that:

D' = DQ
D = D'Q'

(7) So that:

Q = D'/D
Q' = D/D'

QQ' = (D'/D)(D/D') = 1

(8) But since Q'Q=1, they must both be constants which are inverses of each other.

(9) So, if D and D' are monic, then Q=Q'=1 and D=D' [Otherwise if Q ≠ 1, then D'=DQ would imply that D' is not be monic]

(10) Suppose D is any GCD of P1, P2

(11) Let D' be the unique monic GCD of P1, P2

(12) Let a = the leading coefficient of D.

(13) It is clear that (1/a)D = D' since the monic GCD is unique [see Step #9 above]

QED

Theorem 3:

If D is the GCD of P1, P2, then there exists polynomials U1, U2 such that:

D = P1U1 + P2U2 where U1, U2 ∈ F[X]

Proof #1:

(1) Using step #7c from Theorem 1 above, we have:

Rn-2 = QnRn-1 + Rn

which gives us:

Rn = Rn-2 - QnRn-1

and likewise:

Rn-1 = Rn-3 - Qn-1Rn-2

(2) We can then use the same type of equation to resolve Rn-1 to get:

Rn = Rn-2 - Qn(Rn-3 - Qn-1Rn-2) =

= Rn-2 - QnRn-3 + QnQn-1Rn-2 =

= -Rn-3Qn + Rn-2(1 +QnQn-1)

(3) We've now put Rn in terms of Rn-3 and Rn-2

(4) In this way, we can eliminate each Ri in terms of Ri-1 and Ri-2 up until we resolve R1 into P1, P2

(5) Eventually we get to an expression of Rn such that:

Rn = P1U1 + P2U2

QED

Proof #2:

(1) From step #4 of Theorem 1 above, we have:

P1 = Q1P2 + R1

(2) Using Matrix Theory (see here for review of 2 x 2 matrices), this gives us:



(3) From step #6 of Theorem 1 above we have:

P2 = Q2R1 + R2

which gives us:



(4) We can continue this approach until step #7d:

Rn-1 = Qn+1Rn + Rn+1

which gives us:



(5) Combining the matrix equations gives us:



(6) Now, since we know that:





(7) We see that each:



is invertible (see here for review of invertibility of matrices).

(8) We can then rearrange the equation in step #5 to be:



(9) Now, we can define U1, U2, U3, U4 such that:



(10) It now follows that:



(11) This then gives us that:

U1*P1 + U2*P2 = Rn

and

U3*P1 + U4*P2 = 0

QED

Corollary 3.1:

If P1, P2 are relatively prime polynomials in F[X], then there exists polynomials U1, U2 such that:

P1U1 + P2U2 = 1

Proof:

Since for relatively prime polynomials GCD = 1, this result follows directly from Theorem 3 above.

QED

References

Tuesday, December 04, 2007

Polynomials defined

A polynomial expression is a mathematical construct that if often used and rarely defined. The most common form is a polynomial in one variable which can be represented as:
a0xn + a1xn-1 + ... + a(n-1)x + an

In today's blog, I will present a more formal definition and then show the very fundamental result that a polynomial is a ring. If you are not familiar with rings, then please start here.

The content in today's blog is taken from Jean Tignol's excellent Galois' Theory of Algebraic Equations.

Definition 1: Polynomial in one indeterminant with coefficients in ring A

P: N → A such that { n ∈ N : Pn ≠ 0 } is finite. The set of all polynomials in one indeterminant with coefficients in ring A can be denoted as A[X].

Example 1:

Let us consider the polynomial:

a0 + a1X + ... + anXn

In this case, each ai is a coefficient in ring A.

We can see that for each i, that i ∈ N and that there exists an integer n ∈ N such that:

if i ≤ n, then i ∈ N maps to ai.

if i is greater than n, then i ∈ N maps to 0.

Definition 2: Addition of Polynomials

(P + Q)n = Pn + Qn

In other words, each of the maps corresponding to the same natural number are added together to form a new map.

Definition 3: Multiplication of Polynomials

(PQ)n = ∑ (i+j=n) Pi*Qj

In other words, each of the maps corresponding to the multiplication of all maps where the index of one plus the index of the other = n.

Lemma 1: If A is a ring, then A[X] is a ring which is commutative if and only if A is commutative

Proof:

(1) A[X] is a ring since:

(a) A[X] has a commutative operation of addition

P + Q = Pi + Qi for all i.

Q + P = Qi + Pi for all i.

Since A is a ring, Pi + Qi = Qi + Pi

(b) A[X] has an associative rule for addition

(P + Q) + R = (Pi + Qi) + Ri for all i

P + (Q + R) = Pi + (Qi + Ri) for all i

(Pi + Qi) + Ri = Pi + (Qi + Ri) since A is a ring.

(c) A[X] has an additive identity

Let Q be a polynomial such that all Qi map to 0.

Then for any polynomial P, it is clear that P + Q = Pi + 0 = Pi for all i.

(d) A[X] has an additive inverse

For any polynomial P:

Let Q be a polynomial derived from P such that for all i: Qi = -Pi

Then

P + Q = Pi + -Pi = 0 for all i.

(e) A[X] has an associative rule for multiplication

PQ = ∑ (i+j=n) Pi*Qj for all n

(PQ)R = ∑ (i+j+k=n) (Pi*Qj)Rk for all n =

= ∑ (i+j+k=n) Pi*(QjRk) for all n

QR = (j+k=n) Qj*Rk for all n

P(QR) = ∑ (i+j+k=n) Pi*(QjRk) for all n

(f) A[X] has a distributive rule

Q + R = Qj + Rj for all j

P(Q+R) = ∑ (i+j=n) Pi*(Qj + Rj) =

= ∑ (i+j=n) PiQj + PiRj

PQ = ∑ (i+j=n) Pi*Qj for all n

PR = ∑ (i+j=n) Pi*Rj for all n.

PQ + PR = ∑ (i+j=n) PiQj + ∑(i + j=n) PiRj for all n =

= ∑(i+j=n) PiQj + PiRj

(Q+R)P = ∑ (i+j=n) (Qj + Rj)*Pi =

= ∑ (i+j=n) PiQj + PiRj

(2) If A is a commutative ring, then A[X] is a commutative ring.

(a) Assume that A is a commutative ring

(b) PQ = ∑ (i+j=n) Pi*Qj for all n =

= ∑ (i+j=n) Qj*Pi for all n = QP

(3) If A[X] is a commutative ring, then A is a commutative ring

(a) Assume that A[X] is a commutative ring

(b) QP = PQ = ∑ (i+j=n) Pi*Qj for all n

(c) QP = ∑ (i+j=n) Qj*Pi for all n

(d) So, therefore for all n, ∑ (i+j=n) Pi*Qj = ∑ (i+j=n) Qj*Pi

QED

References

Breakdown of rational fractions as sums of partial fractions

Lemma 1: Rational fraction as sum of relatively prime polynomials

If P,Q,S1,S2 are polynomials such that:

Q = S1S2

S1,S2 are relatively prime polynomials

Then, there exists polynomials P0, P1, P2 such that:

P/Q = P0 + P1/S1 + P2/S2

deg Pi is less than deg Si for both i=1 and i=2

Proof:

(1) Since GCD(S1,S2) = 1, there exists polynomials T1, T2 such that [See Corollary 3.1, here]:

1 = S1T1 + S2T2

(2) Multiply each side by P/Q to get:

P/Q = (PS1T1)/Q + (PS2T2)/Q

(3) Since Q=S1S2, we now have:

P/Q = (PS1T1)/(S1S2) + (PS2T2)/(S1S2) = (PT1)/S2 + (PT2)/S1

(4) By the Euclidean Division Algorithm for Polynomials (see Theorem, here), there exists U1, U2, R1, R2 such that:

PT1 = S2U2 + R2

where deg R2 is less than deg S2

PT2 = S1U1 + R1

where deg R1 is less than deg S1

(5) Replacing PT1 and PT2 in step #3 gives us:

P/Q = (S2U2 + R2)/S2 + (S1U1 + R1)/S1 =

= (S2U2)/S2 + R2/S2 + (S1U1)/S1 + R1/S1 =

= (U2 + U1) + R1/S1 + R2/S2

QED

Lemma 2:

For polynomials Q,P:

If Q is irreducible and deg P is less than deg Qm, then:

P/Qm = P1/Q + P2/Q2 + ... + Pm/Qm

where deg Pi is less than deg Q for i = 1, ..., m

Proof:

(1) Using Euclidean Division for Polynomials (see Theorem, here), there exists P1,R1 such that:

P = P1Qm-1 + R1 where deg R1 is less than deg Qm-1

(2) deg P1 is less than deg Q since:

(a) Assume deg P1 ≥ deg Q

(b) Then P1Qm-1 is greater than deg Qm

(c) But this is impossible since deg P is less than deg Qm and since P1Qm-1 is less than P.

(d) So we reject our assumption at step #2a.

(3) We can now repeat this same step with P2 as a quotient for Qm-2 and so on.

(4) Eventually, we get to:

P = P1Qm-1 + P2Qm-2 + ... + Pm-1Q + Pm

(5) If we now divide both sides by Qm, we get:

P/Qm = P1/Q + P2/Q2 + ... + Pm-1/Qm-1 + Pm/Qm

QED

References

Tuesday, November 27, 2007

Derivative of tan x

Theorem: Derivative of tan x = sec2x

Proof:

(1) d(sin x)/dx = cos x [See Theorem 1, here]

(2) d(cos x)/dx = -sin x [See Theorem 2, here]

(3) Now, tan x = (sin x)/(cos x)

(4) Let f(x) = sin x, let g(x) = cos x

(5) If we assume that x ≠ ± Ï€/2 and x is between Ï€/2 and -Ï€/2, then we can see that f(x), g(x) is differentiable for all values of x and further that g(x) ≠ 0. [See here for review of cos, sin if needed]

(6) From step #5, we can now apply the Quotient Rule (see Lemma 6, here) to get:

d(tan x)/dx = [f'(x)g(x) - f(x)g'(x)]/[g(x)]2 = [cos x*cos x - sin x*(- sin x)]/[cos x]2 = [cos2 x + sin2 x]/(cos2 x)

(7) Now, using sin2x + cos2x = 1 [See Corollary 2, here], we have:

d(tan x)/dx = 1/(cos2 x)

(8) Since sec x = 1/cos x, we now have:

d(tan x)/dx = sec2 x

QED

Monday, November 26, 2007

Inverse Tangent

Definition 1: Inverse Tangent

y = tan-1x if and only if tan y = x where -π/2 is less than y is less than π/2.

Lemma: sec2x = 1 + tan2 x

Proof:

(1) By definition sec x = 1/cos x

(2) So, sec2x = 1/(cos2 x)

(3) sin2x + cos2x = 1 [See Corollary 2, here]

(4) 1/(cos2 x) = (cos2x + sin2x)/(cos2) =

= cos2x/(cos2x) + (sin2x)/(cos2) = 1 + tan2x.


QED

Theorem: D tan-1 x = 1/(1 + x2)

Proof:

(1) Let y = tan-1 x

(2) Then tan y = x [See Definition 1 above]

(3) d(tan y)/dx = d(x)/dx = 1

(4) d(tan y)/dx = (sec2y)dy/dx = 1 [See Theorem 1, here]

(5) dy/dx = 1/sec2y

(6) Using Lemma 1 above, we have:

dy/dx =1/(1 + tan2y)

(6) Using step #2, this gives us:

dy/dx = 1/(1 + tan2y) = 1/(1 + x2) [Since x = tan y]

QED

References

Sunday, October 14, 2007

A Shorter Proof of Euler's Formula

Euler's Formula is the very famous equation:

eix = cos x + isin x

In a previous blog, I showed how it can be derived using the Taylor Series. In today's blog, I will show how it can be derived in an even simpler way using concepts from calculus.

Theorem: Euler's Formula

eix = cos x + isin x

Proof:

(1) For some number x:

let y = cos x + isin x

(2) Taking the first derivative of both sides gives us:

dy/dx = -sin x + icos x

[For details if needed:

(a) dy/dx = d(cos x + isin x)/dx

(b) d(cos x + isin x)/dx = d(cos x)/dx + d(isin x)/dx (see Lemma 3, here)

(c) d(cos x)/dx = -sin x (see Theorem 2, here)

(d) d(isin x)/dx = icos x (see Theorem 1, here) ]

(3) Since i2 = -1 by definition, we have

-sin x + icos x = i(isin x + cos x)

(4) Combining step #1, #2, and #3, we get:

dy/dx = iy

[Since:

(a) y = isin x + cos x [from step #1 above]

(b) dy/dx = -sin x + icos x [from step #2 above]

(c) dy/dx = i(isin x + cos x) [from step #3 above]

(d) dy/dx = i(y)

]

(5) If we multiply (dx/y) to each side, we get:

(1/y)dy = (i)dx

(6) Now, if we take the integral of each side we get:

ln y = ix

[For details if needed:

(a) d(ln x)/dy = 1/x [See Lemma 1, here for proof]

(b) So using the Fundamental Theorem of Calculus (see Theorem 2, here), we know that:

∫ (1/x)dx = ln x + C

(c) So, ∫ (1/y)dy = ln y + C

(d) ∫ i(dx) = ix + C [Since d(ix + C)/dx = i, see Lemma 2, here]

(e) So putting this together gives us:

ln y = ix

]

(7) Now putting e the power of both sides, we get:

y = eix [Since eln y = y]

(8) Now combining step #1 with step #7 we get:

eix = cos x + isin x

QED

References

Monday, September 17, 2007

Cramer's Rule

In today's blog, I go over the classic result known as Cramer's Rule.

Theorem: Cramer's Rule

Let an n x n matrix A represent a system of linear equations such that AX = B.

Then it follows that if Det(A) ≠ 0, then X has only one unique solution and xi = det(A(Ci ↔ B))/det(A).

Proof:

(1) Assume that Det(A) ≠ 0

(2) Then A is invertible [see Theorem 4, here]

(3) Now, A-1B is a unique solution to AX = B [See Lemma 2, here]

(4) From Corollary 4.1, here, we have:

X = A-1B = 1/det(A)(adj A)B

(5) Therefore, we have:

xi = 1/det(A)*∑(j=1,n) enti,j(adj A))bj

[See Definition 1 here for definition of matrix multiplication if needed]

= 1/det(A)*∑(j=1,n) bj*(cofj,i(A)) [See Definition 1 here for definition of adj A]

(6) Using Corollary 4.1 here, this gives us:

1/det(A)*∑(j=1,n)bj*(cofj,i(A)) = 1/det(A)*det(A(Ci ↔ B)).

(7) Putting it all together gives us:

xi = 1/det(A)*det(A(Ci ↔ B)).

QED

References

Sunday, September 16, 2007

Homogeneous System of Linear Equations

In today's blog, I talk about homogeneous systems of linear equations. In essence, this extends a previous blog on systems of linear equations and their representation through matrices.

If you are not familiar with how matrices can represent systems of linear equations, start here. If you are not familiar with reduced echelon form, start here. If you are not familiar with the idea that each matrix is uniquely correlated with a specific matrix in reduced echelon form, start here.

Let Ax = b be a system of linear equations. This system is called homogeneous if and only if b=0.

In other words:

Definition 1: Homogeneous System of Linear Equations

Let Ax = b be a system of linear equations. This system of equations is called a homogeneous system of linear equations if and only if b = 0.

The important idea behind homogeneous systems of linear equations is that they always have at least one solution which is called the trivial solution.

For all matrices M, it is clear that if X is a vector of 0's, then MX = 0 where 0 is a vector of 0's. In other words, we are stating M0 = 0.

Definition 2: Trivial Solution of a Homogeneous System of Linear Equations

If MX=0 is a homogeneous system of linear equations, then it is clear that 0 is a solution. Since 0 is a solution to all homogeneous systems of linear equations, this solution is known as the trivial solution.

A nontrivial solution of a homogeneous system of linear equations is any solution to MX=0 where X ≠ 0.

The main conclusion that I will establish in today's blog is that homogeneous system of linear equations will have a nontrivial solution if and only if its determinant is 0.

To establish this point, let's consider some lemmas:

Lemma 1:

Let A be an n x n matrix that represents a homogeneous system of linear equations.

If A has an all-zero column, then there exists a nontrivial solution.

Proof:

(1) Assume that A has an all-zero column at j such that:

A =



so that for all i, ai,j=0.

(2) Let X =



such that for all i ≠ j, xi,1=0 and xj,1 = 1 [Note: xj,1 can equal any nonzero value and the argument will still hold]

(3) Let B=AX

(4) For any row i, Rowi(B) = ai,1*0 + ai,2*0 + ... + 0*xj,1 + ... + 0*ai,n = 0

(5) So, the X defined above is a nontrivial solution.

QED

Definition 3: Free Entry

For any nonzero row, a free entry is any nonzero entry that follows the leading entry, that is, the the free entry and the leading entry are in the same row but the free entry is a column that comes after the leading entry column.

Example 1: Leading Entry

Let A =



A is in reduced echelon form. In this case, row 1 has a leading entry at column 1 and a free entry at column 3. Row 2 has a leading entry at column 2 and a free entry at column 3. Row 3 does not have a leading entry or a free entry.

Lemma 2:

Let A be an n x n matrix in reduced echelon form that represents a homogeneous system of linear equations.

If A has a free entry, then there exists a nontrivial solution.

Proof:

(1) We can assume that A does not have any zero columns. If it did, then it would have a nontrivial solution by Lemma 1 above.

(2) By definition of reduced echelon form (see Definition 2, here), each column is either a leading entry for a row (one row has a 1 at this column and all other rows have a 0) or a free entry for a row (the column is never a leading entry for any row).

(3) We can now build a vector X in the following way:

(a) If a column j is a free entry (that is, it has at least one free entry in its column), then let Xj,1 = 1.

(b) If a column j has a leading entry at row t, then let Xj,1 = -(at,j+1 + ... + at,n)

NOTE: By the definition of reduced echelon form, if any row of a column is a leading entry, then all other rows of that column are 0.

(4) Let B = AX.

(5) Then, for any nonzero row i in B:

(a) Let li(i) be the column which is the leading index for row i.

(b) Rowi(B) = ai,1*x1 + ... + ai,li(i)*[-(ai,li(i)+1 + ... + ai,n)] + ... + ai,n*xn

(c) Now, for row i, we know that all columns before li(i) are zero and the entry at li(i)=1, so we have:

Rowi(B) = 0*x1 + ... + ai,li(i)*[-(ai,li(i)+1 + ... + ai,n)] + ... + ai,n*xn =

= 1*[-(ai,li(i)+1 + ... + ai,n)] + ... + ai,n*xn

(d) For any column j that is a leading entry for another row, ai,j=0 (from the definition of reduced echelon form), so we know that the sum for all columns after li(i) are:

ai,li(i)+1 + ... + ai,n

[Note: This is because either ai,j=0 or xj=1 when ai,j is nonzero]

(e) So, we get that:

Rowi(B) = -(ai,li(i)+1 + ... + ai,n) + (ai,li(i)+1 + ... + ai,n) = 0

(6) We know that X is not the trivial solution since, by assumption, we assumed that A has a free entry. It therefore follows that X ≠ 0 [since if j is the column with the free entry, then xj=1]

(7) Therefore, it follows that A has a nontrivial solution.

QED

Lemma 3:

If an n x n matrix A is in reduced echelon form and A has n nonzero rows, then A = In.

Proof:

(1) If A has n nonzero rows, then each row has a leading entry.

(2) This means that all n columns must have a leading entry so that for each row, there is only nonzero entry, the leading entry which is 1.

(3) But, we also know that the first row must have a leading entry in the column before the leading entry of the second row and so on.

(4) The only way that this can occur is if the leading for row 1 is in column 1 and the leading entry for row 2 is in column 2 and so on since this is the only way to order the n columns which make up the n leading entries.

(5) Thus, A must equal In (see Definition 1, here for definition of the Identity Matrix).

QED

Lemma 4:

Let A be an n x n matrix in reduced echelon form that represents a homogeneous system of linear equations.

If A has n nonzero rows, then there is only one solution: the trivial solution.

Proof:

(1) If A has n nonzero rows and A is in reduced echelon form, then by Lemma 3 above, A = In.

(2) But then we have:

InX = 0

(3) By the definition of the In (See Definition 1, here), we know that InX = X

(4) Thus we have:

0 = InX = X

QED

Lemma 5:

If an n x n matrix A is in reduced echelon form and A has a zero row, then A has a nontrivial solution.

Proof:

(1) If A has a zero row, then then there are at most n-1 columns with leading entries and one column cannot have a leading entry since:

(a) Assume that all columns have a leading entry.

(b) Then there are necessarily n leading entries

(c) But then since each leading entry must be on its own row, there must be n nonzero rows.

(d) But this is impossible since there is at least one zero row so we have a contradiction and we reject our assumption in (a).

(2) But if a column does not have a leading entry, then it is necessarily an all-zero column or it contains free entries.

(3) If it is an all zero column, the A has a nontrivial solution by Lemma 1 above. If it contains free entries, then A has a nontrivial solution by Lemma 2 above. Either way, A has a nontrivial solution.

QED

Theorem 6:

An n x n matrix that represents a homogeneous system of linear equations has a nontrivial solution if and only if its determinant = 0

Proof:

(1) Assume that a matrix has a nontrivial solution.

(2) Assume that its determinant ≠ 0

(3) By Cramer's Rule (see Theorem, here), if determinant ≠ 0, then the matrix has a unique solution. But if matrix has a unique solution, then it does not have a nontrivial solution (since we know that the trivial solution must be this unique solution).

(4) Therefore we have a contradiction and we reject our assumption at step #2 and conclude that determinant = 0.

(5) Assume that det(A) = 0

(6) Then it follows that A is not invertible. [See Theorem 4, here]

(7) Since every matrix has a reduced echelon form (see Theorem 1, here), let Ar be the reduced echelon form for A.

(8) Since A is not invertible, Ar cannot be invertible since:

(a) Assume that Ar is invertible.

(b) Since A is row equivalent Ar, there exists a matrix P that is a product of elementary matrices such that A = PAr [See Theorem 5, see here]

(c) Since each of the elementary matrices are invertible [see Lemma 4, here] and P is the product of elementary matrices, P is invertible. [See Corollary 3.1, here]

(d) Since P is invertible and Ar is invertible by assumption, it follows that A must be invertible [See Corollary 3.1, here]

(e) But A is not invertible so we have a contradiction and we reject our assumption in step #8a.

(9) Since Ar is not invertible, it is not equal to In (since In is invertible to itself), and Ar must have a zero row since:

(a) Assume Ar did not have a zero row.

(b) Then Ar = In [See Lemma 3, above]

(c) But Ar ≠ In so we have a contradiction and we reject our assumption at step #9a.

(10) But if Ar has a nonzero row, then Ar has a nontrivial solution. [See Lemma 5 above]

(11) And if Ar has a nontrivial solution, then A has a nontrivial solution since:

(a) By from the properties of row equivalence, Ar = PA [See Theorem 5, here]

(b) Let X be a nontrivial solution for Ar such that:

ArX = 0

(c) Then applying step #11a, we have:

ArX = PAX = 0

(d) Now P is invertible (see step #8c above), so we can multiply P-1 to both sides to get:

P-1PAX = AX = P-10=0

(e) So, if X is a nontrivial solution for Ar, it is also necessarily a nontrivial solution for A.

QED

Corollary 6.1:

An n x n matrix that represents a homogeneous system of linear equations has only the trivial solution if and only if its determinant ≠ 0

Proof:

(1) Assume that det(A)≠0

(2) Then A is invertible. [See Theorem 4, here]

(3) So, we have:

A-1AX = A-10

(4) Now, A-1AX = InX = X [See Definition 1, here]

(5) Likewise A-10 = 0

(6) So, X = 0

(7) Assume that the only solution is the trivial solution.

(8) Assume that Det(A) = 0

(9) But by Theorem 6 above, AX=0 must have a nontrivial solution.

(10) But this is a contradiction so we reject our assumption in step #8.

(11) Therefore Det(A) ≠ 0

QED