Monday, September 11, 2006

Absolute value for complex powers

In today's blog, I show the details behind the following result:

if s is a complex number, n is a positive real, and a = Re(s), then abs(n^s) = n^a

Definition 1: complex number

Any value expressible in the form a + bi where a,b are real numbers (see here for definition of real numbers) and i^2 = -1.

It should be pointed out that cos x + i sin x fits the definition, with a = cos x and b = sin x.

Definition 2: conjugate of a complex number

The conjugate of a complex number a + bi is the complex number a - bi. The conjugate of a complex number a - bi is the complex number a + bi.

If s is a complex number, its conjugate is denoted s̄ (s with a bar over it) or s'.

Definition 3: absolute value for complex numbers

abs(a + bi) = √(a^2 + b^2)

This definition may seem strange at first, but it is a natural generalization of the concept of absolute value for real numbers. One of the key functions of the absolute value for real numbers is that it gives the distance between two real values: the distance between two real numbers x,y is abs(x - y), regardless of whether x or y is larger.

Subtraction measures distance on a line. With complex numbers, we are dealing with a plane, and in this situation distance comes from the Pythagorean Theorem (c^2 = a^2 + b^2): the distance between two points (x_1,y_1) and (x_2,y_2) is built from the two absolute values abs(x_1 - x_2) and abs(y_1 - y_2), namely √(abs(x_1 - x_2)^2 + abs(y_1 - y_2)^2). [See David Joyce's article for more details]
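For readers who like to check definitions numerically, here is a small Python sketch (my own illustration, not part of the original argument; the function names and sample values are arbitrary) that computes abs(a + bi) directly from Definition 3 and compares it with the plane-distance formula just described:

import math

# abs(a + bi) = sqrt(a^2 + b^2), per Definition 3
def complex_abs(a, b):
    return math.sqrt(a**2 + b**2)

# Distance between two points (x1, y1) and (x2, y2) in the plane
def distance(x1, y1, x2, y2):
    return math.sqrt(abs(x1 - x2)**2 + abs(y1 - y2)**2)

print(complex_abs(3, 4))       # 5.0
print(distance(0, 0, 3, 4))    # 5.0, the same value: the distance from 0 to 3 + 4i
print(abs(complex(3, 4)))      # Python's built-in abs agrees: 5.0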

Definition 4: Re(s)

Re(s) is the real portion of a complex number. If s = a + bi is a complex number, then Re(s) = a.

Definition 5: Im(s)

Im(s) is the imaginary portion of a complex number. If s = a + bi is a complex number, then Im(s) = b.

Lemma 1: abs(cos x + i sin x) = 1

Proof:

(1) abs(cos x + i sin x) = √(cos^2(x) + sin^2(x))

(2) Now, cos^2(x) + sin^2(x) = 1. [See here for details]

(3) So abs(cos x + i sin x) = √1 = 1.

QED

Theorem 2: The absolute value of complex power

If s is a complex number and n a positive real and a = Re(s), then:

abs(n^s) = n^a

Proof:

(1) There exist real numbers a,b such that s = a + bi [See definition of a complex number above]

(2) n^s = n^(a + bi) = e^[(a + bi)ln(n)] [See here for review of ln and e]

(3) e^[(a + bi)ln(n)] = e^[a·ln(n) + (bi)ln(n)] = e^[a·ln(n)]*e^[(bi)ln(n)] = (n^a)*e^[(bi)ln(n)]

(4) Using Euler's Formula, we have:

e^[(bi)ln(n)] = cos(b·ln(n)) + i·sin(b·ln(n))

(5) Let's clean it up by letting x = b·ln(n) so that we have:

cos(b·ln(n)) + i·sin(b·ln(n)) = cos(x) + i·sin(x).

(6) So putting it all together, we have:

n^s = (n^a)(cos(x) + i·sin(x))

(7) Since n is a positive real and a is a real, we know that abs(n^a) = n^a

So that:

abs((n^a)(cos(x) + i·sin(x))) = n^a*abs(cos(x) + i·sin(x))

(8) Using Lemma 1 above, we now have:

abs(n^s) = n^a where a = Re(s).

QED
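Here is a quick Python check of Theorem 2 (an illustration only; the particular n and s below are arbitrary choices):

n = 7.0                      # any positive real
s = complex(0.5, 14.1347)    # any complex number s = a + bi
lhs = abs(n ** s)            # abs(n^s)
rhs = n ** s.real            # n^Re(s)
print(lhs, rhs)              # the two values agree up to floating-point rounding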

Lemma 3: Triangle Inequality for Complex Numbers

If s,t are complex numbers, abs(s + t) ≤ abs(s) + abs(t)

Proof:

(1) There exists a,b,c,d (see Definition of complex number above) such that:

s = a + bi
t = c + di

(2) abs(s + t) = abs((a+c) + (b+d)i) = √((a+c)^2 + (b+d)^2) [See Definition of absolute value above for complex numbers]

(3) abs(s + t)^2 = (a+c)^2 + (b+d)^2 = [(a + c) + (b + d)i][(a + c) - (b + d)i] = [s + t][s + t]' [See Definition 2 above for details on conjugates of complex numbers]

(4) [s + t]' = (a + c) - (b + d)i = (a - bi) + (c - di) = s' + t'

(5) abs(s + t)2 = [s + t][s' + t'] = ss' + [st' + s't] + tt'

(6) We can also see that:

ss' = (a + bi)(a - bi) = a^2 + b^2 = abs(s)^2

tt' = (c + di)(c - di) = c^2 + d^2 = abs(t)^2

st' + s't = (a + bi)(c - di) + (a - bi)(c + di) = ac - adi + bci + bd + ac + adi - bci + bd = 2ac + 2bd

s't = (a - bi)(c + di) = (ac + bd) - (bc - ad)i, so Re(s't) = ac + bd and st' + s't = 2Re(s't)

(7) So, ss' + [st' + s't] + tt' = abs(s)^2 + 2Re(s't) + abs(t)^2

(8) 2*abs(s't) ≥ 2Re(s't) since:

(a) abs(s't) = √([(ac + bd) - (bc - ad)i][(ac + bd) + (bc - ad)i]) = √((ac + bd)^2 + (bc - ad)^2)

(b) Re(s't) = ac + bd ≤ √((ac + bd)^2)

(c) Since (bc - ad)^2 ≥ 0, we have √((ac + bd)^2 + (bc - ad)^2) ≥ √((ac + bd)^2) ≥ ac + bd, so 2*abs(s't) ≥ 2*Re(s't)

(9) So that we have:

abs(s + t)^2 ≤ abs(s)^2 + 2*abs(s't) + abs(t)^2

(10) abs(s't) = abs(s)*abs(t) since:

abs(s)*abs(t) = √(a^2 + b^2)*√(c^2 + d^2) = √(a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2)

abs(s't) = √((ac + bd)^2 + (bc - ad)^2) = √(a^2c^2 + 2abcd + b^2d^2 + b^2c^2 + a^2d^2 - 2abcd) = √(a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2)

(11) This gives us:

abs(s + t)^2 ≤ abs(s)^2 + 2*abs(s)*abs(t) + abs(t)^2 = (abs(s) + abs(t))^2

(12) Taking square roots of both sides gives us:

abs(s + t) ≤ abs(s) + abs(t) since abs(s + t) and abs(s) + abs(t) are both nonnegative.

QED
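A quick numerical spot-check of Lemma 3 in Python (purely illustrative; the sampling range and tolerance are arbitrary):

import random

# Check abs(s + t) <= abs(s) + abs(t) on randomly chosen complex numbers
for _ in range(1000):
    s = complex(random.uniform(-10, 10), random.uniform(-10, 10))
    t = complex(random.uniform(-10, 10), random.uniform(-10, 10))
    assert abs(s + t) <= abs(s) + abs(t) + 1e-12   # small tolerance for rounding
print("triangle inequality held on all samples")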


Saturday, September 09, 2006

Bolzano-Weierstrass Theorem

The Bolzano-Weierstrass Theorem is a very important theorem in the realm of analysis. It was first proved by Bernhard Bolzano, but it became well known through the proof by Karl Weierstrass, who did not know about Bolzano's proof. In light of this history, the theorem gets its current name. I use it in my proof of the Cauchy Criterion, which is also an important theorem in analysis.

It says:

every bounded sequence has a convergent subsequence

To make heads or tails of this, it is necessary to offer some definitions of bounded sequence and convergent subsequence. Since this is one of the subtleties of analysis, I will attempt to provide a complete formal definition. Interspersed, I will provide some comments that will make the ideas clear even if you wish to skip some of the formalisms.

Let's start with sequence. This one is a tougher nut to crack than it might seem. From one perspective, a "sequence" is just a set of numbers, but from another perspective, this definition is too limiting. For example, what about a sequence of points on a line?

To apply to all these domains, we will use an abstraction called a metric space and to define this, we must first define a metric.

Definition 1: metric

A metric is a binary function d(a,b) on a set X such that for any a,b,c ∈ X:

(a) d(a,b) is a real number.

(b) d(a,b) ≥ 0

(c) d(a,b) = 0 if and only if a = b

(d) d(a,b) = d(b,a)

(e) d(a,c) ≤ d(a,b) + d(b,c)

In other words, a metric is a distance function. For real numbers, the standard metric is the absolute value of the difference: the distance between any two numbers x,y is abs(x - y).

Definition 2: Metric Space

A metric space is a pairing of a set X with a metric, that is, a distance function. Any element in X can be thought of as a point in the metric space.

From this perspective, all the number systems (integers, reals, rationals) are examples of metric spaces as are geometric coordinates (lines and planes).

Definition 3: Sequence

A sequence is a mapping from the set of natural numbers to the points of a given metric space.

In other words, there is a first element, a second element, a third element, etc.

For example:

the sequence of even numbers = (2, 4, 6, ...)

Definition 4: Bounded Sequence

A sequence (a_n) is bounded if there exists some number k such that abs(a_n) ≤ k for all elements in the sequence.

By this definition, a bounded sequence has both a least upper bound and a greatest lower bound.

OK, good. To continue on with formal definitions, we now need a definition of strictly increasing.

Definition 5: Strictly Increasing

A function f(x) is said to be strictly increasing if x is less than y implies f(x) is less than f(y).

I now use the definition of strictly increasing in the next definition.

Definition 6: Subsequence

If (a_n) is a sequence, then a subsequence is (a_(n_r)) where (n_r) is a strictly increasing sequence of natural numbers.

The main idea here is this: a sequence is a mapping of the set of natural numbers to points in a metric space, and a subsequence is a mapping of the set of natural numbers to a subset of the natural numbers and then to points in the metric space.

Here's an example. Let's say we have the sequence of natural numbers = (1,2,3,4,...)

In this case, we have a mapping from the set of natural numbers to the set of natural numbers. The first element is 1, the second element is 2, and so on. Remember, the set of natural numbers is a metric space, that is, they are a set combined with a distance function for any two elements from that set.

Now, let's say we have the sequence of odd numbers, which is a subsequence of the natural numbers, so that we have (1,3,5,7,9,...)

This is a mapping from the natural numbers to the natural numbers to a metric space in the sense that the first element of the odd numbers, 1, maps to the first element of the sequence of natural numbers, which maps to 1. The second element of the odd numbers, 3, maps to the third element of the sequence of natural numbers, which maps to 3.

The main idea here is that a subsequence maintains the order of the sequence but deletes some of the elements. So, all subsequences are sequences in themselves but they also have a mapping to the elements of another sequence.

Definition 7: Convergent Sequence

A sequence (a_n) in the metric space (X,d) is said to be convergent if there exists a point x ∈ X such that:

for every real number ε greater than 0, there exists a natural number N such that if n is greater than N, then d(a_n,x) is less than ε.

The notation for this is:

lim (a_n) = x.

In other words, if we are talking about a sequence of real numbers, where the distance function d(a,b) is just abs(a - b) and the metric space is just the set of real numbers, then this says that a convergent sequence has a limit (a real number) such that we can pick a point in the sequence beyond which all remaining elements of the sequence are arbitrarily close to the limit. In other words, the sequence is said to converge to this limit.

For the proof below, we also need:

Definition 8: Monotone Sequence

A sequence (s_n) is monotonic if it meets one of the following four conditions:

(1) Monotonically Increasing: i greater than j → s_i greater than s_j

(2) Monotonically Decreasing: i greater than j → s_i less than s_j

(3) Monotonically Nondecreasing: i greater than j → s_i ≥ s_j

(4) Monotonically Nonincreasing: i greater than j → s_i ≤ s_j

The main idea behind a monotone sequence is that it moves in a consistent direction. It is either gradually increasing, gradually decreasing, or just staying the same without ever regressing.

Now, with these definitions aside, we are ready to start on the proof of the Bolzano-Weierstrass Theorem. We will also need some lemmas to begin.

Lemma 1: All bounded monotone sequences converge.

Proof:

(1) Assume that S is a bounded nondecreasing sequence s_1, s_2, s_3, ... (We will be able to make an analogous argument for bounded nonincreasing sequences, so we only need to prove this one case.)

(2) Let b = sup S

NOTE: sup S means the supremum for the sequence S which is the least upper bound for the sequence S. We know that this exists since S is bounded.

For nonincreasing sequences, we would let b = inf(S) where inf(S) is the infimum for the sequence S which is the greatest lower bound for the sequence S.

(3) Let ε be any positive number.

(4) Then, there is some N such that s_N is greater than b - ε

We know that this exists otherwise b would not be the least upper bound since b - ε is less than b. We can make an analogous argument using the greatest lower bound for a bounded nonincreasing sequence.

(5) Since S is nondecreasing, for all n ≥ N, s_n ≥ s_N, which is greater than b - ε. [See Definition of monotone sequence above]

(6) S is also bounded by b so we have:

b ≥ s_n, which is greater than b - ε

(7) But this implies that abs(s_n - b) is less than ε since:

if we subtract b from all sides then we get:

0 ≥ s_n - b, which is greater than -ε

Since ε is greater than 0, we have:

s_n - b is between -ε and 0, so abs(s_n - b) is less than ε.

(8) So, by the definition of a convergent sequence, we have:

lim(s_n) = b.

QED

Lemma 2: Every sequence has a monotonic subsequence

Proof:

(1) Let us call a term si in a sequence S dominant if it is greater than all the terms that follow it.

(2) It is clear that each sequence either has a finite number of dominant terms or an infinite number of dominant terms.

(3) Let us start by handling the case where a sequence S has an infinite number of dominant terms.

(4) We can now form a subsequence that only includes these dominant terms. [See definition of subsequence above]

(5) Then for every s_(n_k) that makes up the subsequence, s_(n_k) is greater than s_(n_(k+1)), which means that the subsequence is a monotonically decreasing sequence. [See definition of monotone sequence above]

(6) Now, let's handle the case where there is a finite number of dominant terms.

(7) Let s_(n_1) be the first term after the last dominant term (there are only a finite number of dominant terms). If there aren't any dominant terms, then let s_(n_1) be the first term in the sequence.

(8) Now since s_(n_1) is not dominant, there must be a later term in the sequence which is greater than or equal to s_(n_1). We can call this term s_(n_2). But s_(n_2) is not dominant either, so there must be an s_(n_3), and so on.

(9) We have now constructed a monotonically nondecreasing subsequence. [See definition of monotone sequence above]

QED
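To make the dominant-term idea concrete, here is a small Python sketch (my own finite illustration, not part of the proof; the choice of a_n = sin(n) and the window size are arbitrary) that finds the terms of a finite window that are greater than everything after them and checks that they form a decreasing subsequence:

import math

# Look at the first N terms of a_n = sin(n); call a term "dominant" (within this
# window) if it is greater than every later term in the window.
N = 200
a = [math.sin(n) for n in range(1, N + 1)]
dominant_idx = [i for i in range(N) if all(a[i] > a[j] for j in range(i + 1, N))]
dominant = [a[i] for i in dominant_idx]

# Each dominant term exceeds everything after it, so in particular it exceeds the
# later dominant terms: the dominant terms form a strictly decreasing subsequence.
assert all(x > y for x, y in zip(dominant, dominant[1:]))
print(len(dominant), dominant[:3])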

Theorem: Bolzano-Weierstrass Theorem

Every bounded sequence has a convergent subsequence

Proof:

(1) Let (sn) be a bounded sequence.

(2) By Lemma 2 above, we know that it has a monotonic subsequence.

(3) Since (s_n) is bounded, this monotonic subsequence is also bounded, so by Lemma 1 above, it converges.

QED


Cauchy's Criterion

Cauchy's criterion is a well known criterion for when an infinite sum converges, that is, has a finite limit.

Augustin Cauchy was not the first to come up with the criterion. Leonhard Euler, for example, used a similar criterion. It may be that the significance of the criterion became appreciated in the context of Cauchy's great work on the foundations of calculus.

Definition 1: Cauchy Sequence

A sequence (s_i) is a Cauchy sequence if and only if given any positive number ε, there exists an integer N such that if m,n are greater than N, then abs(s_m - s_n) is less than ε

In other words, elements of the sequence get arbitrarily close to one another.
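Here is a small Python illustration of the definition (an example only; the sequence s_n = 1 + 1/2 + ... + 1/2^n and the value of ε are arbitrary choices): once n is large enough, any two later terms differ by less than ε.

eps = 1e-3
s = [sum(0.5**k for k in range(n + 1)) for n in range(60)]   # s_n -> 2
N = next(n for n in range(60) if 0.5**n < eps)               # a workable cutoff
gaps = [abs(s[m] - s[n]) for m in range(N + 1, 60) for n in range(N + 1, 60)]
print(N, max(gaps))    # the largest gap beyond the cutoff is below eps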

I will need a few properties of absolute inequalities:

Lemma 1: absolute(a - b) ≤ absolute(a) + absolute(b)

Proof:

(1) Case I: a - b is nonnegative

So abs(a-b) = a - b

If b ≥ 0, then a - b ≤ a ≤ abs(a) + abs(b)

If b is less than 0, then a - b = a + abs(b) ≤ abs(a) + abs(b)

(2) Case II: a - b is negative

So abs(a-b) = -(a-b) = b - a = abs(b - a)

Using step #1, we know that abs(b - a) ≤ abs(b) + abs(a) = abs(a) + abs(b)

So that:

abs(a - b) ≤ abs(a) + abs(b)

QED

Lemma 2: abs(a) - abs(b) ≤ abs(a - b)

Proof:

(1) Case I: a - b is nonnegative so that abs(a - b) = a - b

If a ≥ 0, then abs(a - b) = a - b ≥ abs(a) - abs(b)

NOTE: It is = except for the case where b is negative.

If a is less than 0, then b is also less than 0 with abs(b) ≥ abs(a), so abs(a) - abs(b) ≤ 0 ≤ abs(a - b).

(2) Case II: a - b is negative and abs(a - b) = -(a - b) = b - a

If a is ≥ 0, then abs(b) is greater than abs(a), so abs(a) - abs(b) is negative and therefore ≤ abs(a - b).

If a is less than 0 and b is less than 0, then b - a = b + abs(a) = abs(a) - abs(b) so that abs(a - b) = abs(a) - abs(b)

If a is less than 0 and b ≥ 0, then b - a = b + abs(a) = abs(a) + abs(b) ≥ abs(a) - abs(b).

QED

Here are some properties of Cauchy Sequences which I will use below:

Lemma 3: Any convergent sequence is a Cauchy sequence

Proof:

(1) Let (a_i) be a convergent sequence (that is, as i gets larger, a_i approaches a limit) with limit L. [See Definition 7, here, for the definition of a convergent sequence]

(2) So from the above definition, we know for any positive number ε, there exists a positive number N such that:

if n is greater than N, then abs(a_n - L) is less than ε

(3) So, for the value (1/2)ε, there exists an integer N such that if n is greater than N, abs(a_n - L) is less than (1/2)ε

(4) So let's assume that we have two integers m,n both greater than N.

(5) This means that in both cases abs(a_m - L) is less than (1/2)ε and abs(a_n - L) is less than (1/2)ε

(6) This gives us:

abs(a_m - a_n) = abs([a_m - L] - [a_n - L]) ≤ abs(a_m - L) + abs(a_n - L) [See Lemma 1 above]

(7) Finally:

abs(a_m - a_n) ≤ abs(a_m - L) + abs(a_n - L), which is less than (1/2)ε + (1/2)ε = ε

QED

Lemma 4: A Cauchy Sequence has a bound

Proof:

(1) Let (ai) be a Cauchy Sequence.

(2) Then, for any positive number ε greater than 0, there is an integer N such that:

for any integers m,n ≥ N, abs(a_n - a_m) is less than ε [See Definition of a Cauchy Sequence above]

(3) So, if we set ε = 1 (since ε can be any positive number), we have:

abs(a_n) - abs(a_m) ≤ abs(a_n - a_m), which is less than 1, for all n,m ≥ N. [See Lemma 2 above]

(4) Let m = N (since m can be any integer ≥ N), then we have:

abs(a_n) - abs(a_N) is less than 1 which means that:

abs(a_n) is less than abs(a_N) + 1 for n ≥ N.

(5) Now, let n = N (since n can be any integer ≥ N), then we have:

abs(a_N) - abs(a_m) is less than 1 which means that:

abs(a_m) is greater than abs(a_N) - 1 for all m ≥ N.

(6) So, for all n, we have:

abs(a_n) ≤ max { abs(a_1), ..., abs(a_(N-1)), abs(a_N) + 1 }

and

abs(a_n) ≥ min { abs(a_1), ..., abs(a_(N-1)), abs(a_N) - 1 }

(7) This shows that the sequence is bounded: for every n, abs(a_n) ≤ max { abs(a_1), ..., abs(a_(N-1)), abs(a_N) + 1 }, and this maximum serves as the required bound.

QED

Lemma 5: If a Cauchy Sequence has a subsequence convergent to b, then the Cauchy sequence itself converges to b.

Proof:

(1) Let (a_n) be a Cauchy sequence with a subsequence (a_(i_n)) convergent to b.

(2) Let ε be any positive number. By the definition of convergence, we know that for the positive number ε/2, there exists an integer M such that for all n ≥ M, abs(a_(i_n) - b) is less than ε/2.

(3) By the definition of a Cauchy sequence, we know that there exists an integer n_0 such that for all m,n ≥ n_0, abs(a_n - a_m) is less than ε/2.

(4) Now, if i_M (this is where the convergent tail of the subsequence starts) is less than n_0, we can always find an M' which is greater than M such that i_(M') ≥ n_0.

We can assume this since we are assuming an infinite subsequence.

(5) So for all n ≥ n_0, we have:

abs(a_n - b) = abs(a_n - a_(i_M') + a_(i_M') - b) ≤ abs(a_n - a_(i_M')) + abs(a_(i_M') - b) [See Lemma 1 above]

(6) Now, abs(a_n - a_(i_M')) + abs(a_(i_M') - b) is less than ε/2 + ε/2 = ε

We know that abs(a_n - a_(i_M')) is less than ε/2 from the definition of a Cauchy sequence.

We know that abs(a_(i_M') - b) is less than ε/2 from the definition of the convergent subsequence.

(7) Now putting this all together gives us that:

abs(a_n - b) is less than ε

which by definition (see Definition 7, here) means that lim(a_n) = b.

QED

Lemma 6: Every real Cauchy sequence is convergent.

Proof:

(1) By Lemma 4 above, every Cauchy sequence is bounded.

(2) So, by the Bolzano-Weierstrass Theorem (see Theorem, here), every Cauchy sequence has a convergent subsequence.

(3) So, by Lemma 5 above, every Cauchy sequence is convergent.

QED

Lemma 7: A sequence of reals converges if and only if it is a Cauchy sequence

Proof:

(1) By Lemma 6 above, we know that a Cauchy sequence is convergent.

(2) By Lemma 3 above, we know that a convergent sequence is a Cauchy sequence.

QED

Here is the Criterion:

Theorem: Cauchy's Criterion

A series ∑ a_i is convergent (that is, has a finite limit) if and only if for every positive number ε, there exists a positive integer N such that:

for all n greater than N and p ≥ 1:

abs(a_(n+1) + a_(n+2) + ... + a_(n+p)) is less than ε

Proof:

(1) Let s_n = ∑ a_i, where i ranges from 0 to n (so s_n is the n-th partial sum of the series).

(2) (s_n) converges if and only if it is a Cauchy sequence. [See Lemma 7 above]

(3) Assume that (s_n) is a Cauchy sequence.

(4) Then, for every positive number ε, there exists a number N such that for all integers n,m greater than N, abs(s_m - s_n) is less than ε [See Definition 1 above]

(5) Let's assume that m is greater than n.

This is without loss of generality, since step #4 applies to every pair of integers m,n greater than N.

(6) We know that there exists an integer p ≥ 1 such that m = n + p

(7) Based on the definition of s_n, we can see that:

abs(s_m - s_n) = abs(s_(n+p) - s_n) = abs(a_(n+1) + a_(n+2) + ... + a_(n+p))

(8) Combining steps #2 through #7, we see that ∑ a_i is convergent if and only if the stated condition holds.

QED
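As a numerical illustration of the criterion (my own example; the series ∑ 1/k^2, the cutoff n, and the range of p are arbitrary), the block sums abs(a_(n+1) + ... + a_(n+p)) of a convergent series become uniformly small once n is large:

# Block sums for the convergent series sum of 1/k^2
def block_sum(n, p):
    return sum(1.0 / k**2 for k in range(n + 1, n + p + 1))

n = 1000
print(max(block_sum(n, p) for p in range(1, 2000)))   # roughly 1/n, well below e.g. eps = 0.01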


Thursday, August 31, 2006

sin x < x < tan x for x in (0,π/2)

In today's blog, I will show how it is possible to use a unit circle to establish that if x is greater than 0 and less than π/2, then sin x is less than x which is less than tan x.

Theorem: if x is in (0,π/2), then sin x is less than x is less than tan x

[Figure: a unit circle with center O; A and C are points on the circle, D is the foot of the perpendicular from C to OA, and B is the point where ray OC meets the tangent line to the circle at A; x denotes the angle ∠COD.]
Proof:

(1) Let O be a circle with radius = 1.

(2) Let ∠ CDO and ∠ BAO be right angles.

(3) Let x be the angle ∠COD

(4) We can see the following values apply to this diagram

cos x = adjacent/hypotenuse = OD/OC = OD/1 = OD

sin x = opposite/hypotenuse = CD/OC = CD/1 = CD

tan x = sin x/cos x = opposite/adjacent = AB/OA = AB/1 = AB

(5) Area of triangle OAC = (1/2)(base)(height) = (1/2)(OA)(CD) = (1/2)(1)(sin x) = (1/2)(sin x) [See Lemma 2, here for proof]

(6) Area of sector OAC = (1/2)(x)(radius)^2 = (1/2)(x)(1)^2 = (1/2)x [See Lemma 2, here for proof]

(7) Area of triangle OAB = (1/2)(base)(height) = (1/2)(OA)(AB) = (1/2)(1)(tan x) = (1/2)(tan x) [See Lemma 2, here for proof]

(8) By the diagram above, it is clear that the area of triangle OAB is greater than the area of sector OAC, which is greater than the area of triangle OAC.

(9) This then gives us that (1/2)(tan x) is greater than (1/2)(x) which is greater than (1/2)(sin x).

(10) Dividing all values by (1/2) gives us:

tan x is greater than x which is greater than sin x.

QED
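A quick Python spot-check of the inequality on a grid of points (illustration only; the grid size is arbitrary):

import math

# Check sin x < x < tan x at sample points in (0, pi/2)
for i in range(1, 200):
    x = i * (math.pi / 2) / 200
    assert math.sin(x) < x < math.tan(x)
print("sin x < x < tan x held at all sampled points")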


Tuesday, August 29, 2006

Cauchy's Mean Value Theorem

I use Cauchy's Mean Value Theorem in my proof of L'Hopital's Rule (proof to be added later). Augustin-Louis Cauchy used this theorem as part of his effort to make calculus more rigorous. For those interested in learning more about Cauchy's role in reworking the foundations of calculus, check out The Origins of Cauchy's Rigorous Calculus by Judith V. Grabiner.

Theorem: Cauchy's Mean Value Theorem

If f(x),g(x) are continuous functions on the closed interval [a,b] and differentiable on (a,b)

Then, there exists a number c in (a,b) such that:

[f(b) - f(a)]g'(c) = [g(b)-g(a)]f'(c)

Proof:

(1) Let us define a function h(x) such that:

h(x) = [g(b) - g(a)]*[f(x) - f(a)] - [f(b) - f(a)]*[g(x) - g(a)]

(2) h(a) = h(b) = 0 since:

h(a) = [g(b) - g(a)]*[f(a) - f(a)] - [f(b) - f(a)]*[g(a) - g(a)] = [g(b) - g(a)]*0 - [f(b) - f(a)]*0 = 0

h(b) = [g(b) - g(a)]*[f(b) - f(a)] - [f(b) - f(a)]*[g(b) - g(a)] = 0

(3) h(x) is continuous in the closed interval [a,b] since:

(a) Using the Constant Law (see Lemma 1, here), we know that we can treat the values f(a),f(b),g(a),g(b), -1 as continuous functions.

(b) Using the Multiplication Law (see Lemma 3, here), we can treat -g(a), -f(a) as continuous functions.

(c) Using the Addition Law (see Lemma 2, here), we know that the following are continuous functions:

g(b) + [- g(a)] = g(b) - g(a)
f(x) + [- f(a)] = f(x) - f(a)
f(b) + [-f(a)] = f(b) - f(a)
g(x) + [-g(a)] = g(x) - g(a)

(d) Using the Multiplication Law and the Addition Law, we can see that h(x) is a continuous function

Since h(x) = [g(b)-g(a)]*[f(x) - f(a)] - [f(b) - f(a)]*[g(x) - g(a)]

(4) h(x) is also differentiable on (a,b) since:

NOTE: The detail here is parallel to the detail in step #3.

(a) We know that all constants are differentiable (see Lemma 1, here) so this means that f(a),f(b),g(a),g(b), -1 are all differentiable on (a,b)

(b) We know that the product of differentiable functions is differentiable (see Lemma 4, here), so this means that -g(a), -f(a) are differentiable on (a,b)

(c) We also know that the sum of differentiable functions is differentiable (see Lemma 3, here), so this means that the following are all differentiable on (a,b):

g(b) + [- g(a)] = g(b) - g(a)
f(x) + [- f(a)] = f(x) - f(a)
f(b) + [-f(a)] = f(b) - f(a)
g(x) + [-g(a)] = g(x) - g(a)

(d) Finally, from the principles that we have already reviewed we know that h(x) is differentiable on (a,b) since:

h(x) = [g(b)-g(a)]*[f(x) - f(a)] - [f(b) - f(a)]*[g(x) - g(a)]

(e) We can now define h'(x)

Let u(x) = [g(b) - g(a)]*[f(x) - f(a)]

Let v(x) = [f(b) - f(a)]*[g(x) - g(a)]

u'(x) = [g(b) - g(a)][f'(x) - 0] + [0 - 0]*[f(x) - f(a)] = [g(b) - g(a)]f'(x) [See Lemma 4, here]

v'(x) = [f(b) - f(a)][g'(x) - 0] + [0 - 0]*[g(x) - g(a)] = [f(b) - f(a)]g'(x) [See Lemma 4, here]

h(x) = u(x) - v(x)

h'(x) = u'(x) - v'(x) = [g(b) - g(a)]f'(x) - [f(b) - f(a)]g'(x) [See Lemma 3, here]

(5) Using Rolle's Theorem (h is continuous on [a,b], differentiable on (a,b), and h(a) = h(b) = 0 by steps #2 through #4), we know that there exists a point c such that:

h'(c) = 0 and c in (a,b)

(6) Now, combining our result for h'(x) in step #4 with step #5, we have:

h'(c) = [g(b) - g(a)]f'(c) - [f(b) - f(a)]g'(c) = 0

(7) Adding [f(b) - f(a)]g'(c) to both sides gives us:

[g(b) - g(a)]f'(c) = [f(b) - f(a)]g'(c)

QED
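Here is a small Python sketch that locates such a point c numerically for one concrete pair of functions (the choices f = sin, g = cos, the interval, and the grid size are all arbitrary illustrations, not part of the theorem):

import math

f, fp = math.sin, math.cos                     # f and its derivative
g, gp = math.cos, lambda x: -math.sin(x)       # g and its derivative
a, b = 0.2, 1.3

def h_prime(c):
    # h'(c) = [g(b) - g(a)]f'(c) - [f(b) - f(a)]g'(c); the theorem says it vanishes somewhere in (a,b)
    return (g(b) - g(a)) * fp(c) - (f(b) - f(a)) * gp(c)

# Scan (a, b) for a sign change of h' and report an approximate c
cs = [a + (b - a) * k / 10000 for k in range(1, 10000)]
for c1, c2 in zip(cs, cs[1:]):
    if h_prime(c1) * h_prime(c2) <= 0:
        print("approximate c:", (c1 + c2) / 2)
        break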

Corollary: Mean Value Theorem

If f(x) is a continuous function on the closed interval [a,b] and differentiable on the open interval (a,b), then there exists a point c in (a,b) such that:

f(b) - f(a) = f'(c)(b-a)

Proof:

(1) Let g(x) = x

(2) Then g'(x) = 1

(3) Using Cauchy's Mean Value Theorem above, we see that there exists a value c such that:

[f(b) - f(a)]g'(c) = [g(b)-g(a)]f'(c)

(4) Since g'(c) = 1, we see that:

f(b) - f(a) = [g(b) - g(a)]f'(c)

(5) Since g(x)=x, we see that:

f(b) - f(a) = (b - a)f'(c)

QED


L'Hopital's Rule

In today's blog, I present a proof for L'Hopital's Rule which is also known as L'Hospital's Rule
which states that under certain circumstances, lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x). I use it for example in the proof that ∑ 1/n^2 = π^2/6 (Proof to be added later).

To prove this theorem, I will need to start with some lemmas that show that L'Hopital's Rule is true for specific cases.

Lemma 1: L'Hopital's Rule for 0/0 limits

Let f(x),g(x) be two functions that are differentiable in a deleted neighborhood (b,a) such that g'(x) is a nonzero, finite real number in that neighborhood, that is, when b is less than x is less than a.

If:

lim(x → a) f(x) = 0 and lim(x → a) g(x) = 0 and lim(x → a) f'(x)/g'(x) = a finite, real number

Then:

lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x)

Proof:

(1) Let f(x),g(x) be continuous functions such that f(a)=0, g(a)=0

(2) Using Cauchy's Mean Value Theorem (see Theorem, here), for any x, there exists a point c such that x is less than c which is less than a and:

[f(a) - f(x)]g'(c) = [g(a)-g(x)]f'(c)

(3) We can rearrange the equation to give us:

f'(c)/g'(c) = [f(a) - f(x)]/[g(a) - g(x)]

(4) Since f(a)=0 and g(a) = 0, this gives us:

f'(c)/g'(c) = f(x)/g(x)

(5) Let L = lim(x → a) f'(x)/g'(x) [We know that L is a finite real number from the given]

(6) Let ε be any positive real value.

(7) By definition of limits (see Definition 1, here), for ε, there exists a δ such that:

if x - a is between -δ and 0, then f'(x)/g'(x) - L is between -ε and ε

(8) Since c is between x and a, we can see that as x moves toward a, so does c. This gives us:

lim(c → a) f'(c)/g'(c) = L since

if c - a is between -δ and 0, then f'(c)/g'(c) - L is between -ε and ε

(9) But from step #4, since f(x)/g(x) = f'(c)/g'(c), we can see that as x moves toward a, c likewise moves toward a and we have:

lim(x → a) f(x)/g(x) = L since:

for ε, there exists a δ such that:

if x - a is between -δ and 0, then c - a is also between -δ and 0, and since f(x)/g(x) = f'(c)/g'(c), we have that f(x)/g(x) - L is between -ε and ε because f'(c)/g'(c) - L is between -ε and ε [See step #7]

QED
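A numerical illustration of the 0/0 case (my own example; the functions f(x) = 1 - cos x, g(x) = x^2 and the sample points are arbitrary): both f(x)/g(x) and f'(x)/g'(x) = sin x/(2x) approach 1/2 as x approaches 0.

import math

for x in [0.1, 0.01, 0.001]:
    f_over_g = (1 - math.cos(x)) / x**2       # f(x)/g(x)
    fp_over_gp = math.sin(x) / (2 * x)        # f'(x)/g'(x)
    print(x, f_over_g, fp_over_gp)            # both columns approach 0.5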

Lemma 2: L'Hopital's Rule for ∞/∞ limits

Let f(x),g(x) be two functions that are differentiable in a deleted neighborhood (b,a) such that g'(x) is a nonzero, finite real number in that neighborhood, that is, when b is less than x is less than a.

If:

lim(x → a) f(x) = ∞ and lim(x → a) g(x) = ∞ and lim(x → a) f'(x)/g'(x) = a finite, real number

Then:

lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x)

Proof:

(1) Let L = lim (x → a) f'(x)/g'(x) where L is a finite real number.

(2) Let ε be any positive value.

(3) From step #1, there exists δ such that if x - a is between -δ and 0, then f'(x)/g'(x) - L is between -ε and ε [Definition of limit, see here]

(4) Let b = a - δ

(5) Using Cauchy's Mean Value Theorem (see here), we know that there exists a value c between x and a such that:

[f(a) - f(x)]g'(c) = [g(a)-g(x)]f'(c)

(6) Multiplying both sides by 1/([g'(c)][g(a) - g(x)]) gives us:

[f(a) - f(x)]/[g(a) - g(x)] = f'(c)/g'(c)

(7) Likewise, multiplying the numerator and the denominator on the left by -1 gives us:

[f(x) - f(a)]/[g(x) - g(a)] = f'(c)/g'(c)

(8) Since c is between x and a, c moves toward a as x moves toward a, so we have:

lim (c → a) f'(c)/g'(c) = lim (x → a) f'(x)/g'(x) = L

(9) But this means that:

lim (x → a) ([f(x) - f(a)]/[g(x) - g(a)]) = lim (c → a) f'(c)/g'(c) = L

(10) So using the definition of a limit we have:

If x - a is between -δ and 0, then:

[f(x) - f(a)]/[g(x) - g(a)] - L is between -ε and ε

(11) But since x is in (b,a), we know that a - x is less than a - (a - δ) = δ, so x - a is between -δ and 0 and we can conclude that:

[f(x) - f(a)]/[g(x) - g(a)] - L is between -ε and ε

(12) Let h(x) = [1 - f(a)/f(x)]/[1 - g(a)/g(x)]

(13) Now,

lim (x → a) h(x)*f(x)/g(x) = lim (x → a) ([1 - f(a)/f(x)]/[1 - g(a)/g(x)])*f(x)/g(x) =

= lim (x → a) ([f(x) - f(a)]/[g(x) - g(a)]) = L

(14) So, it follows that for x in (b,a), we have:

h(x)*f(x)/g(x) - L is between -ε and ε

(15) Since lim (x → a) f(x) = ∞ and lim (x → a) g(x) = ∞, we have:

lim (x → a) [1 - f(a)/f(x)] = 1 - 0 = 1

lim (x → a) [1 - g(a)/g(x)] = 1 - 0 = 1

(16) Using the Quotient Rule for Limits (see Lemma 7, here), we have:

lim (x → a) h(x) = (lim (x → a) [1 - f(a)/f(x)])/(lim (x → a)[1 - g(a)/g(x)]) = 1/1 = 1

(17) Using the Product Rule for Limits (see Lemma 2, here), we have:

lim (x → a) h(x)*f(x)/g(x) =lim (x → a) h(x) * lim (x → a) f(x)/g(x)

(18) This means that:

lim (x → a) f(x)/g(x) = [lim (x → a) h(x)*f(x)/g(x)]/[lim (x → a) h(x)] =

= L/1 = L

QED

Lemma 3: L'Hopital's Rule for 0/0 limits where f'(x)/g'(x) has an infinite limit

Let f(x),g(x) be two functions that are differentiable in a deleted neighborhood (b,a) such that g'(x) is nonzero in that neighborhood, that is, when b is less than x is less than a.

If:

lim(x → a) f(x) = 0 and lim(x → a) g(x) = 0 and lim(x → a) f'(x)/g'(x) = +∞ or -∞

Then:

lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x)

Proof:

(1) Let f(x),g(x) be continuous functions such that f(a)=0, g(a)=0

(2) Using Cauchy's Mean Value Theorem (see Theorem, here), for any x, there exists a point c such that x is less than c which is less than a and:

[f(a) - f(x)]g'(c) = [g(a)-g(x)]f'(c)

(3) We can rearrange the equation to give us:

f'(c)/g'(c) = [f(a) - f(x)]/[g(a) - g(x)]

(4) Since f(a)=0 and g(a) = 0, this gives us:

f'(c)/g'(c) = f(x)/g(x)

(5) Let L = lim(x → a) f'(x)/g'(x)

We can assume that L is +∞. We could make the same argument with some adjustments if L is -∞

(6) Let ε be any positive real value.

(7) Since lim(x → a) f'(x)/g'(x) = +∞, for ε, there exists a δ such that:

if x - a is between -δ and 0, then f'(x)/g'(x) is greater than 1/ε

(8) Since c is between x and a, we can see that as x moves toward a, so does c. This gives us:

lim(c → a) f'(c)/g'(c) = +∞ since

if c - a is between -δ and 0, then f'(c)/g'(c) is greater than 1/ε

(9) But from step #4, since f(x)/g(x) = f'(c)/g'(c), we can see that as x moves toward a, c likewise moves toward a and we have:

lim(x → a) f(x)/g(x) = +∞ since:

for ε, there exists a δ such that:

if x - a is between -δ and 0, then c - a is also between -δ and 0, and since f(x)/g(x) = f'(c)/g'(c), we have that f(x)/g(x) is greater than 1/ε because f'(c)/g'(c) is greater than 1/ε [See step #7]

QED

Lemma 4: L'Hopital's Rule for ∞/∞ limits where f'(x)/g'(x) has an infinite limit

Let f(x),g(x) be two functions that are differentiable in a deleted neighborhood (b,a) such that g'(x) is nonzero in that neighborhood, that is, when b is less than x is less than a.

If:

lim(x → a) f(x) = ∞ and lim(x → a) g(x) = ∞ and lim(x → a) f'(x)/g'(x) = +∞ or -∞

Then:

lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x)

Proof:

(1) Let L = lim (x → a) f'(x)/g'(x) where L is +∞

NOTE: We can use the same argument with some modifications if L is -∞

(2) Let ε be any positive value.

(3) From step #1, there exists δ such that if x - a is between -δ and 0, then f'(x)/g'(x) is greater than 1/ε

This is true since we are talking about an infinite limit, where 1/ε can be made as large as one wishes.

(4) Let b = a - δ

(5) Using Cauchy's Mean Value Theorem (see here), we know that there exists a value c between x and a such that:

[f(a) - f(x)]g'(c) = [g(a)-g(x)]f'(c)

(6) Multiplying both sides by 1/([g'(c)][g(a) - g(x)]) gives us:

[f(a) - f(x)]/[g(a) - g(x)] = f'(c)/g'(c)

(7) Likewise, multiplying the numerator and the denominator on the left by -1 gives us:

[f(x) - f(a)]/[g(x) - g(a)] = f'(c)/g'(c)

(8) Since c is between x and a, c moves toward a as x moves toward a, so we have:

lim (c → a) f'(c)/g'(c) = lim (x → a) f'(x)/g'(x) = L

(9) But this means that:

lim (x → a) ([f(x) - f(a)]/[g(x) - g(a)]) = lim (c → a) f'(c)/g'(c) = L

(10) So using the definition of an infinite limit we have:

If x - a is between -δ and 0, then:

[f(x) - f(a)]/[g(x) - g(a)] is greater than 1/ε

(11) But since x is in (b,a), we know that a - x is less than a - (a - δ) = δ, so x - a is between -δ and 0 and we can conclude that:

[f(x) - f(a)]/[g(x) - g(a)] is greater than 1/ε

(12) Let h(x) = [1 - f(a)/f(x)]/[1 - g(a)/g(x)]

(13) Now,

lim (x → a) h(x)*f(x)/g(x) = lim (x → a) ([1 - f(a)/f(x)]/[1 - g(a)/g(x)])*f(x)/g(x) =

= lim (x → a) ([f(x) - f(a)]/[g(x) - g(a)]) = L

(14) So, it follows that for x in (b,a), we have:

h(x)*f(x)/g(x) is greater than 1/ε

(15) Since lim (x → a) f(x) = ∞ and lim (x → a) g(x) = ∞, we have:

lim (x → a) [1 - f(a)/f(x)] = 1 - 0 = 1

lim (x → a) [1 - g(a)/g(x)] = 1 - 0 = 1

(16) Using the Quotient Rule for Limits (see Lemma 7, here), we have:

lim (x → a) h(x) = (lim (x → a) [1 - f(a)/f(x)])/(lim (x → a)[1 - g(a)/g(x)]) = 1/1 = 1

(17) Using the Product Rule for Limits (see Lemma 2, here), we have:

lim (x → a) h(x)*f(x)/g(x) =lim (x → a) h(x) * lim (x → a) f(x)/g(x)

(18) This means that:

lim (x → a) f(x)/g(x) = [lim (x → a) h(x)*f(x)/g(x)]/[lim (x → a) h(x)] =

= L/1 = L

QED

Theorem: L'Hopital's Rule

Let f(x),g(x) be two functions that are differentiable in a deleted neighborhood of a such that g'(x) is nonzero in that neighborhood.

If one of the following conditions is true:

(a) lim(x → a) f(x) = 0 and lim(x → a) g(x) = 0

(b) lim(x → a) f(x) = ∞ and lim(x → a) g(x) = ∞

Then:

lim (x → a) f(x)/g(x) = lim (x → a) f'(x)/g'(x)

Proof:

(1) Let L = lim (x → a) f'(x)/g'(x). We need to be able to handle the following four cases:

Case I: a is a real number / L is a real number
Case II: a is a real number / L is infinite
Case III: a is infinite / L is a real number
Case IV: a is infinite / L is infinite

(2) Case I is handled through Lemma 1 and Lemma 2.

(3) Case II is handled through Lemma 3 and Lemma 4.

(4) Assume that a is +∞

We can make the same arguments if a is -∞ with some modifications.

(5) Let y = 1/x, so that x = 1/y.

(6) As x goes toward +∞, y goes toward 0 from the right (that is, y → 0+).

(7) dx/dy = -1/y^2 (See Lemma 2, here)

(8) Using the Chain Rule (see Lemma 2, here):

d/dy[f(1/y)] = f'(1/y)*d/dy[1/y] = f'(1/y)*(-1/y^2)

d/dy[g(1/y)] = g'(1/y)*d/dy[1/y] = g'(1/y)*(-1/y^2)

(9) Thus, we have:

lim (x → +∞) [f'(x)/g'(x)] = lim (y → 0+) [f'(1/y)*(-1/y^2)]/[g'(1/y)*(-1/y^2)] =

= lim (y → 0+) [f'(1/y)/g'(1/y)]

(10) Now, depending on the behavior of f(x), g(x), and f'(x)/g'(x), we can apply Lemma 1, 2, 3, or 4 to the functions F(y) = f(1/y) and G(y) = g(1/y) to establish:

lim (y → 0+) [f'(1/y)/g'(1/y)] = lim (y → 0+) f(1/y)/g(1/y)

(11) Since x=1/y, this gives us:

lim (x → +∞) [f'(x)/g'(x)] = lim(y → 0+) [f'(1/y)/g'(1/y)] = lim (y → 0+) f(1/y)/g(1/y) = lim (x → +∞) f(x)/g(x)

QED
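A small Python illustration of the x = 1/y substitution for an infinite endpoint (an example only; the functions f(x) = ln x, g(x) = x are an arbitrary choice): f'(x)/g'(x) = 1/x goes to 0, and f(x)/g(x) computed via y = 1/x gives the same values.

import math

for x in [1e2, 1e4, 1e6]:
    y = 1 / x
    print(x,
          math.log(x) / x,          # f(x)/g(x)
          math.log(1 / y) * y,      # the same quantity rewritten in terms of y = 1/x
          1 / x)                    # f'(x)/g'(x), which also goes to 0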


Sunday, August 27, 2006

Products of linear factors

Using the Fundamental Theorem of Algebra, we know that it is possible to express any polynomial of degree n in one variable as a product of linear factors.

In other words:

x^n + a_1*x^(n-1) + ... + a_n = (x - r_1)(x - r_2)*...*(x - r_n) where r_1, ..., r_n are the n roots of the equation.

In today's blog, I show that we are not limited to this form.

Lemma 1:

if r_i ≠ 0, then:

(x - r_i) = 0 if and only if (1 - x/r_i) = 0

Proof

(1) Assume that:

(x - r_i) = 0

(2) Then dividing both sides by (-r_i) gives us:

(1 - x/r_i) = 0

(3) Assume that:

(1 - x/r_i) = 0

(4) Multiply both sides by (-r_i) so that:

(x - r_i) = 0

QED

Corollary 1.1:

if r_i^2 ≠ 0, then:

(x^2 - r_i^2) = 0 if and only if (1 - x^2/r_i^2) = 0

Proof:

This follows directly from Lemma 1 above: if we set x' = x^2 and r' = r_i^2, then Lemma 1 gives (x' - r') = 0 if and only if (1 - x'/r') = 0, which is exactly the statement that (x^2 - r_i^2) = 0 if and only if (1 - x^2/r_i^2) = 0.

QED
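Here is a short Python illustration of the two factorizations (my own example; the cubic and the evaluation point are arbitrary): the product of the (x - r_i) reproduces the polynomial, and the product of the (1 - x/r_i) vanishes at exactly the same roots, differing only by the constant factor (-1)^n * r_1*...*r_n.

roots = [1.0, 2.0, 3.0]                      # roots of x^3 - 6x^2 + 11x - 6

def p(x):
    return x**3 - 6*x**2 + 11*x - 6

def product_form(x):
    result = 1.0
    for r in roots:
        result *= (x - r)
    return result

def rescaled_form(x):
    result = 1.0
    for r in roots:
        result *= (1 - x / r)
    return result

x = 2.5
print(p(x), product_form(x))                              # identical values
print(rescaled_form(x), product_form(x) / ((-1)**3 * 6))  # identical values as well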

Monday, August 14, 2006

Euclid's Proof of the Infinitude of Primes

In today's blog, I will present a very well known proof. What makes this proof especially appealing is that it is not too complex. Even so, it is very powerful. This theorem was first presented in Euclid's Elements (Book IX, Proposition 20).

Theorem: There are an infinite number of primes.

Proof:

(1) Assume that there is only a finite number of primes.

(2) Then, there exists a prime p_n that is the largest prime.

(3) Let p_1, p_2, ..., p_n be the list of all primes that exist.

(4) Let x = p_1*p_2*...*p_n + 1.

(5) By the fundamental theorem of arithmetic (see Theorem 3, here), we know there is at least one prime that divides x. Let us call this prime p*.

(6) But none of the primes p_1, ..., p_n divide x, since x ≡ 1 (mod p_i) for each prime p_i in the list.

(7) Therefore, we have a contradiction. We have a prime p* that is not in the complete list of primes.

(8) So, we reject our assumption in step#1 and conclude that there are an infinite number of primes that exist.

QED
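The construction in the proof can be watched in action with a few lines of Python (an illustration only; the starting list of primes is an arbitrary finite choice, standing in for the assumed complete list):

# Form the product of the "known" primes plus 1, then factor it to exhibit a prime
# that is not in the list.
def smallest_prime_factor(m):
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

primes = [2, 3, 5, 7, 11, 13]
x = 1
for p in primes:
    x *= p
x += 1                                   # x = 2*3*5*7*11*13 + 1 = 30031
p_star = smallest_prime_factor(x)
print(x, p_star, p_star in primes)       # 30031 59 False: 59 is a prime outside the list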

Sunday, July 16, 2006

Modular Arithmetic: Additional Lemma

Lemma 1: if a ≡ b (mod p), then a^(p^(n-1)) ≡ b^(p^(n-1)) (mod p^n)

Proof:

(1) a ≡ b (mod p)

(2) So, there exists c such that pc = a - b

(3) So that a = pc + b

(4) So, by the binomial theorem, a^(p^(n-1)) = (pc + b)^(p^(n-1)) = b^(p^(n-1)) + p^(n-1)*(pc)*b^(p^(n-1) - 1) + (terms containing (pc)^k with k ≥ 2). The second term equals p^n*c*b^(p^(n-1) - 1), which is divisible by p^n, and each later term is also divisible by p^n (the binomial coefficient C(p^(n-1), k) contributes at least a factor p^(n-1-v), where p^v is the highest power of p dividing k, and (pc)^k contributes p^k, with (n - 1 - v) + k ≥ n for k ≥ 1). Hence a^(p^(n-1)) ≡ b^(p^(n-1)) (mod p^n).

QED
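A quick Python spot-check of the lemma (the values of p, n, a, b below are arbitrary examples):

p, n = 3, 4
a = 5
b = a + 2 * p                  # so a ≡ b (mod p)
e = p ** (n - 1)               # the exponent p^(n-1)
print(pow(a, e, p ** n), pow(b, e, p ** n))   # both residues mod p^n agree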