Google Answers: Chicken and Egg: Calculus and the Binomial Theorem


Q: Chicken and Egg: Calculus and the Binomial Theorem (Answered, 5 out of 5 stars)

Question

Subject: Chicken and Egg: Calculus and the Binomial Theorem
Category: Science > Math
Asked by: upstartaudio-ga
List Price: $25.00

Posted: 16 Dec 2004 18:07 PST
Expires: 15 Jan 2005 18:07 PST
Question ID: 443702

My question regards the general form of the binomial theorem where the exponent n is not an integer, sometimes called the Binomial Function. I've seen numerous proofs of properties of infinite series and equations for derivatives which require that the Binomial Function is already proved. But when I look into it, the proof of the general binomial theorem requires calculus. So the logic seems circular. Does anyone know of a proof of the binomial theorem that does not require calculus or infinite series that in turn depend on the binomial theorem? Alternatively, is there a development of calculus which begins with the antiderivative, so that the Gamma function can be used to define the intermediate (non-integer exponent) cases of the binomial coefficient expansion? Obviously, estimating the binomial function with a Taylor series immediately and directly depends on the polynomial rule, which in every proof I've seen relies on the validity of the general binomial function. Any ideas?
Request for Question Clarification by mathtalk-ga on 17 Dec 2004 06:39 PST Hi, upstartaudio-ga: To expand (x + h)^n when n is not a positive integer will require something more than a finite series, right? It can be expressed as an infinite series, which I take is what you mean by the Binomial Function or Binomial Theorem where "n is not an integer". However Taylor's Theorem with Remainder does not explicitly involve an infinite series, just a finite series (of indefinite length) plus a remainder term. It is basically a generalization of the Mean Value Theorem to higher order derivatives. So I think that a "proof" of Taylor series expansions (ie. their convergence) based on Taylor's Theorem with Remainder is not a circular argument in the sense you seem to suspect. It is true that "binomial" coefficients appear in Taylor's Theorem with Remainder, as one might expect with a "finite" expansion that establishes the convergence of an infinite series. Does this help? regards, mathtalk-ga
Clarification of Question by upstartaudio-ga on 17 Dec 2004 09:50 PST Quite right, I should have been more specific. I supposed that the infinite series was developed by extending the finite Taylor series until the final term was smaller than some fixed error. But so much for my own lack of rigor, I am not a Mathematician so you can safely ignore everything I say... I don't think it matters whether you prove the chicken (general binomial function without using calculus) or the egg (derivative rule for non-integer exponents without using the binomial function). I read somewhere that Euler had this difficulty and came up with his own inventive infinite series-based proof, but I'll be darned if I can find it. Because I am self-taught, I suppose this is just a question of my having studied things in the wrong order; I imagine that Mathematicians are taught the secret to resolving this (apparently) circular dependence between the derivative rule for non-integer exponents and the general binomial function (very many infinite series proofs use the binomial function as well). So while it would be nice to understand the proof itself, I suppose my question is more of the nature: does such an independent proof exist? I'm currently reading a very thin (and to me very difficult) book on infinite sequences and series by Konrad Knopp. Many examples are given without proof which can be demonstrated using a binomial expansion and vanishing remainder term. After the third such example, I remembered being flustered when I studied calculus many years ago, because at that time when I looked up a proof for the general binomial theorem, it depended on the derivative rule for non-integer exponents. I thought I'd ask the question here to see if anyone knows how Mathematicians get around this, or whether it remains unsolved. There is a beautiful inductive proof for the binomial expansion, but it begins with the case where n=1, so it is only valid for integer exponents.
If there isn't actually a problem with using the two to prove each other, a clarification of my misapprehension would be both sufficient and appreciated. A proof of the binomial expansion for non-integer exponents which does not depend on the polynomial rule of calculus (particularly one I can understand) would be the best possible answer to this question. A proof of the polynomial rule for non-integer exponents that doesn't make use of the binomial expansion would also suffice. Finally, if I'm asking for the proverbial horse of a different color, then an explanation of why mathematicians don't worry about this would suffice.
Request for Question Clarification by mathtalk-ga on 17 Dec 2004 11:10 PST Thanks for the clarification. It seems that you are interested in a rigorous development of both the "binomial theorem" for non-integer exponents and a rule for differentiation:

d(x^r)/dx = r * x^(r-1)   when x > 0   [Power Rule]

also for non-integer exponents, with special attention to showing that the reasoning is not circular. It is quite correct that many infinite series expansions, and in particular Taylor series expansions, depend in full generality upon a prior development of calculus topics like derivatives. Thus the Power Rule is the more elementary of these topics, but even so one needs to define what (if anything) is meant by a function f(x) = x^r when r is not an integer. If I've understood the Question correctly, I believe it can be answered by first providing a definition of the real-valued power function x^r and then proving the Power Rule outlined above without invoking any infinite binomial expansions. regards, mathtalk-ga
Clarification of Question by upstartaudio-ga on 18 Dec 2004 01:59 PST Yes, you've understood my question exactly.
Request for Question Clarification by mathtalk-ga on 19 Dec 2004 19:24 PST Hi, upstartaudio-ga: Please see the extended Comment I've posted below, which outlines the steps needed to develop the Power Rule from first principles and lays the foundation for what will most interest you, the noninteger exponent cases, by proving the integer exponent cases. Please review it and let me know if the level of exposition needs to be adjusted. We must next tackle the case of rational exponents, x^(p/q) for suitable integers p and q. This we will define as (x^p)^(1/q), ie. as: (f^-1)(x^p) where f(x) = x^q. Already in the Comment below we have established the existence, monotonicity, and continuity of this (f^-1) function, at least for q > 0. It remains only to show differentiability, and then we shall have the Power Rule for rational exponents. Last we will tackle the case of real exponents. To do so we invoke the density theorem, that every real number is the limit of a sequence of rational numbers. For example, the decimal expansion of a real number gives one such sequence of rational numbers that converges to that real number in the limit. A further extension is possible, to complex exponents. However the work needed to make that extension is slight in comparison to the details of the rational and real extensions. regards, mathtalk-ga
Clarification of Question by upstartaudio-ga on 19 Dec 2004 23:31 PST The comment below is quite clear. I think I see where this is leading now. From the integer case, I know how to (non-rigorously) find the derivative of the inverse: let y = x^(1/n) so that x = y^n. Then take the derivative with respect to y:

d/dy y^n = d/dy x
n * y^(n-1) = dx/dy

Find the derivative of the inverse:

1/(n * y^(n-1)) = dy/dx
(1/n) * y^(1-n) = d/dx y

then by substituting y = x^(1/n), I get d/dx x^(1/n) = (1/n) * x^(1/n - 1). I had a feeling like my old calculus teacher was going to bean me with the chalk as I took the reciprocal of dx/dy... Are there some things I needed to establish first beyond requiring n <> 0? I think you've already established that the inverse function exists, and that it is continuous except at n=0. Do we need to do something else to know that it is differentiable?
Request for Question Clarification by mathtalk-ga on 20 Dec 2004 09:13 PST Hi, upstartaudio-ga: Your calculation is spot on. The "intuitive" approach: dx/dy = 1 / (dy/dx) suggested by Leibniz's notation for derivatives is valid, assuming that y(x) has an inverse function and that the (pointwise) derivative dy/dx has a reciprocal. For our purposes we will be fine with this, because we exclude x = 0. But the usual example of what can "go wrong" is: y = x^3 This function is continuous, strictly monotone increasing, and differentiable on the whole real line. The inverse function is also continuous and strictly monotone increasing on the whole real line, but fails to have a derivative at x = 0. Note that 1/(dy/dx) cannot be defined there. I'll describe the "rigorous" basis for this in my Answer, but if you want to read ahead, Google around for Inverse Function Theorem. regards, mathtalk-ga
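The y = x^3 example can be watched numerically. The following is only an illustrative sketch, not part of the original exchange: the helper names cube_root and difference_quotient are made up for this note, and floating-point arithmetic stands in for exact real numbers. At 0 the difference quotient of the inverse keeps growing as h shrinks, while at y = 8 it settles near 1/(3 * 2^2) = 1/12.

```python
import math

def cube_root(y):
    """Real cube root, the inverse of f(x) = x**3 on the whole real line."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def difference_quotient(g, y, h):
    """Forward difference quotient (g(y + h) - g(y)) / h."""
    return (g(y + h) - g(y)) / h

for h in (1e-2, 1e-4, 1e-6):
    at_zero = difference_quotient(cube_root, 0.0, h)   # grows like h**(-2/3)
    at_eight = difference_quotient(cube_root, 8.0, h)  # approaches 1/12
    print(h, at_zero, at_eight)
```

The growth at 0 matches the remark that 1/(dy/dx) cannot be defined where dy/dx = 0.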
Clarification of Question by upstartaudio-ga on 20 Dec 2004 11:42 PST This looks like the answer then. When I learned the power rule, the "proof" given used the binomial expansion to prove cases where n >= 2. While I feel that has instructional value owing to its simplicity, it doesn't provide much of a foundation for future work. In the ongoing debate of whether to teach calculus by infinitesimals (the 'old' way) or by limits (the 'new' way), I suppose that both have advantages. The former because it is very easy to explain the basic concepts. The latter because it provides a better foundation for further study, and because 99% of calculus courses are taught that way. I was once invited to give a semester lecture series at my old high school (which doesn't have an AP calculus program) to a handful of bright students who wanted some preparation before entering college. Since I use basic differential calculus in my work in electronics, I accepted the challenge and used Thompson & Gardner's "Calculus made easy" as the text. We met only once a week, but by the end of the semester we all knew how to find derivatives of most equations we were likely to run into, so as a practical course I suppose it was a success. But as a math course, I feel like I failed the students because they might find later work confusing; we spent almost no time at all on limits and differentiability (due to my own weakness in these areas). Perhaps they are better off having been exposed to the concepts, but perhaps they are worse off for having been deprived of a solid foundation for future work. The debate goes on. But for any of them that one day questions whether what they learned is true, it is comforting to know that they will likely find the answers they seek. Should the opportunity arise in the future, I think it would be better, as well as more intuitive, to develop the power rule from the product, quotient, and inverse function rules. 
Or possibly, prove the cases x^0, x^1, and x^2, introduce the power rule but delay the proof until we've covered the inverse function rule, then return and retrace the steps of your proof given below. Your final answer will become part of my instruction file, and I owe you a debt of gratitude for your precise effort in answering my question.

Answer

Subject: Re: Chicken and Egg: Calculus and the Binomial Theorem
Answered By: mathtalk-ga on 15 Jan 2005 18:05 PST
Rated: 5 out of 5 stars

Hi, upstartaudio-ga:

Calculus is a difficult class to master, from either the teacher's or the student's perspective. Its utility pushes it of necessity to the beginning of an undergraduate curriculum, but a thorough understanding of its rigorous foundations is ordinarily deferred to later.

The generalized binomial theorem may be deduced as a special case of Taylor series, for powers x^r where r is other than a nonnegative integer. So a key development is the definition of such a function x^r for general exponents, satisfactory enough to use to deduce its derivatives.

Because the power rule:

d(x^r)/dx = r * x^(r-1)

states the first derivative of x^r in simple terms of a constant times a similar function, establishing the form of the first derivative yields by induction the forms of all higher derivatives. The big distinction between nonnegative integer r and other exponents is that in the former case the higher derivatives must eventually descend to zero, so that a Taylor series expansion (regardless of the choice of "center") will have finitely many terms if and only if the exponent r is a nonnegative integer.

Here's an outline of the steps we will follow:

1. Recap of properties of function x^r and its inverse for integer r
2. Differentiability of the r'th root function (inverse of x^r)
3. Well-definedness of x^r when r = p/q is a ratio of nonzero integers
4. Limits of real exponents as Cauchy sequences of rational exponents

regards, mathtalk-ga
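To make the "finitely many terms" remark concrete, here is a small sketch (an illustration added here, with a made-up function name) that builds the Taylor coefficients of (1 + t)^r at t = 0 purely from repeated use of the power rule: the k-th coefficient is r(r-1)...(r-k+1)/k!. Exact rational arithmetic is used so that the zeros for integer r are exact.

```python
from fractions import Fraction

def generalized_binomial_coefficients(r, count):
    """Return the first `count` Taylor coefficients of (1 + t)**r at t = 0.

    The k-th coefficient is r(r-1)...(r-k+1) / k!, which is what repeated
    use of the power rule gives for the k-th derivative at 1, divided by k!.
    """
    coefficients = []
    current = Fraction(1)
    for k in range(count):
        coefficients.append(current)
        # Next coefficient: multiply by (r - k) and divide by (k + 1).
        current = current * (Fraction(r) - k) / (k + 1)
    return coefficients

print(generalized_binomial_coefficients(3, 6))               # integer exponent
print(generalized_binomial_coefficients(Fraction(1, 2), 5))  # non-integer exponent
```

For r = 3 the factor (r - k) hits zero at k = 3, so every later coefficient vanishes; for r = 1/2 no factor is ever zero and the list never terminates.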
Clarification of Answer by mathtalk-ga on 15 Jan 2005 22:37 PST

1. Previously the function f(x) = x^r and its inverse g(x) were shown, for integer exponents r, to have the following properties on the domain of positive real x:

(i) The function f(x) = x^r is positive, and: strictly monotone increasing if r > 0, constantly 1 if r = 0, and strictly monotone decreasing if r < 0.

(ii) The function f(x) = x^r is continuous, and provided r is nonzero, g(x) is continuous (on the same domain).

(iii) The function f(x) = x^r is differentiable, and: f'(x) = r * x^(r-1).

This third property is precisely the power rule we seek to establish for all r, but so far we've only proved it for integers. Recall that we took an approach of proving the rule for positive r, then deducing the rule for negative r. The same technique can be applied when r is not an integer, leaving us free to concentrate on results for positive r (where the functions x^r will again be monotone increasing). The immediate open question is the differentiability of g(x), to which we next turn.

2. A simple version of the Inverse Function Theorem suffices for our needs.

* * * * * * * * * * * * * * * * * * * * * *

Thm. (Inverse Function Theorem) Suppose f:[a,b]->[c,d] is a real function with a positive derivative f' at each point of [a,b], where c = f(a) and d = f(b). Then its inverse g:[c,d]->[a,b] exists and has a positive derivative g' as well (at each point of [c,d]), and the following is true:

g'(f(x)) = 1/f'(x) for all x in [a,b].

Correspondingly:

g'(y) = 1/f'(g(y)) for all y in [c,d].

* * * * * * * * * * * * * * * * * * * * * *

Before we launch into the proof, let's see the implication of this for the function f(x) = x^r. For integer r > 0, f:[a,b]->[a^r,b^r] has a positive derivative at any point.
Therefore its inverse does, and:

g'(f(x)) = 1/f'(x) = 1/(r * x^(r-1))

and if we take x to be g(y), the r'th root of y, then:

g'(y) = 1/(r * (g(y))^(r-1))

If we were to allow ourselves to write g(y) = y^(1/r), that notation plus the laws of exponents would lead to (1/r)( y^((1/r)-1) ). However this formal acknowledgement of the power rule for y^(1/r) is a bit premature. For the moment we are content to claim only that our g is differentiable and has a positive derivative.

* * * * * * * * * * * * * * * * * * * * * *

Proof of Theorem: Since g(f(x)) = x, if we knew that g was differentiable, the desired consequence could be obtained by applying the Chain Rule. But since we want to prove g is differentiable, we should drill down to the definition of g' as a limit:

g'(y) = limit as h -> 0 of [g(y+h) - g(y)] / h

We can analyze this limit by introducing a sequence {h_i} of nonzero real numbers that tend to 0, and produce a corresponding sequence {x_i} by defining:

x_i = g(y + h_i)

for any fixed y in the domain. The endpoints y = c,d require slightly special handling, as the derivatives, etc. at these points involve one-sided limits. But these are adequately handled by taking the sequence {h_i} to approach zero from above for y = c, and from below for y = d, so that {x_i} approaches a from above (resp. b from below). Apart from this detail the arguments for the one-sided derivatives at the endpoints are the same as the two-sided limit argument we are about to detail for the interior point cases.
Since y + h_i = f(x_i) by definition of x_i and f(g(y)) = y, writing x = g(y) we can conclude that the limit can be rewritten:

limit as h_i -> 0 of [g(y+h_i) - g(y)] / h_i  =  limit as x_i -> x of [x_i - x] / [f(x_i) - f(x)]

(here x_i -> x because g is continuous), which we recognize to be the reciprocal of the limit defining f'(x):

f'(x) = limit as x_i -> x of [f(x_i) - f(x)] / [x_i - x]

Thus, provided y = f(x), we know the latter positive limit guarantees the limit exists for its positive reciprocal, and the two limits are reciprocals:

g'(y) = 1 / f'(x) = 1/f'(g(y)).

QED

* * * * * * * * * * * * * * * * * * * * * *

This proves the Theorem and in turn suffices to establish the differentiability of g, the inverse of f(x) = x^r, at least for positive integers r (so that both f and g are increasing functions). It is therefore certain, by the Chain Rule for instance, that if we were to define x^r for some ratio r = p/q of two positive integers as the composition:

f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of f_q(x) = x^q, the composition would be differentiable. In fact, bearing in mind that we assume p,q > 0, the composition would be a differentiable, monotone-increasing continuous function that is continuously (even differentiably) invertible by the monotone increasing:

f_q( g_p(x) )

Before we allow ourselves the luxury of restating these as a form of the Power Rule, it is important to justify the notations:

f_p( g_q(x) ) = x^(p/q)
f_q( g_p(x) ) = x^(q/p)

and the "rational exponent" extensions of the laws of exponents in particular. This then is the purpose of our next discussion.

(to be continued -- mathtalk-ga)
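As a sanity check on the Inverse Function Theorem for f(x) = x^q, the sketch below (illustrative only; the function names are invented for this note) computes the q'th root by bisection, using nothing but integer powers, and compares a difference quotient of g with the predicted value 1/f'(g(y)). It assumes y > 0 and relies on floating-point arithmetic, so agreement is only approximate.

```python
def integer_power(x, n):
    """x**n for a positive integer n, using only repeated multiplication."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result

def root_by_bisection(y, q, steps=200):
    """Approximate g(y), the inverse of f(x) = x**q, for y > 0.

    Uses only the monotonicity of f, so no fractional powers are assumed.
    """
    low, high = 0.0, max(1.0, y)
    for _ in range(steps):
        middle = 0.5 * (low + high)
        if integer_power(middle, q) < y:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)

def inverse_derivative_check(y, q, h=1e-5):
    """Return (difference quotient of g at y, predicted 1 / f'(g(y)))."""
    quotient = (root_by_bisection(y + h, q) - root_by_bisection(y - h, q)) / (2 * h)
    x = root_by_bisection(y, q)
    predicted = 1.0 / (q * integer_power(x, q - 1))
    return quotient, predicted

print(inverse_derivative_check(5.0, 3))
```

The two printed numbers should agree to several decimal places, which is all a numerical experiment can show; the proof above is what establishes the equality.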
Request for Answer Clarification by upstartaudio-ga on 16 Jan 2005 16:08 PST A final comment might be in order. We've shown the validity of the power rule without invoking the binomial theorem, and demonstrated that it holds for the rational and real cases as long as x >= 0. One of my textbooks ignores, while the other covers, the fact that x^r is not defined when x < 0, therefore the function isn't differentiable in that case. To prove it is undefined, we need only notice that if r = a/b, then it is also equal to (2a) / (2b) or (3a) / 3b). Now if b is odd, then x^r is an even root raised to a power, and if b is even, then x^r is an odd root raised to a power. So if x is negative, one of these cases isn't defined in the set of real numbers. The problem doesn't seem to go away with x complex, either, since there are infinitely many solutions if r is the limit of an irrational number. Even if we choose one of these roots by definition, we still have the ugly situation that the (qth root of x) quantity raised to the power p is not the same number as the qth root of x^p. So I don't see that we can get around restricting the domain of x to positive numbers. But that is the subject of another question...
Clarification of Answer by mathtalk-ga on 17 Jan 2005 21:27 PST

3. We have shown that given rational r = p/q where p,q are positive integers (so far), the function:

f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of x^q, has many properties (monotonicity, continuity, differentiability) in common with the powers x^r for integer r. But because there are multiple ways to express any rational r as a ratio of two integers, we need to show that all possible choices lead to the same function x^r. The mathematical shorthand says we need to show that:

x^r = f_p( g_q(x) )

is well-defined, ie. that the formula's apparent dependence on the choice of p,q is only superficial.

The key to this is purely algebraic, namely showing that the various functions f_p and g_q all commute, so that the order in which they are composed is immaterial to the final result. To begin with, the power functions f_p and f_q commute for any two positive integer powers by the associative law of multiplication and mathematical induction:

f_p( f_q(x) ) = (x^q)^p = x^(pq) = (x^p)^q = f_q( f_p(x) )

Stated another way, this gives an unfamiliar cast to a law of exponents:

f_p o f_q = f_pq = f_qp = f_q o f_p

From this it can be deduced that the corresponding function inverses also commute, because where inverses exist for two functions, the inverse of their composition is the result of composing their inverses in the opposite order. For example:

g_q( g_p( f_p( f_q(x) ) ) ) = x

when the adjacent inverses "cancel" one another, which says that g_q( g_p(x) ) is the inverse of f_p( f_q(x) ). It follows then that g_q and g_p must commute, because f_p and f_q commute:

g_q o g_p = (f_p o f_q)^-1 = (f_q o f_p)^-1 = g_p o g_q

We further verify that f_p and g_q commute as well, so that it really doesn't matter whether we define x^r by composing them in one order or the other.
Again we use the fact that f_p and f_q commute (with respect to function composition):

f_q( f_p( g_q(x) ) ) = f_p( f_q( g_q(x) ) ) = f_p(x)

Therefore after applying g_q to both sides:

f_p( g_q(x) ) = g_q( f_p(x) )

which demonstrates f_p commutes with g_q.

One point that we've been a bit cavalier about is that g_q is both a left and a right inverse for f_q. That is:

f_q( g_q(x) ) = x = g_q( f_q(x) )

Whenever a function maps its domain 1-1 and onto itself, an inverse is two-sided. This symmetric outcome may easily be deduced from the symmetry of the relations:

y = f_q(x) <==> x = g_q(y)

where for any x there exists y to satisfy the condition, and conversely for any y there exists such an x.

In any case these commutativity properties establish that x^r is well-defined, because if we take r = (cp)/(cq), for c any positive integer:

f_cp( g_cq(x) ) = f_p( f_c( g_c( g_q(x) ) ) ) = f_p( g_q(x) )

This assures us that in the particular case p is divisible by q, so that r is an integer, our "new" definition of x^r fully agrees with the old definition f_r based solely on arithmetic.

The more general point of these observations is that we aren't misled by using the exponential notation x^r with rational r and rational s, because the familiar laws of exponents hold:

(x^r)^s = x^(rs)
(x^r)(x^s) = x^(r+s)
(x^r)(y^r) = (xy)^r

for any x,y > 0 and rational r,s. The proofs of all these are purely algebraic, and for that reason I will not go into more details.

However we will finish this section with a bit of calculus, a derivation of the power rule for positive rational exponents, then using the quotient rule to extend it to negative rational exponents. Recall we have really only dealt with r > 0 in defining:

x^r = f_p( g_q(x) )

when r = p/q and p,q > 0 are integers.
The Chain Rule and the Inverse Function Theorem then give us:

d(x^r)/dx = f'_p( g_q(x) ) * g'_q(x)
          = p * (g_q(x))^(p-1) * [1/f'_q( g_q(x) )]
          = p * x^((p-1)/q) * (1/q) * (1/x^((q-1)/q))
          = (p/q) * x^((p-q)/q)
          = (p/q) * x^((p/q)-1)
          = r * x^(r-1)

That is, we've shown the Power Rule for rational r > 0.

Now we've already used in the computation above that dividing by x^r is equivalent to multiplying by x^-r, so it may be worth pointing out that the commitment to treat negative exponents as reciprocals is implied by the second law of exponents cited above, with s = -r:

(x^r)(x^-r) = x^0 = 1

Therefore on the calculus side of things we need only apply the quotient rule to determine the derivative of x^-r:

d(x^-r)/dx = d(1/(x^r))/dx
           = - [d(x^r)/dx] / (x^r)^2
           = -r * x^(r-1) * x^(-2r)
           = -r * x^(-r-1)

With this we've also shown the Power Rule for rational r < 0. This is exactly the same calculation as we gave before for the integer exponents, but as acknowledged above, some algebraic preliminaries were necessary to assure they are sensible for the rational exponents.

At last we come to our final step, extending our exponents to the general real case by taking limits of Cauchy sequences of rational exponents. It should not be surprising that we can show, once power functions x^a are "pinched" between monotone power functions x^r both above and below, that x^a must also be monotone, etc.

(to be continued)
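The two claims of this section, that f_p( g_q(x) ) does not depend on how r = p/q is written and that its derivative is r * x^(r-1), can be spot-checked numerically. This is a rough sketch only: rational_power and central_difference are names invented here, and Python's floating-point ** operator is used as a stand-in for the exact q'th root g_q, which is an assumption of the illustration rather than part of the argument.

```python
import math

def rational_power(x, p, q):
    """f_p(g_q(x)): take the q-th root first, then the p-th power (x > 0)."""
    return (x ** (1.0 / q)) ** p

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient approximating func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)

x, p, q = 2.5, 2, 3

# Well-definedness: p/q and (4p)/(4q) should give the same value.
same_function = math.isclose(
    rational_power(x, p, q), rational_power(x, 4 * p, 4 * q), rel_tol=1e-12
)

# Commutativity: f_p(g_q(x)) versus g_q(f_p(x)).
commutes = math.isclose(
    rational_power(x, p, q), (x ** p) ** (1.0 / q), rel_tol=1e-12
)

# Power Rule: slope of x^(p/q) versus (p/q) * x^((p-q)/q).
slope = central_difference(lambda t: rational_power(t, p, q), x)
predicted = (p / q) * rational_power(x, p - q, q)

print(same_function, commutes, slope, predicted)
```

With x = 2.5 and r = 2/3 both booleans should come out True and the two slopes should agree to many digits, up to rounding.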
Clarification of Answer by mathtalk-ga on 25 Jan 2005 20:21 PST

4. A "construction" of the real numbers in mathematics is often based on Cauchy sequences of rational numbers. For every real number there is a sequence of rational numbers converging to it. For example, even though SQRT(2) or pi is irrational, their decimal expansions give us (by way of truncation) convergent sequences of rational numbers.

If the last section was top-heavy with algebra, this one is top-heavy with analysis, ie. with fussing over limits and how to estimate sizes of things. So far we've defined power functions x^r for all rational exponents r and determined that their derivatives obey the Power Rule:

d(x^r)/dx = r * x^(r-1)

To extend our definition to real, irrational exponents a, we need to take the limit of x^r as r approaches a. In doing so we will make free use of the exponent notation and the usual "laws of exponents" for rational r, whose justification was sketched in the previous section.

We state without proof the following:

* * * * * * * * * * * * * * * * * * *

Thm. (Completeness Property of the Real Numbers) Let {r_i} be a Cauchy sequence of real numbers. That is, for every epsilon > 0, there exists integer M > 0 such that for all i,j > M, |r_i - r_j| < epsilon. Then the sequence {r_i} converges to a real number.

* * * * * * * * * * * * * * * * * * *

The essential reason the real numbers have this property is that we "bake in" that property with their construction. The real numbers are the "completion" of the rationals with respect to the usual notion of distance between two numbers, the absolute value of the difference. So at any rate every Cauchy sequence of rationals has a unique limit in the real numbers, and the extension of this fact to Cauchy sequences of real numbers is the "completeness property".

For our purposes we need to show that if {r_i} is a Cauchy sequence of rational numbers, then for any fixed real x > 0, {x^r_i} is a Cauchy sequence of real numbers.
It suffices to have an estimate of |x^r_i - x^r_j| in terms of |r_i - r_j|. Intuitively, making the exponents close to one another puts the corresponding powers of x close to one another.

x^r_i - x^r_j = [x^(r_i - r_j) - 1] * x^r_j

Prop. 1 Let r > s be rational numbers and x > 0 be real. Then:

i) if x > 1, then x^r > x^s
ii) if x = 1, then x^r = x^s
iii) if x < 1, then x^r < x^s

Proof: This harkens back to something we showed earlier in the Comment. Certainly for any positive integer n, x > 1 if and only if x^n > 1. Restated conversely, x > 1 if and only if x^(1/n) > 1.

First we show that if x > 1, then x^(r-s) > 1. Since r > s, we can express r - s with a common denominator as p/q where p,q are each positive integers. Then as just recalled:

x > 1 ==> x^p > 1 ==> (x^p)^(1/q) > 1 ==> x^(r-s) = x^(p/q) > 1

which suffices upon multiplying both sides by x^s to show:

x > 1 ==> x^r > x^s

This proves part (i) of the Proposition. Part (ii) is trivial. Part (iii) follows from part (i) by applying it to 1/x, since taking reciprocals of positive numbers reverses the direction of an inequality. QED

The result above establishes that for fixed x > 0, the values x^r vary monotonically with r, a nice counterpart to our earlier treatment of monotonicity in x for fixed r. One other result is needed:

Prop. 2 Let x > 1 be a real number. Then

i) {x^n: n = 1,2,3,...} increases without limit.
ii) {x^(1/n): n = 1,2,3,...} converges to 1

Proof: (i) Clearly x > 1 implies:

x < x^2 < x^3 < ...

so the sequence in part (i) is strictly increasing. Therefore it either increases without limit (tends to +oo), or it must have as a limit a least upper bound (a fact which can be rigorously deduced from the Completeness Property of Real Numbers), say u. Since x > 1, u/x < u and therefore some integer n is such that:

x^n > u/x

But then x^(n+1) > u, contradicting that u was an upper bound. Thus the sequence increases without limit (tends to +oo).
(ii) It is a little less obvious, but true, that the sequence:

x > x^(1/2) > x^(1/3) > ...

is monotone decreasing. Let m < n be two positive integers, so that assuming x > 1 still, x^m < x^n. Now the mn'th root function is monotone increasing so applying it to both sides of that inequality gives:

x^(1/n) = (x^m)^(1/mn) < (x^n)^(1/mn) = x^(1/m)

so x^(1/m) > x^(1/n) when m < n, as desired. Furthermore since x > 1, we know x^(1/n) > 1^(1/n) = 1, and thus 1 is a lower bound on the root sequence. It remains to show that 1 is a greatest lower bound and therefore the limit of the monotone decreasing sequence of roots.

Suppose instead that some b > 1 is also a lower bound of x^(1/n) for all positive integers n:

b < x^(1/n)

Then b^n < x for all positive integers n. In other words b > 1, but the sequence {b^n} has finite upper bound x, which contradicts part (i) of this proposition. So no such lower bound b > 1 exists, and the greatest lower bound of {x^(1/n)} is 1. As the sequence is monotonic, once the sequence is within epsilon > 0 of 1, it remains "within epsilon" of 1, so 1 is the limit to which the sequence converges. QED

Before we reach for the climactic proof of the power rule for real exponents, let's first warm up by arguing that the laws of exponents continue to apply, and for that matter that a real power of x is well-defined by taking a limit on rational exponents:

Thm. (Laws of Exponents, Real Powers) Let r > 0 be a real number, which is the limit of a sequence of positive rational numbers {r_i}. Then for any x > 0:

f(x) = limit as i --> oo of x^r_i

exists and is the same for any positive rational sequence {r_i} chosen. Moreover the laws of exponents hold for real powers r,s and positive real bases x,y:

i) (x^r)^s = x^(rs)
ii) (x^r)(x^s) = x^(r+s)
iii) (x^r)(y^r) = (xy)^r

Proof: It suffices to show the limit f(x) exists in order to show that f(x) = x^r is well-defined, independent of the choice of rational sequence converging to r.
For if two positive rational sequences both converge to r, we can combine them, interlacing them as odd and even entries into one sequence whose limit must then be common to both subsequences. In particular if r is actually a rational number, our "new" definition must secretly agree with the old one by virtue of considering the constant sequence r_i = r.

We claim that {x^r_i} is a Cauchy sequence of real numbers, which is sufficient by the Completeness Property to show convergence. The logic is:

(1) Since {r_i} converges to r, {r_i} is a Cauchy sequence. That is, given any epsilon > 0, there exists M such that for all i,j > M, |r_i - r_j| is always less than epsilon.

(2) In Prop. 2 (ii) above we showed that for any x > 1, the sequence {x^(1/n)} converges to 1; taking reciprocals gives the same for 0 < x < 1, and the case x = 1 is trivial. So for fixed x > 0 we can specify N such that by the monotonicity shown in Prop. 1:

rational s in (0,1/N) ==> |x^s - 1| < epsilon

for any desired epsilon > 0.

(3) Putting both facts together, for fixed x > 0, there exists for any epsilon > 0 an integer M such that for all i,j > M we have |r_i - r_j| less than some 1/N which guarantees:

|x^r_i - x^r_j| = |x^|r_i - r_j| - 1| * x^min(r_i,r_j) < epsilon * C

where C is an upper bound on {x^r_i}, say x^R where R is an upper bound on {r_i} if x > 1, or simply 1 if x <= 1. Since epsilon can be as small as we please, this shows the sequence {x^r_i} is Cauchy, and thus convergent.

Once we have the definition of x^r as a limit from the rational exponent cases, the laws of exponents (i)-(iii) follow easily. Let us show the third of these in some detail:

(x^r)(y^r) = ( limit as i --> oo of x^r_i ) ( limit as i --> oo of y^r_i )
           = limit as i --> oo of (x^r_i)(y^r_i)
           = limit as i --> oo of (xy)^r_i
           = (xy)^r

where we've used only that a product of two limits which exist is the limit of the corresponding products, together with the previously established law of exponents for the rational case. Parts (i) and (ii) are similar. QED

Thm.
(Power Rule for Positive Real Exponents)

Let r > 0 be a real number.  Then f(x) = x^r is a continuous,
monotone increasing function from positive real numbers to positive
real numbers with inverse g(x) = x^(1/r).  Also f is differentiable,
and:

  f'(x) = r * x^(r-1)

Proof:

Having developed all the "machinery" above, it is now
straightforward to prove the power rule continues to hold for
positive real exponents.  Of course if r is rational, we are already
done.  So let's assume r is irrational.

One way to show f(x) = x^r is continuous and increasing is to jump
right into showing that it is differentiable with positive
derivative.  For example the laws of exponents allow us to reduce
the question of the derivative f'(x) for general x to that of the
derivative at x = 1:

                     (x + h)^r - x^r
  f'(x)  =   limit   ---------------
            h --> 0         h

                     (1 + h/x)^r - 1
         =   limit   --------------- * x^(r-1)
            h --> 0        h/x

         =  f'(1) * x^(r-1)

This simplification isn't essential, as the way we are about to
show f'(1) = r would really work for any argument x, but it will
make the notation and (hopefully) the presentation clearer.

Recalling the monotonicity properties of Prop. 1, it should be
evident that for rational sequences {r_i} converging to r from
above and {s_i} converging to r from below, we have for all i:

  x > 1  implies  x^s_i < x^r < x^r_i
  x = 1  implies  x^s_i = x^r = x^r_i
  x < 1  implies  x^s_i > x^r > x^r_i

In other words the graph of f(x) = x^r is "pinched" between the
family of curves x^s_i and x^r_i.  Since their curves are strictly
monotone increasing, the curve x^r must also be increasing at 1.
In particular since for h > 0:

  (1 + h)^s_i < (1 + h)^r < (1 + h)^r_i
  (1 - h)^s_i > (1 - h)^r > (1 - h)^r_i

we can "squeeze" the limits of the difference quotients:

                     (1+h)^r - 1
  f'(1)  =   limit   -----------  =  r
            h --> 0       h

because both "side" limits as i --> oo agree:

                         (1+h)^r_i - 1
    limit   (   limit    ------------- )  =   limit    r_i  =  r
   i --> oo    h --> 0         h             i --> oo

                         (1+h)^s_i - 1
    limit   (   limit    ------------- )  =   limit    s_i  =  r
   i --> oo    h --> 0         h             i --> oo

Therefore in general f'(x) = f'(1) * x^(r-1) = r * x^(r-1).

The demonstration that g(x) = x^(1/r) is the inverse function to
f(x) = x^r is an even more immediate application of the laws of
exponents, namely part (i) of the preceding Theorem:

  g(f(x)) = (x^r)^(1/r) = x^(r * 1/r) = x^1 = x

QED

We finish up by filling in the gap for negative exponents.

Corollary (Power Rule for All Real Exponents)

If we extend the definition f(x) = x^r to r < 0 by allowing the
limit of a general sequence of rational numbers r_i --> r, then the
power rule and other properties continue to hold.  The only
difference worth mentioning is that when r < 0, f(x) is monotone
decreasing.

Proof:

Since the limit of reciprocals of a sequence converging to a nonzero
limit is the reciprocal of that limit, the result of defining:

  f(x)  =   limit    x^r_i
           i --> oo

for a sequence of negative rational numbers converging to r < 0 is
the same as (all limits taken as i --> oo):

  limit x^r_i  =  limit x^-|r_i|

               =  1 / limit x^|r_i|

               =  1 / x^|r|

so as before we can take the derivative of f(x) by applying the
simplified quotient rule:

  f'(x) = -|r| * x^(|r|-1) / x^(2|r|)

        = r * x^(|r| - 2|r| - 1)

        = r * x^(-|r| - 1)

        = r * x^(r-1)

QED
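As a numerical illustration of defining a real power as a limit of rational powers, here is a minimal sketch in Python. It assumes x = 3 and r = SQRT(2), and uses decimal truncations 14/10, 141/100, ... as the rational sequence. The helper names (`int_power`, `nth_root`, `rational_power`, `sqrt2_truncations`) are my own and not part of the argument above; the roots are found by bisection, so only integer powers are ever computed.

```python
import math


def int_power(x, n):
    """x^n for integer n >= 0 by repeated multiplication."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result


def nth_root(z, n, steps=100):
    """The positive x with x^n = z, found by bisection between 1 and z."""
    lo, hi = (1.0, z) if z >= 1.0 else (z, 1.0)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if int_power(mid, n) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def rational_power(x, p, q):
    """x^(p/q) as f_p(g_q(x)): a q'th root followed by a p'th power."""
    return int_power(nth_root(x, q), p)


def sqrt2_truncations(digits):
    """Rational truncations p/q of SQRT(2) with q = 10, 100, ..."""
    return [(math.isqrt(2 * 10 ** (2 * k)), 10 ** k)
            for k in range(1, digits + 1)]


x = 3.0
approximations = [rational_power(x, p, q) for p, q in sqrt2_truncations(4)]
for (p, q), value in zip(sqrt2_truncations(4), approximations):
    print(f"{x}^({p}/{q}) = {value:.6f}")
```

Because x > 1 and the truncations increase, Prop. 1 predicts the printed values increase as well, and the Cauchy argument predicts they settle down as more digits are used.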

upstartaudio-ga rated this answer: 5 out of 5 stars

and gave an additional tip of: $25.00

Thank you for the insightful and thorough answer to my question.

Comments

Subject: Re: Chicken and Egg: Calculus and the Binomial Theorem
From: mathtalk-ga on 19 Dec 2004 19:04 PST

To prove the well-known rule of derivatives:

d(x^r)/dx = r * x^(r-1)

we must define a function f(x) = x^r on a suitable domain
and establish certain facts about it.

For example, what do we mean by 4^(1/2)?  While 4 has two
square roots, 2 and -2, we would need to define f(x) in a
way that assigns only one value.  A standard choice would
be to use the positive root, but if we ask about negative
values of x, it's unclear that any "standard choice" can
be made for fractional r.  Also, for negative exponents r,
there is even a problem with x = 0.

So let's define f on the domain of positive real numbers.
Note that x = 0 is excluded.

The outline of our development is to define and prove the
properties of f(x) = x^r for successively larger sets of
exponents r:

Case 1: r an integer
Case 2: r is rational
Case 3: r is real

The properties we want to establish are these:

(i)   The function f(x) = x^r is positive, and:

        strictly monotone increasing if r > 0,
        constant if r = 0, and
        strictly monotone decreasing if r < 0.

(ii)  The function f(x) = x^r is continuous and unless
      r = 0 has a continuous inverse on the same domain.

(iii) The function f(x) = x^r is differentiable, and:

        f'(x) = r * x^(r-1).

We shall not fill in every detail, but aim to give as much
as necessary to make clear that no circular reasoning is
involved.  I'll emphasize details involving derivatives or
proving differentiability.  Basic results about limits and
continuity will be assumed or at the least given a deferred
treatment.  In particular I think it's fairly evident that
typical epsilon/delta arguments do not depend on appeals to
the Power Rule or the Binomial Theorem.

Case 1:  f(x) = x^r for integer exponents r
===========================================

Our main tool for these exponents is proof by induction
to handle all the nonnegative integers.  Negative values
of r will then be treated as reciprocals of the positive
cases.

Basis cases
-----------

For r = 0, define f(x) = x^0 as the constant function 1.
This positive function is continuous, with derivative:

  f'(x) = 0 = 0 * x^(0-1)

For r = 1, we have f(x) = x.  It is positive because of
the domain restriction, strictly increasing, continuous,
equal to its own inverse, and differentiable, with:

  f'(x) = 1 = 1 * x^(1-1)

Now define f(x) for larger exponents r inductively using:

  x^(r+1) = x * (x^r)

Note that the basis steps for (i)-(iii) are dealt with,
so we have only the induction steps to do in each part.

Induction part (i)
------------------

For (i) the induction step is to use prior case x^r > 0
and x > 0 to conclude:

  x^(r+1) = x * (x^r) > 0
  
ie. the product of positive numbers is again positive.

We also prove that f(x) = x^(r+1) is strictly monotone
increasing.  Suppose that x > y > 0.  By the induction
hypothesis x^r > y^r.  Then:

  x^(r+1) = x * (x^r) > x * (y^r) > y * (y^r) = y^(r+1)

Induction part (ii)
-------------------

For (ii) the induction step requires a lemma that the
product of two continuous real functions is continuous,
which I can supply if desired.  It depends on knowing
that a limit of a product exists when each factor has
a limit, a fact we will also need for the derivatives.

There are some fine details in showing f(x) = x^(r+1)
has a continuous inverse, and if we were not hurrying
on to noninteger exponents, I would linger over them.

The strict monotonicity of f(x) implies that it is 1-1.
We must also show that f(x) is onto the positive real
numbers.  Let z > 0 be a real number; either z < 1, or
z = 1, or z > 1.  Now f(1) = 1 by induction, and the
cases z < 1 and z > 1 are symmetric, so I'll do one of
them and leave the other as an exercise.

Suppose z > 1.  Then z^(r+1) > z > 1 because z^r > 1,
and in other words f(z) > z > f(1).  The Intermediate
Value Theorem then implies there exists x between z
and 1 such that f(x) = z.  The case z < 1 would be
argued similarly, reversing the inequality directions
as necessary.  Together these establish that f(x) is
onto and thus has a functional inverse (f^-1)(x).

A further observation is that strict monotonicity of
(f^-1) follows from that of f, because f(x) > f(y)
is only consistent with x > y (ie. x = y or x < y lead
to a contradiction).

Then continuity of (f^-1) follows easily enough. Let
epsilon > 0 be given, small enough so that interval
(x - epsilon,x + epsilon) contains only positive real
numbers around some fixed x > 0.  Then:

  I = (f(x - epsilon),f(x + epsilon))

is an open interval containing f(x).  Hence delta > 0
exists such that (f(x) - delta, f(x) + delta) is
contained in I, and by monotonicity:

  0 < | y - f(x) | < delta 

implies:

  | (f^-1)(y) - x | < epsilon

This shows that (f^-1) is continuous at f(x), which is
a typical point in its domain (of positive real numbers).
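
The "onto" argument above can be mirrored numerically.  Since
f(x) = x^n is continuous and strictly increasing, repeatedly
halving the interval between 1 and z homes in on the x with
f(x) = z.  This is only a sketch in Python; the names `int_power`
and `nth_root` are mine, and a fixed number of halvings stands in
for the exact limit.

```python
def int_power(x, n):
    """x^n for integer n >= 0, built from x^(r+1) = x * (x^r)."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result


def nth_root(z, n, steps=100):
    """Find x > 0 with x^n = z by bisection, using only integer powers."""
    # For z > 1 we have f(1) < z < f(z), so the root lies in [1, z];
    # for z < 1 the inequalities reverse and the root lies in [z, 1].
    lo, hi = (1.0, z) if z >= 1.0 else (z, 1.0)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if int_power(mid, n) < z:
            lo = mid      # f is increasing, so the root is above mid
        else:
            hi = mid      # otherwise the root is at or below mid
    return (lo + hi) / 2.0


print(nth_root(8.0, 3))
print(nth_root(0.25, 2))
```

Monotonicity of f is what makes each halving step safe: comparing
f(mid) with z tells us on which side of mid the root must lie.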

Induction part (iii)
--------------------

For doing our induction on (iii), we need as a lemma
the product rule of differentiation:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Lemma  Let F(x) = G(x)*H(x) and G,H be differentiable.
-----

Then F is differentiable and:

  F'(x) = G(x)*H'(x) + G'(x)*H(x)

Proof:
------

By definition of the derivative:

  F'(x) =   lim   (F(x+h) - F(x))/h
           h -> 0

provided this limit can be shown to exist.

A standard trick of adding and subtracting the
same term:

  F(x+h) - F(x) = G(x+h)*H(x+h) - G(x)*H(x)

                =  G(x+h)*H(x+h) - G(x+h)*H(x)
                
                  + G(x+h)*H(x) - G(x)*H(x)

tells us that:

 F(x+h) - F(x)     G(x+h)*H(x+h) - G(x+h)*H(x)
 -------------  =  ---------------------------
       h                        h
       
                     G(x+h)*H(x) - G(x)*H(x)
                   + -----------------------
                                h

The left hand side will have a limit as h tends
to zero if both terms on the right have limits
as h tends to zero, and the limit of the left
hand side will be the sum of the two respective
limits of terms on the right.

The limit of the first of these terms is this:

  lim   ( G(x+h)*H(x+h) - G(x+h)*H(x) )/h
 h -> 0

   =  lim    G(x+h)  *  (H(x+h) - H(x))/h
     h -> 0

Here we use the standard result that the limit of a
product is equal to the product of the limits of its
factors.  The first factor has a limit:

  lim   G(x+h)  =  G(x)
 h -> 0

because G, being differentiable, is continuous at x.
The second factor has a limit because H differentiable
means:

  lim   (H(x+h) - H(x))/h  =  H'(x)
 h -> 0 

Therefore the first term on the right hand side above
tends to G(x)*H'(x) as h goes to 0.

The second term on the right hand side is actually
easier, as the common factor H(x) is constant with
respect to the limit on h, and thus it tends to
a limit of G'(x)*H(x) as h goes to 0.

Combining these two limits gives that:

  lim   (F(x+h) - F(x))/h
 h -> 0

exists and equals G(x)*H'(x) + G'(x)*H(x).  QED

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
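
The add-and-subtract trick in the lemma can be checked with a
small numerical sketch in Python.  The functions G(x) = x^2 and
H(x) = 1/x and the step size h are my own choices for
illustration, and `product_quotient_terms` is a hypothetical
helper name.

```python
def product_quotient_terms(G, H, x, h):
    """Split the difference quotient of G*H into the lemma's two terms."""
    first = G(x + h) * (H(x + h) - H(x)) / h    # tends to G(x)*H'(x)
    second = (G(x + h) - G(x)) / h * H(x)       # tends to G'(x)*H(x)
    return first, second


def G(x):
    return x * x        # G'(x) = 2x


def H(x):
    return 1.0 / x      # H'(x) = -1/x^2


x, h = 2.0, 1e-6
first, second = product_quotient_terms(G, H, x, h)
direct = (G(x + h) * H(x + h) - G(x) * H(x)) / h

print(first, second, first + second, direct)
```

At x = 2 the first term should be close to G(2)*H'(2) = -1 and
the second close to G'(2)*H(2) = 2, and their sum matches the
directly computed difference quotient of F(x) = G(x)*H(x) = x.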

Now we apply the product rule lemma we just proved
to show that the derivative exists (and the power rule
holds) for positive integer r by induction:

  f(x) = x^(r+1) = x * x^r

  f'(x) =  x * (r * x^(r-1)) + 1*(x^r)

        =  r * x^r  +  x^r
        
        = (r+1) * x^r
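
The induction can also be run as a computation: carry the pair
(x^r, derivative of x^r) along and apply the product rule at each
step.  A sketch in Python, with `power_with_derivative` being a
name of my own choosing:

```python
def power_with_derivative(x, n):
    """Return (x^n, d(x^n)/dx) for integer n >= 0 via the induction step."""
    value, deriv = 1.0, 0.0            # basis case r = 0: f = 1, f' = 0
    for _ in range(n):
        # product rule applied to x * (x^r):  f' = x * (x^r)' + 1 * (x^r)
        value, deriv = x * value, x * deriv + value
    return value, deriv


print(power_with_derivative(3.0, 4))   # compare with 3^4 and 4 * 3^3
```

No closed formula for the derivative is used inside the loop, so
agreement with r * x^(r-1) is a consequence of the product rule
alone.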


Negative integer exponents
--------------------------

Having established the cases of r = 0 and positive
exponents r, there remain the cases r < 0, which we
define by taking reciprocals:

  x^r =  1/(x^|r|)

Obviously the reciprocal of a positive number is also
positive.  Furthermore if x > y > 0, where previously
we showed:

  x^|r| > y^|r|

it now follows (for r < 0):

  x^r = 1/x^|r| < 1/y^|r| = y^r

so that f(x) is strictly monotone decreasing.

The reciprocal of a _nonzero_ continuous function is
continuous, so that part of (ii) holds for r < 0 too.
This follows because the function 1/x is continuous on
the positive reals and a composition of continuous
functions is continuous.

Furthermore 1/x is its own (continuous) inverse, so
that the inverse of f(x) = x^r is continuous as well.

Finally we show differentiability of f(x) = x^r and
that it satisfies the power rule.  One can consider
f(x) as the composition of two functions, x^|r| and
1/x, and apply the chain rule. Alternatively we can
prove and apply a simplified quotient rule:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Lemma  Let F(x) = 1/H(x) with differentiable H > 0.
----- 

Then F is differentiable and:

  F'(x)  =  - H'(x)/(H(x))^2

Proof:
------

We can be a little more concise in presenting this
proof.  Again rewrite the limit of the difference
quotient that defines the derivative F'(x) until we
get the desired result:

                  (1/H(x+h)) - (1/H(x))
  F'(x)  =  lim   ---------------------
           h -> 0           h

                   H(x) - H(x+h)          1
         =  lim   --------------- * -------------
           h -> 0        h           H(x)*H(x+h)


         =     - H'(x) * 1/(H(x))^2

QED

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

[Note that this simplified quotient rule, having only
a constant numerator, will combine with the product rule
proved earlier to give the full quotient rule.]

Now apply this to a case of negative integer exponent r:

  f(x) = 1/x^|r|,  where r = -|r|
  
  f'(x) = -|r| * x^(|r|-1) / (x^|r|)^2
  
        = -|r| * x^(|r| - 1 - 2|r|)
        
        = -|r| * x^(-|r| - 1)
        
        = r * x^(r-1)
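
The same calculation can be sketched in Python by first building
H(x) = x^|r| and H'(x) with the product rule and then applying the
simplified quotient rule.  The function name
`neg_power_with_derivative` is hypothetical.

```python
def neg_power_with_derivative(x, n):
    """Return (x^(-n), its derivative) for integer n > 0 via F' = -H'/H^2."""
    h_val, h_der = 1.0, 0.0
    for _ in range(n):
        # product rule step for H(x) = x^n, as in the positive case
        h_val, h_der = x * h_val, x * h_der + h_val
    return 1.0 / h_val, -h_der / (h_val * h_val)


print(neg_power_with_derivative(2.0, 3))   # compare with 2^-3 and -3 * 2^-4
```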

This completes the treatment of the integer exponents.

regards, mathtalk-ga


Hi, upstartaudio-ga:

Calculus is a difficult class to master, either from the teacher's
or the student's perspective.  Its utility pushes it of necessity to
the beginning of an undergraduate curriculum, but a thorough
understanding of its rigorous foundations is ordinarily deferred to
later.

The generalized binomial theorem may be deduced as a special case of
Taylor series, for powers x^r where r is other than a nonnegative
integer.  So a key development is the definition of such a function
x^r for general exponents, satisfactory enough to use to deduce its
derivatives.

Because the power rule:

  d(x^r)/dx = r * x^(r-1)

states the first derivative of x^r in simple terms of a constant
times a similar function, establishing the form of the first
derivative yields by induction the forms of all higher derivatives.

The big distinction between nonnegative integer r and other
exponents is that in the former case, the higher derivatives must
eventually descend to zero, so that a Taylor series expansion
(regardless of the choice of "center") will have finitely many terms
if and only if the exponent r is a nonnegative integer.

Here's an outline of the steps we will follow:

1. Recap of properties of function x^r and its inverse for integer r
2. Differentiability of the r'th root function (inverse of x^r)
3. Well-definedness of x^r when r = p/q is a ratio of nonzero integers
4. Limits of real exponents as Cauchy sequences of rational exponents

regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 15 Jan 2005 22:37 PST

1. Previously the function f(x) = x^r and its inverse g(x) were
shown, for integer exponents r, to have the following properties on
the domain of positive real x:

(i)   The function f(x) = x^r is positive, and:

        strictly monotone increasing if r > 0,
        constantly 1 if r = 0, and
        strictly monotone decreasing if r < 0.

(ii)  The function f(x) = x^r is continuous, and provided r is
      nonzero, g(x) is continuous (on the same domain).

(iii) The function f(x) = x^r is differentiable, and:

        f'(x) = r * x^(r-1).

This third property is precisely the power rule we seek to establish
for all r, but so far we've only proved it for integers.

Recall that we took an approach of proving the rule for positive r,
then deducing the rule for negative r.  The same technique can be
applied when r is not an integer, leaving us free to concentrate on
results for positive r (where the functions x^r will again be
monotone increasing).

The immediate open question is the differentiability of g(x), to
which we next turn.

2. A simple version of the Inverse Function Theorem suffices for our
needs.

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Thm. (Inverse Function Theorem)

Suppose f:[a,b]->[c,d] is a real function, mapping [a,b] onto
[c,d], with a positive derivative f' at each point of [a,b].  Then
its inverse g:[c,d]->[a,b] exists and has a positive derivative g'
as well (at each point of [c,d]), and the following is true:

  g'(f(x)) = 1/f'(x)  for all x in [a,b].

Correspondingly:

  g'(y) = 1/f'(g(y))  for all y in [c,d].

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Before we launch into the proof, let's see the implication of this
for the function f(x) = x^r.  For integer r > 0, f:[a,b]->[a^r,b^r]
has a positive derivative at any point.
Therefore its inverse does, and:

  g'(f(x)) = 1/f'(x) = 1/(r * x^(r-1))

and if we take x to be g(y), the r'th root of y, then:

  g'(y) = 1/(r * (g(y))^(r-1))

If we were to allow ourselves to write g(y) = y^(1/r), that notation
plus the laws of exponents would lead to (1/r)( y^((1/r)-1) ).
However this formal acknowledgement of the power rule for y^(1/r) is
a bit premature.  For the moment we are content to claim only that
our g is differentiable and has a positive derivative.

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Proof of Theorem:

Since g(f(x)) = x, if we knew that g was differentiable, the desired
consequence could be obtained by applying the Chain Rule.  But since
we want to prove g is differentiable, we should drill down to the
definition of g' as a limit:

                    g(y+h) - g(y)
  g'(y)  =  limit   -------------
            h -> 0        h

We can analyze this limit by introducing a sequence {h_i} of nonzero
real numbers that tend to 0, and produce a corresponding sequence
{x_i} by defining:

  x_i = g(y + h_i)

for any fixed y in the domain.

The endpoints y = c,d require slightly special handling, as the
derivatives, etc. at these points involve one-sided limits.  But
these are adequately handled by taking the sequence {h_i} to
approach zero from above for y = c, and from below for y = d, so
that {x_i} approaches a from above (resp. b from below).  Apart from
this detail the arguments for the one-sided derivatives at the
endpoints are the same as the two-sided limit argument we are about
to detail for the interior point cases.
Since y + h_i = f(x_i) by definition of x_i and f(g(y)) = y, we can
write x = g(y), note that x_i tends to x by the continuity of g, and
conclude that the limit can be rewritten:

            g(y+h_i) - g(y)                  x_i - x
   limit    ---------------  =   limit    -------------
  h_i -> 0        h_i           x_i -> x  f(x_i) - f(x)

which we recognize to be the reciprocal of the limit defining f'(x):

                      f(x_i) - f(x)
  f'(x)  =   limit    -------------
            x_i -> x     x_i - x

Thus, provided y = f(x), we know the latter positive limit
guarantees the limit exists for its positive reciprocal, and the two
limits are reciprocals:

  g'(y) = 1 / f'(x) = 1/f'(g(y)).

QED

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

This proves the Theorem and in turn suffices to establish the
differentiability of g, the inverse of f(x) = x^r, at least for
positive integers r (so that both f and g are increasing functions).

It is therefore certain, by the Chain Rule for instance, that if we
were to define x^r for some ratio r = p/q of two positive integers
as the composition:

  f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of f_q(x) = x^q, then
that composition would be differentiable.  In fact, bearing in mind
that we assume p,q > 0, the composition would be a differentiable,
monotone increasing continuous function that is continuously (even
differentiably) invertible by the monotone increasing function:

  f_q( g_p(x) )

Before we allow ourselves the luxury of restating these as a form of
the Power Rule, it is important to justify the notations:

  f_p( g_q(x) ) = x^(p/q)

  f_q( g_p(x) ) = x^(q/p)

and the "rational exponent" extensions of the laws of exponents in
particular.  This then is the purpose of our next discussion.

(to be continued -- mathtalk-ga)
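
As a sanity check on the Inverse Function Theorem, the relation g'(y) = 1/f'(g(y)) can be tested numerically for f(x) = x^3.  This is only a sketch in Python: the bisection-based `cube_root`, the point y = 8 and the step size h are all choices made for illustration.

```python
def cube(x):
    return x * x * x


def cube_root(y, steps=100):
    """Inverse of cube on the positive reals, found by bisection."""
    lo, hi = (1.0, y) if y >= 1.0 else (y, 1.0)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if cube(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


y, h = 8.0, 1e-6
x = cube_root(y)

# Difference quotient for g at y, computed without any formula for g'.
quotient = (cube_root(y + h) - cube_root(y)) / h

# What the theorem predicts: 1/f'(g(y)) with f'(x) = 3 * x^2.
predicted = 1.0 / (3.0 * x * x)

print(quotient, predicted)
```

The two printed numbers should agree to several decimal places, with the small gap coming from the finite step h.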
Request for Answer Clarification by upstartaudio-ga on 16 Jan 2005 16:08 PST

A final comment might be in order.  We've shown the validity of the
power rule without invoking the binomial theorem, and demonstrated
that it holds for the rational and real cases as long as x >= 0.

One of my textbooks ignores, while the other covers, the fact that
x^r is not defined when x < 0, therefore the function isn't
differentiable in that case.  To prove it is undefined, we need only
notice that if r = a/b, then it is also equal to (2a) / (2b) or
(3a) / (3b).  Now if b is odd, then x^r is an even root raised to a
power, and if b is even, then x^r is an odd root raised to a power.
So if x is negative, one of these cases isn't defined in the set of
real numbers.

The problem doesn't seem to go away with x complex, either, since
there are infinitely many solutions if r is the limit of an
irrational number.  Even if we choose one of these roots by
definition, we still have the ugly situation that the (qth root of
x) quantity raised to the power p is not the same number as the qth
root of x^p.

So I don't see that we can get around restricting the domain of x to
positive numbers.  But that is the subject of another question...
Clarification of Answer by mathtalk-ga on 17 Jan 2005 21:27 PST

3. We have shown that given rational r = p/q where p,q are positive
integers (so far), the function:

  f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of x^q, has many
properties (monotonicity, continuity, differentiability) in common
with the powers x^r for integer r.

But because there are multiple ways to express any rational r as a
ratio of two integers, we need to show that all possible choices
lead to the same function x^r.  The mathematical shorthand says we
need to show that:

  x^r = f_p( g_q(x) )

is well-defined, ie. that the formula's apparent dependence on the
choice of p,q is only superficial.

The key to this is purely algebraic, namely showing that the various
functions f_p and g_q all commute, so that the order in which they
are composed is immaterial to the final result.

To begin with, the power functions f_p and f_q commute for any two
positive integer powers by the associative law of multiplication and
mathematical induction:

  f_p( f_q(x) ) = (x^q)^p = x^(pq) = (x^p)^q = f_q( f_p(x) )

Stated another way, this gives an unfamiliar cast to a law of
exponents:

  f_p o f_q = f_pq = f_qp = f_q o f_p

From this it can be deduced that the corresponding function inverses
also commute, because where inverses exist for two functions, the
inverse of their composition is the result of composing their
inverses in the opposite order.  For example:

  g_q( g_p( f_p( f_q(x) ) ) ) = x

when the adjacent inverses "cancel" one another, which says that
g_q( g_p(x) ) is the inverse of f_p( f_q(x) ).  It follows then that
g_q and g_p must commute, because f_p and f_q commute:

  g_q o g_p = (f_p o f_q)^-1 = (f_q o f_p)^-1 = g_p o g_q

We further verify that f_p and g_q commute as well, so that it
really doesn't matter whether we define x^r by composing them in one
order or the other.
Again we use the fact that f_p and f_q commute (with respect to
function composition):

  f_q( f_p( g_q(x) ) ) = f_p( f_q( g_q(x) ) ) = f_p(x)

Therefore after applying g_q to both sides:

  f_p( g_q(x) ) = g_q( f_p(x) )

which demonstrates f_p commutes with g_q.

One point that we've been a bit cavalier about is that g_q is both a
left and a right inverse for f_q.  That is:

  f_q( g_q(x) ) = x = g_q( f_q(x) )

Whenever a function maps its domain 1-1 and onto itself, an inverse
is two-sided.  This symmetric outcome may easily be deduced from the
symmetry of the relations:

  y = f_q(x)  <==>  x = g_q(y)

where for any x there exists y to satisfy the condition, and
conversely for any y there exists such an x.

In any case these commutativity properties establish that x^r is
well-defined, because if we take r = (cp)/(cq), for c any positive
integer:

  f_cp( g_cq(x) ) = f_p( f_c( g_c( g_q(x) ) ) ) = f_p( g_q(x) )

This assures us that in the particular case p is divisible by q, so
that r is an integer, our "new" definition of x^r fully agrees with
the old definition f_r based solely on arithmetic.

The more general point of these observations is that we aren't
misled by using the exponential notation x^r with rational r and
rational s, because the familiar laws of exponents hold:

  (x^r)^s = x^(rs)

  (x^r)(x^s) = x^(r+s)

  (x^r)(y^r) = (xy)^r

for any x,y > 0 and rational r,s.

The proofs of all these are purely algebraic, and for that reason I
will not go into more details.  However we will finish this section
with a bit of calculus, a derivation of the power rule for positive
rational exponents, then using the quotient rule to extend it to
negative rational exponents.

Recall we have really only dealt with r > 0 in defining:

  x^r = f_p( g_q(x) )

when r = p/q and p,q > 0 are integers.
The Chain Rule and the Inverse Function Theorem then give us:

  d(x^r)/dx = f'_p( g_q(x) ) * g'_q(x)

            = p * (g_q(x))^(p-1) * [1/f'_q( g_q(x) )]

            = p * x^((p-1)/q) * (1/q) * (1/x^((q-1)/q))

            = (p/q) * x^((p-q)/q)

            = (p/q) * x^((p/q)-1)

            = r * x^(r-1)

That is, we've shown the Power Rule for rational r > 0.

Now we've already used in the computation above that dividing by x^r
is equivalent to multiplying by x^-r, so it may be worth pointing
out that the commitment to treat negative exponents as reciprocals
is implied by the second law of exponents cited above, with s = -r:

  (x^r)(x^-r) = x^0 = 1

Therefore on the calculus side of things we need only apply the
quotient rule to determine the derivative of x^-r:

  d(x^-r)/dx = d(1/(x^r))/dx

                  d(x^r)/dx
             = -  ---------
                   (x^r)^2

             = -r x^(r-1) * x^(-2r)

             = -r * x^(-r-1)

With this we've also shown the Power Rule for rational r < 0.  This
is exactly the same calculation as we gave before on the integer
exponents, but as acknowledged above, some algebraic preliminaries
were necessary to assure they are sensible for the rational
exponents.

At last we come to our final step, extending our exponents to the
general real case by taking limits of Cauchy sequences of rational
exponents.  It should not be surprising that we can show, once power
functions x^a are "pinched" between monotone power functions x^r
both above and below, that x^a must also be monotone, etc.

(to be continued)
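
The well-definedness of x^(p/q) and the rational Power Rule can both be spot-checked numerically.  The following is a sketch in Python under my own choices (x = 5, r = 2/3, step h = 1e-6); the helper names are hypothetical, and roots are computed by bisection so that only integer powers are used.

```python
def int_power(x, n):
    """f_n(x) = x^n for integer n >= 0 by repeated multiplication."""
    result = 1.0
    for _ in range(n):
        result *= x
    return result


def nth_root(z, n, steps=100):
    """g_n(z): the positive x with x^n = z, found by bisection."""
    lo, hi = (1.0, z) if z >= 1.0 else (z, 1.0)
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if int_power(mid, n) < z:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def rational_power(x, p, q):
    """x^(p/q) defined as f_p( g_q(x) )."""
    return int_power(nth_root(x, q), p)


x = 5.0

# r = 2/3 and r = 4/6 should define the same function.
same_ratio_gap = abs(rational_power(x, 2, 3) - rational_power(x, 4, 6))

# f_p and g_q should commute: f_2(g_3(x)) versus g_3(f_2(x)).
swapped_gap = abs(int_power(nth_root(x, 3), 2) - nth_root(int_power(x, 2), 3))

# Difference quotient versus r * x^(r-1), written as r * x^r / x.
h = 1e-6
quotient = (rational_power(x + h, 2, 3) - rational_power(x, 2, 3)) / h
predicted = (2.0 / 3.0) * rational_power(x, 2, 3) / x

print(same_ratio_gap, swapped_gap, quotient, predicted)
```

Both gaps should be at the level of floating-point rounding, and the difference quotient should agree with the predicted derivative up to an error on the order of h.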
Clarification of Answer by mathtalk-ga on 25 Jan 2005 20:21 PST

4. A "construction" of the real numbers in mathematics is often
based on Cauchy sequences of rational numbers.  For every real
number there is a sequence of rational numbers converging to it.
For example, even though SQRT(2) or pi is irrational, their decimal
expansions give us (by way of truncation) convergent sequences of
rational numbers.

If the last section was top-heavy with algebra, this one is
top-heavy with analysis, ie. with fussing over limits and how to
estimate sizes of things.

So far we've defined power functions x^r for all rational exponents
r and determined that their derivatives obey the Power Rule:

  d(x^r)/dx = r * x^(r-1)

To extend our definition to real, irrational exponents a, we need to
take the limit of x^r as r approaches a.  In doing so we will make
free use of the exponent notation and the usual "laws of exponents"
for rational r, whose justification was sketched in the previous
section.

We state without proof the following:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Thm. (Completeness Property of the Real Numbers)

Let {r_i} be a Cauchy sequence of real numbers.  That is, for every
epsilon > 0, there exists integer M > 0 such that for all i,j > M,
|r_i - r_j| < epsilon.

Then the sequence {r_i} converges to a real number.

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

The essential reason the real numbers have this property is because
we "bake in" that property with their construction.  The real
numbers are the "completion" of the rationals with respect to the
usual notion of distance between two numbers, the absolute value of
the difference.

So at any rate every Cauchy sequence of rationals has a unique limit
in the real numbers, and the extension of this fact to Cauchy
sequences of real numbers is the "completeness property".

For our purposes we need to show that if {r_i} is a Cauchy sequence
of rational numbers, then for any fixed real x > 0, {x^r_i} is a
Cauchy sequence of real numbers.
It suffices to have an estimate of |x^r_i - x^r_j| in terms of
|r_i - r_j|.  Intuitively, making the exponents close to one another
puts the corresponding powers of x close to one another:

  x^r_i - x^r_j = [x^(r_i - r_j) - 1] * x^r_j

Prop. 1  Let r > s be rational numbers and x > 0 be real.  Then:

  i)   if x > 1, then x^r > x^s
  ii)  if x = 1, then x^r = x^s
  iii) if x < 1, then x^r < x^s

Proof:

This harkens back to something we showed earlier in the Comment.
Certainly for any positive integer n, x > 1 if and only if x^n > 1.
Restated conversely, x > 1 if and only if x^(1/n) > 1.

First we show that if x > 1, then x^(r-s) > 1.  Since r > s, we can
express r - s with a common denominator as p/q where p,q are each
positive integers.  Then as just recalled:

  x > 1  ==>  x^p > 1

         ==>  (x^p)^(1/q) > 1

         ==>  x^(r-s) = x^(p/q) > 1

which suffices upon multiplying both sides by x^s to show:

  x > 1  ==>  x^r > x^s

This proves part (i) of the Proposition.  Part (ii) is trivial.
Part (iii) follows from part (i) by applying it to 1/x, since taking
reciprocals of positive numbers reverses the direction of an
inequality.

QED

The result above establishes that for fixed x > 0, the values x^r
vary monotonically with r, a nice counterpart to our earlier
treatment of monotonicity in x for fixed r.  One other result is
needed:

Prop. 2  Let x > 1 be a real number.  Then:

  i)  {x^n: n = 1,2,3,...} increases without limit.
  ii) {x^(1/n): n = 1,2,3,...} converges to 1.

Proof:

(i) Clearly x > 1 implies:

  x < x^2 < x^3 < ...

so the sequence in part (i) is strictly increasing.  Therefore it
either increases without limit (tends to +oo), or it must have as a
limit a least upper bound (a fact which can be rigorously deduced
from the Completeness Property of Real Numbers), say u.

Since x > 1, u/x < u and therefore some integer n is such that:

  x^n > u/x

But then x^(n+1) > u, contradicting that u was an upper bound.  Thus
the sequence increases without limit (tends to +oo).
(ii) It is a little less obvious, but true, that the sequence: x > x^(1/2) > x^(1/3) > ... is monotone decreasing. Let m < n be two positive integers, so that assuming x > 1 still, x^m < x^n. Now the mn'th root function is monotone increasing so applying it to both sides of that inequality gives: x^(1/n) = (x^m)^(1/mn) < (x^n)^(1/mn) = x^(1/m) so x^(1/m) > x^(1/n) when m < n, as desired. Furthermore since x > 1, we know x^(1/n) > 1^(1/n) = 1, and thus 1 is a lower bound on the root sequence. It remains to show that 1 is a greatest lower bound and therefore the limit of the monotone decreasing sequence of roots. Suppose instead that b > 1 is also a lower bound of x^(1/n) for all positive integers n: b < x^(1/n) Now b^n < x for all integers n. In other words b > 1, but then sequence {b^n} has finite upper bound x, which contradicts part (i) of this proposition. So no such lower bound b > 1 exists, and the greatest lower bound of {x^(1/n)} is 1. As the sequence is monotonic, once the sequence is within epsilon > 0 of 1, it remains "within epsilon" of 1, so 1 is the limit to which the sequence converges. QED Before we reach for the climatic proof of the power rule for real exponents, let's first warm up by arguing that the laws of exponents continue to apply, and for that matter that a real power of x is well-defined by taking a limit on rational exponents: Thm. (Laws of Exponents, Real Powers) Let r > 0 be a real number, which is the limit of a sequence of positive rational numbers {r_i}. Then for any x > 0: f(x) = limit x^r_i i --> oo exists and is the same for any positive rational sequence {r_i} chosen. Moreover the laws of exponents hold for real powers r,s and positive real bases x,y: i) (x^r)^s = x^(rs) ii) (x^r)(x^s) = x^(r+s) iii) (x^r)(y^r) = (xy)^r Proof: It suffices to show the limit f(x) exists, to show that f(x) = x^r is well-defined, independent of the choice of rational sequence converging to r. 
For if two positive rational sequences both converge to r, we can combine them, interlacing them as odd and even entries into one sequence whose limit must then be common to both subsequences. In particular if r is actually a rational number, our "new" definition must secretly agree with the old one by virtue of considering the constant sequence r_i = r.

We claim that {x^r_i} is a Cauchy sequence of real numbers, which is sufficient by the Completeness Property to show convergence. The logic is:

(1) Since {r_i} converges to r, {r_i} is a Cauchy sequence. That is, given any epsilon > 0, there exists M such that for all i,j > M, |r_i - r_j| is always less than epsilon.

(2) In Prop. 2 (ii) above we showed that for any x > 1, the sequence {x^(1/n)} converges to 1. For 0 < x < 1 the same conclusion follows by applying that result to 1/x and taking reciprocals, and for x = 1 it is trivial. So for fixed x > 0 we can specify N such that, by the monotonicity shown in Prop. 1:

  rational s in [0,1/N) ==> |x^s - 1| < epsilon

for any desired epsilon > 0.

(3) Putting both facts together, for fixed x > 0 and any epsilon > 0 there exists an integer M such that for all i,j > M we have |r_i - r_j| less than 1/N. Labeling the two indices so that r_i >= r_j, the identity x^r_i - x^r_j = [x^(r_i - r_j) - 1] * x^r_j then guarantees:

  |x^r_i - x^r_j| = |x^(r_i - r_j) - 1| * x^r_j < epsilon * C

where C is an upper bound on {x^r_i}, say x^R where R is an upper bound on {r_i} if x > 1, or simply 1 if x <= 1. Since epsilon can be as small as we please, this shows the sequence {x^r_i} is Cauchy, and thus convergent.

Once we have the definition of x^r as a limit from the rational exponent cases, the laws of exponents (i)-(iii) follow easily. Let us show the third of these in some detail:

  (x^r)(y^r) = ( limit_{i --> oo} x^r_i ) ( limit_{i --> oo} y^r_i )
             = limit_{i --> oo} (x^r_i)(y^r_i)
             = limit_{i --> oo} (xy)^r_i
             = (xy)^r

where we've used only that a product of two limits which exist is the limit of the corresponding products, together with the previously established law of exponents for the rational case. Parts (i) and (ii) are similar. QED

Thm.
(Power Rule for Positive Real Exponents)  Let r > 0 be a real number. Then f(x) = x^r is a continuous, monotone increasing function from positive real numbers to positive real numbers with inverse g(x) = x^(1/r). Also f is differentiable, and:

  f'(x) = r * x^(r-1)

Proof: Having developed all the "machinery" above, it is now straightforward to prove the power rule continues to hold for positive real exponents. Of course if r is rational, we are already done. So let's assume r is irrational.

One way to show f(x) = x^r is continuous and increasing is to jump right into showing that it is differentiable with positive derivative. For example the laws of exponents allow us to reduce the question of the derivative f'(x) for general x to that of the derivative at x = 1:

  f'(x) = limit_{h --> 0} [ (x + h)^r - x^r ] / h
        = limit_{h --> 0} { [ (1 + h/x)^r - 1 ] / (h/x) } * x^(r-1)
        = f'(1) * x^(r-1)

This simplification isn't essential, as the way we are about to show f'(1) = r would really work for any argument x, but it will make the notation and (hopefully) the presentation clearer.

Recalling the monotonicity properties of Prop. 1, it should be evident that for rational sequences {r_i} converging to r from above and {s_i} converging to r from below, we have for all i:

  x > 1 implies x^s_i < x^r < x^r_i
  x = 1 implies x^s_i = x^r = x^r_i
  x < 1 implies x^s_i > x^r > x^r_i

In other words the graph of f(x) = x^r is "pinched" between the family of curves x^s_i and x^r_i. Since their curves are strictly monotone increasing, the curve x^r must also be increasing at 1.
In particular since for h > 0 (and h < 1 in the second line):

  (1 + h)^s_i < (1 + h)^r < (1 + h)^r_i
  (1 - h)^s_i > (1 - h)^r > (1 - h)^r_i

we can "squeeze" the limits of the difference quotients:

  f'(1) = limit_{h --> 0} [ (1+h)^r - 1 ] / h = r

because both "side" limits as i --> oo agree:

  limit_{i --> oo} ( limit_{h --> 0} [ (1+h)^r_i - 1 ] / h ) = limit_{i --> oo} r_i = r

  limit_{i --> oo} ( limit_{h --> 0} [ (1+h)^s_i - 1 ] / h ) = limit_{i --> oo} s_i = r

Therefore in general f'(x) = f'(1) * x^(r-1) = r * x^(r-1).

The demonstration that g(x) = x^(1/r) is the inverse function to f(x) = x^r is an even more immediate application of the laws of exponents, namely part (i) of the preceding Theorem:

  g(f(x)) = (x^r)^(1/r) = x^(r * 1/r) = x^1 = x

QED

We finish up by filling in the gap for negative exponents.

Corollary (Power Rule for All Real Exponents)  If we extend the definition f(x) = x^r to r < 0 by allowing the limit of a general sequence of rational numbers r_i --> r, then the power rule and other properties continue to hold; the only difference worth mentioning is that when r < 0, f(x) is monotone decreasing.

Proof: Since the limit of reciprocals of a sequence converging to a nonzero limit is the reciprocal of that limit, the result of defining:

  f(x) = limit_{i --> oo} x^r_i

for a sequence of negative rational numbers converging to r < 0 is the same as:

  limit_{i --> oo} x^r_i = limit_{i --> oo} x^-|r_i|
                         = 1 / limit_{i --> oo} x^|r_i|
                         = 1 / x^|r|

so as before we can take the derivative of f(x) by applying the simplified quotient rule:

  f'(x) = -|r| * x^(|r|-1) / x^(2|r|)
        = r * x^(|r| - 2|r| - 1)
        = r * x^(-|r| - 1)
        = r * x^(r-1)

QED
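
As a numerical sanity check (an illustration only, not part of the proof), here is a small Python sketch of the "pinching" idea for the irrational exponent r = sqrt(2). The function names rational_power, rational_bracket and difference_quotient are hypothetical helpers made up for this example, and the rational sequences are assumed to be the truncated decimal expansions of r from below and above. Floating-point roots are themselves approximations, so this can only suggest, not establish, the inequalities.

```python
import math
from fractions import Fraction


def rational_power(x, q):
    """Approximate x^(p/n) as (n-th root of x) raised to the integer power p.

    Taking the root first keeps the intermediate value small; the root
    itself is a floating-point approximation (an assumption of this sketch).
    """
    root = x ** (1.0 / q.denominator)
    return root ** q.numerator


def rational_bracket(r, k):
    """Return rationals (s_k, r_k) with s_k <= r < r_k and r_k - s_k = 10^-k."""
    scale = 10 ** k
    s = Fraction(math.floor(r * scale), scale)
    return s, s + Fraction(1, scale)


def difference_quotient(q, h):
    """Difference quotient of x^q at x = 1 for a rational exponent q."""
    return (rational_power(1.0 + h, q) - 1.0) / h


if __name__ == "__main__":
    x, r, h = 2.0, math.sqrt(2.0), 1e-4
    for k in range(1, 6):
        s_k, r_k = rational_bracket(r, k)
        # x^s_k and x^r_k pinch x^r; the two quotients pinch the quotient for r.
        print(k,
              rational_power(x, s_k), rational_power(x, r_k),
              difference_quotient(s_k, h), difference_quotient(r_k, h))
```

Each printed row should show the two powers of 2 closing in on 2^sqrt(2) and the two difference quotients closing in on a value near sqrt(2), which is what the squeeze argument predicts for f'(1) = r.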