Q: Chicken and Egg: Calculus and the Binomial Theorem ( Answered 5 out of 5 stars,   1 Comment )
Subject: Chicken and Egg: Calculus and the Binomial Theorem
Category: Science > Math
Asked by: upstartaudio-ga
List Price: $25.00
Posted: 16 Dec 2004 18:07 PST
Expires: 15 Jan 2005 18:07 PST
Question ID: 443702
My question regards the general form of the binomial theorem where the 
exponent n is not an integer, sometimes called the Binomial Function.  
I've seen numerous proofs of properties of infinite series and equations
for derivatives which require that the Binomial Function is already
proved.  But when I look into it, the proof of the general binomial
theorem requires calculus.  So the logic seems circular.  Does anyone
know of a proof of the binomial theorem that does not require calculus
or infinite series that in turn depend on the binomial theorem?

Alternatively, a development of calculus which begins with the
antiderivative, so that the Gamma function can be used to define the
intermediate (non-integer exponent) values of the binomial
coefficient, would work as well.

Obviously, estimating the binomial function with a Taylor series
immediately and directly depends on the polynomial rule, which in
every proof I've seen relies on the validity of the general binomial
theorem.
Any ideas?

Request for Question Clarification by mathtalk-ga on 17 Dec 2004 06:39 PST
Hi, upstartaudio-ga:

To expand (x + h)^n when n is not a positive integer will require
something more than a finite series, right?  It can be expressed as an
infinite series, which I take to be what you mean by the Binomial
Function or Binomial Theorem where "n is not an integer".

However Taylor's Theorem with Remainder does not explicitly involve an
infinite series, just a finite series (of indefinite length) plus a
remainder term.  It is basically a generalization of the Mean Value
Theorem to higher order derivatives.

So I think that a "proof" of Taylor series expansions (ie. their
convergence) based on Taylor's Theorem with Remainder is not a
circular argument in the sense you seem to suspect.  It is true that
"binomial" coefficients appear in Taylor's Theorem with Remainder, as
one might expect with a "finite" expansion that establishes the
convergence of an infinite series.
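
[Editorial note: a small numeric illustration, not part of the original exchange. It shows the convergence the question is about: partial sums of the generalized binomial series for (1 + x)^r, with r = 1/2 and |x| < 1, approach the true value. The helper name `binom_partial_sum` is my own.]

```python
def binom_partial_sum(r, x, terms):
    """Sum of C(r, k) * x^k for k = 0 .. terms-1, with generalized C(r, k)."""
    total, coeff = 0.0, 1.0
    for k in range(terms):
        total += coeff * x ** k
        coeff *= (r - k) / (k + 1)   # C(r, k+1) = C(r, k) * (r - k)/(k + 1)
    return total

true_value = (1 + 0.5) ** 0.5        # sqrt(1.5)
errors = [abs(binom_partial_sum(0.5, 0.5, n) - true_value) for n in (2, 5, 20)]
assert errors[0] > errors[1] > errors[2]   # errors shrink as terms are added
assert errors[-1] < 1e-6
```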

Does this help?

regards, mathtalk-ga

Clarification of Question by upstartaudio-ga on 17 Dec 2004 09:50 PST
Quite right, I should have been more specific.  I supposed that the
infinite series was developed by extending the finite Taylor series
until the final term was smaller than some fixed error.  But so much
for my own lack of rigor, I am not a Mathematician so you can safely
ignore everything I say...

I don't think it matters whether you prove the chicken (general
binomial function without using calculus) or the egg (derivative rule
for non-integer exponents without using the binomial function).  I
read somewhere that Euler had this difficulty and came up with his own
inventive infinite series-based proof, but I'll be darned if I can
find it.

Because I am self-taught, I suppose this is just a question of my
having studied things in the wrong order; I imagine that
Mathematicians are taught the secret to resolving this (apparently)
circular dependence between the derivative rule for non-integer
exponents and the general binomial function (very many infinite series
proofs also use the binomial function as well).  So while it would be
nice to understand the proof itself, I suppose my question is more of
the nature: does such an independent proof exist?

I'm currently reading a very thin, (and to me very difficult) book on
infinite sequences and series by Konrad Knopp.  Many examples are
given without proof which can be demonstrated using a binomial
expansion and vanishing remainder term.  After the third such example,
I remembered being flustered when I studied  calculus many years ago,
because at that time when I looked up a proof for the general binomial
theorem, it depended on the derivative rule for non-integer exponents.
 I thought I'd ask the question here to see if anyone knows how
Mathematicians get around this, or whether it remains unsolved.

There is a beautiful inductive proof for the binomial expansion, but
it begins with the case where n=1, so it is only valid for integer
exponents.

If there isn't actually a problem with using the two to prove each
other, a clarification of my misapprehension would be both sufficient
and appreciated.  A proof of the binomial expansion for non-integer
exponents which does not depend on the polynomial rule of calculus
(particularly one I can understand) would be the best possible answer
to this question.  A proof of the polynomial rule for noninteger
exponents that doesn't make use of the binomial expansion would also
suffice.  Finally if I'm asking for the proverbial horse of a
different color, then an explanation of why mathematicians don't
worry about this would suffice.

Request for Question Clarification by mathtalk-ga on 17 Dec 2004 11:10 PST
Thanks for the clarification.  It seems that you are interested in a
rigorous development of both the "binomial theorem" for non-integer
exponents and a rule for differentiation:

  d(x^r)
  ------ = r * x^(r-1) when x > 0   [Power Rule]
    dx

also for non-integer exponents, with special attention to showing that
the reasoning is not circular.

It is quite correct that many infinite series expansions, and in
particular Taylor series expansions, depend in full generality upon a
prior development of calculus topics like derivatives.

Thus the Power Rule is the more elementary of these topics, but even
so one needs to define what (if anything) is meant by a function f(x)
= x^r when r is not an integer.

If I've understood the Question correctly, I believe it can be
answered by first providing a definition of the real-valued power
function x^r and then proving the Power Rule outlined above without
invoking any infinite binomial expansions.

regards, mathtalk-ga

Clarification of Question by upstartaudio-ga on 18 Dec 2004 01:59 PST
Yes, you've understood my question exactly.

Request for Question Clarification by mathtalk-ga on 19 Dec 2004 19:24 PST
Hi, upstartaudio-ga:

Please see the extended Comment I've posted below, which outlines the
steps needed to develop the Power Rule from first principles and lays
the foundation for what will most interest you, the noninteger
exponent cases, by proving the integer exponent cases.  Please review
it and let me know if the level of exposition needs to be adjusted.

We must next tackle the case of rational exponents, x^(p/q) for
suitable integers p and q.  This we will define as (x^p)^(1/q), ie.

  x^(p/q) = f^-1( x^p )

where f(x) = x^q.  Already in the Comment below we have established
the existence, monotonicity, and continuity of this (f^-1) function,
at least for q > 0.  It remains only to show differentiability, and
then we shall have the Power Rule for rational exponents.

Last we will tackle the case of real exponents.  To do so we invoke
the density theorem, that every real number is the limit of a sequence
of rational numbers.  For example, the decimal expansion of a real
number gives one such sequence of rational numbers that converge to
that real number in the limit.

A further extension is possible, to complex exponents.  However the
work needed to make that extension is slight in comparison to the
details of the rational and real extensions.

regards, mathtalk-ga

Clarification of Question by upstartaudio-ga on 19 Dec 2004 23:31 PST
The comment below is quite clear.  I think I see where this is leading now.

From the integer case, I know how to (non-rigorously) find the
derivative of the inverse:

let y = x^(1/n) so that x = y^n.  Then take the derivative with respect to y:

d/dy y^n = d/dy x

n*y^(n-1) = dx/dy

Find the derivative of the inverse:

1/(n*y^(n-1)) = dy/dx

1/n * y^(1-n) = d/dx y

then by substituting y = x^(1/n), I get 

d/dx x^(1/n) = 1/n * x^(1/n - 1).
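
[Editorial note: a finite-difference sanity check of the derivation above, added by the editor. The root is computed via exp/log rather than a fractional power, to stay in the spirit of avoiding the rule being tested; the helper names are mine.]

```python
import math

def nth_root(x, n):
    """x^(1/n) for x > 0, computed via exp/log rather than a fractional power."""
    return math.exp(math.log(x) / n)

def central_diff(f, x, h=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for n in (2, 3, 5):
    for x in (0.5, 1.0, 4.0):
        numeric = central_diff(lambda t: nth_root(t, n), x)
        formula = (1 / n) * x ** (1 / n - 1)
        assert abs(numeric - formula) < 1e-5, (n, x)
```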

I had a feeling like my old calculus teacher was going to bean me with
the chalk as I took the reciprocal of dx/dy...  Are there some things
I needed to establish first beyond requiring n <> 0?  I think you've
already established that the inverse function exists, and that it is
continuous provided n is nonzero.  Do we need to do something else to know
that it is differentiable?

Request for Question Clarification by mathtalk-ga on 20 Dec 2004 09:13 PST
Hi, upstartaudio-ga:

Your calculation is spot on.  The "intuitive" approach:

  dx/dy = 1 / (dy/dx)

suggested by Leibniz's notation for derivatives is valid, assuming
that y(x) has an inverse function and that the (pointwise) derivative
dy/dx has a reciprocal.

For our purposes we will be fine with this, because we exclude x = 0. 
But the usual example of what can "go wrong" is:

  y = x^3

This function is continuous, strictly monotone increasing, and
differentiable on the whole real line.  The inverse function is also
continuous and strictly monotone increasing on the whole real line,
but fails to have a derivative at x = 0.  Note that 1/(dy/dx) cannot
be defined there.
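
[Editorial note: a quick numeric illustration of this failure mode, added by the editor. The difference quotient of the cube root g(y) = y^(1/3) grows without bound as y -> 0, matching the breakdown of 1/(dy/dx) = 1/(3x^2) at x = 0.]

```python
def cbrt(y):
    """Real cube root, defined for all real y (negative branch handled)."""
    return y ** (1 / 3) if y >= 0 else -((-y) ** (1 / 3))

def diff_quotient(f, y, h):
    """Symmetric difference quotient of f at y with step h."""
    return (f(y + h) - f(y - h)) / (2 * h)

# At y = 0 the quotient equals h^(-2/3): it blows up as h shrinks,
# so no derivative exists there.
slopes = [diff_quotient(cbrt, 0.0, h) for h in (1e-2, 1e-4, 1e-6)]
assert slopes[0] < slopes[1] < slopes[2]
```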

I'll describe the "rigorous" basis for this in my Answer, but if you
want to read ahead, Google around for Inverse Function Theorem.

regards, mathtalk-ga

Clarification of Question by upstartaudio-ga on 20 Dec 2004 11:42 PST
This looks like the answer then.  When I learned the power rule, the
"proof" given used the binomial expansion to prove cases where n >= 2.
 While I feel that has instructional value owing to its simplicity, it
doesn't provide much of a foundation for future work.

In the ongoing debate of whether to teach calculus by infinitesimals
(the 'old' way) or by limits (the 'new' way), I suppose that both have
advantages.  The former because it is very easy to explain the basic
concepts.  The latter because it provides a better foundation for
further study, and because 99% of calculus courses are taught that
way.

I was once invited to give a semester lecture series at my old high
school (which doesn't have an AP calculus program) to a handful of
bright students who wanted some preparation before entering college. 
Since I use basic differential calculus in my work in electronics, I
accepted the challenge and used Thompson & Gardner's "Calculus Made
Easy" as the text.

We met only once a week, but by the end of the semester we all knew
how to find derivatives of most equations we were likely to run into,
so as a practical course I suppose it was a success.  But as a math
course, I feel like I failed the students because they might find
later work confusing; we spent almost no time at all on limits and
differentiability (due to my own weakness in these areas).

Perhaps they are better off having been exposed to the concepts, but
perhaps they are worse off for having been deprived of a solid
foundation for future work.  The debate goes on.  But for any of them
that one day questions whether what they learned is true, it is
comforting to know that they will likely find the answers they seek.

Should the opportunity arise in the future, I think it would be
better, as well as more intuitive, to develop the power rule from the
product, quotient, and inverse function rules.  Or possibly, prove the
cases x^0, x^1, and x^2, introduce the power rule but delay the proof
until we've covered the inverse function rule, then return and retrace
the steps of your proof given below.

Your final answer will become part of my instruction file, and I owe
you a debt of gratitude for your precise effort in answering my
question.

Subject: Re: Chicken and Egg: Calculus and the Binomial Theorem
Answered By: mathtalk-ga on 15 Jan 2005 18:05 PST
Rated:5 out of 5 stars
Hi, upstartaudio-ga:

Calculus is a difficult class to master, from either the teacher's or
the student's perspective.  Its utility pushes it of necessity to the
beginning of an undergraduate curriculum, but thorough understanding
of rigorous foundations is ordinarily deferred to later.

The generalized binomial theorem may be deduced as a special case of
Taylor series, for powers x^r where r is other than a nonnegative
integer.

So a key development is the definition of such a function x^r for
general exponents, satisfactory enough to use to deduce its
derivatives.  Because the power rule:

  d(x^r)/dx = r * x^(r-1)

states the first derivative of x^r in simple terms of a constant times
a similar function, establishing the form of the first derivative
yields by induction the forms of all higher derivatives.  The big
distinction between nonnegative integer r and other exponents is that
in that former case, the higher derivatives must eventually descend to
zero, so that a Taylor series expansion (regardless of the choice of
"center") will have finitely many terms if and only if the exponent r
is a nonnegative integer.

Here's an outline of the steps we will follow:

1.  Recap of properties of function x^r and its inverse for integer r

2.  Differentiability of the r'th root function (inverse of x^r)

3.  Well-definedness of x^r when r = p/q is a ratio of nonzero integers

4.  Limits of real exponents as Cauchy sequences of rational exponents


Clarification of Answer by mathtalk-ga on 15 Jan 2005 22:37 PST
1.  Previously the function f(x) = x^r and its inverse g(x) were shown
for integer exponents r to have the following properties on the domain
of positive real x:

(i)   The function f(x) = x^r is positive, and:

        strictly monotone increasing if r > 0,
        constantly 1 if r = 0, and
        strictly monotone decreasing if r < 0.

(ii)  The function f(x) = x^r is continuous, and provided
      r nonzero, g(x) is continuous (on the same domain).

(iii) The function f(x) = x^r is differentiable, and:

        f'(x) = r * x^(r-1).

This third property is precisely the power rule we seek to establish
for all r, but so far we've only proved it for integers.  Recall that
we took an approach of proving the rule for positive r, then deducing
the rule for negative r.  The same technique can be applied when r is
not an integer, leaving us free to concentrate on results for positive
r (where the functions x^r will again be monotone increasing).

The immediate open question is the differentiability of g(x), to which
we next turn.

2.  A simple version of the Inverse Function Theorem suffices for our needs.

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
Thm. (Inverse Function Theorem) Suppose f:[a,b]->[c,d] is a real
function with a positive derivative f' at each point of [a,b].  Then
its inverse g:[c,d]->[a,b] exists and has a positive derivative g' as
well (at each point of [c,d]), and the following is true:

  g'(f(x)) =  1/f'(x)

for all x in [a,b].  Correspondingly:

  g'(y) = 1/f'(g(y))

for all y in [c,d].
*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Before we launch into the proof, let's see the implication of this for
the function f(x) = x^r.  For integer r > 0, f:[a,b]->[a^r,b^r] has a
positive derivative at any point.  Therefore its inverse does, and:

  g'(f(x)) = 1/f'(x)

           = 1/(r * x^(r-1))

and if we take x to be g(y), the r'th root of y, then:

  g'(y) = 1/(r * (g(y))^(r-1))

If we were to allow ourselves to write g(y) = y^(1/r), that notation
plus the laws of exponents would lead to (1/r)*( y^((1/r)-1) ). 
However this formal acknowledgement of the power rule for y^(1/r) is a
bit premature.  For the moment we are content to claim only that our g
is differentiable and has a positive derivative.
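
[Editorial note: a numeric check of the claim g'(y) = 1/f'(g(y)), added by the editor. The function f(x) = x^3 + x is my own arbitrary choice of a strictly increasing function with positive derivative everywhere; its inverse is computed by bisection, which is valid precisely because f is increasing.]

```python
def f(x):
    return x ** 3 + x

def fprime(x):
    return 3 * x ** 2 + 1

def g(y, lo=-10.0, hi=10.0):
    """Invert f on [lo, hi] by bisection; valid since f is strictly increasing."""
    for _ in range(80):
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y = 10.0          # note g(10) = 2 exactly, since f(2) = 8 + 2 = 10
h = 1e-5
numeric = (g(y + h) - g(y - h)) / (2 * h)   # difference quotient of the inverse
assert abs(numeric - 1 / fprime(g(y))) < 1e-4
```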

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *
Proof of Theorem:

Since g(f(x)) = x, if we knew that g was differentiable, the desired
consequence could be obtained by applying the Chain Rule.  But since we
want to prove g is differentiable, we should drill down into the definition
of g' as a limit:

                  g(y+h) - g(y)
  g'(y) = limit  ---------------
          h -> 0        h

We can analyze this limit by introducing a sequence {h_i} of real
numbers that tend to 0, and produce a corresponding sequence {x_i} by

  x_i = g(y + h_i)

for any fixed y in the domain.  The endpoints y = c,d require slightly
special handling, as the derivatives, etc. at these points involve
one-sided limits.  But these are adequately handled by taking the
sequence {h_i} to approach zero from above for y = c, and from below
for y = d, so that {x_i} approaches a from above (resp. b from below).
 Apart from this detail the arguments for the one-sided derivatives at
the endpoints are the same as the two-sided limit argument we are about to
detail for the interior point cases.

Since y + h_i = f(x_i) by definition of x_i and f(g(y)) = y, we can
conclude that the limit can be rewritten:

          g(y+h_i) - g(y)                x_i  -  x
  limit  ---------------  =  limit   ----------------
 h_i -> 0      h_i          x_i -> x   f(x_i) - f(x)

which we recognize to be the reciprocal of the limit defining f'(x):

                  f(x_i) - f(x)
  f'(x) = limit   ---------------
         x_i -> x   x_i  -  x

Thus, provided y = f(x), we know the latter positive limit guarantees
the limit exists for its positive reciprocal, and the two limits are
related by:
  g'(y) = 1 / f'(x)  =  1/f'(g(y)).

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

This proves the Theorem and in turn suffices to establish the
differentiability of g, the inverse of f(x) = x^r, at least for
positive integers r (so that both f and g are increasing functions).

It is therefore certain, by the Chain Rule for instance, that if we
were to define x^r for some ratio r = p/q of two positive integers,
that the composition:

  f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of f_q(x) = x^q, the
composition would be differentiable.  In fact, bearing in mind that we
assume p,q > 0, the composition would be a differentiable,
monotone-increasing continuous function that is continuously (even
differentiably) invertible by the monotone increasing:

  f_q( g_p(x) )

Before we allow ourselves the luxury of restating these as a form of
the Power Rule, it is important to justify the notations:

  f_p( g_q(x) )  =  x^(p/q)

  f_q( g_p(x) )  =  x^(q/p)

and the "rational exponent" extensions of the laws of exponents in particular.

This then is the purpose of our next discussion.

(to be continued -- mathtalk-ga)

Request for Answer Clarification by upstartaudio-ga on 16 Jan 2005 16:08 PST
A final comment might be in order.  We've shown the validity of the
power rule without invoking the binomial theorem, and demonstrated
that it holds for the rational and real cases as long as x > 0.

One of my textbooks ignores, while the other covers, the fact that x^r
is not defined when x < 0, therefore the function isn't differentiable
in that case.

To prove it is undefined, we need only notice that if r = a/b, then it
is also equal to (2*a) / (2*b) or (3*a) / (3*b).  In particular, every
rational r has a representation with an even denominator, making x^r
an even root raised to a power.  So if x is negative, that
representation isn't defined in the set of real numbers.

The problem doesn't seem to go away with x complex, either, since
there are infinitely many candidate roots when r is irrational.
Even if we choose one of these roots by definition, we still
have the ugly situation that the (qth root of x) quantity raised to
the power p is not the same number as the qth root of x^p.

So I don't see that we can get around restricting the domain of x to
positive numbers.  But that is the subject of another question...

Clarification of Answer by mathtalk-ga on 17 Jan 2005 21:27 PST
3.  We have shown that given rational r = p/q where p,q are
positive integers (so far), the function:

  f_p( g_q(x) )

where f_p(x) = x^p and g_q(x) is the inverse of x^q, has many
properties (monotonicity, continuity, differentiability) in
common with the powers x^r for integer r.

But because there are multiple ways to express any rational r
as a ratio of two integers, we need to show that all possible
choices lead to the same function x^r.

The mathematical shorthand says we need to show that:

  x^r  =  f_p( g_q(x) )

is well-defined, ie. that the formula's apparent dependence
on the choice of p,q is only superficial.

The key to this is purely algebraic, namely showing that the
various functions f_p and g_q all commute, so that the order
in which they are composed is immaterial to the final result.

To begin with, the power functions f_p and f_q commute for
any two positive integer powers by the associative law of
multiplication and mathematical induction:

f_p( f_q(x) ) = (x^q)^p = x^(pq) = (x^p)^q = f_q( f_p(x) )

Stated another way, this gives an unfamiliar cast to a law of
exponents:

f_p o f_q = f_pq = f_qp = f_q o f_p

From this it can be deduced that the corresponding function
inverses also commute, because where inverses exist for two
functions, the inverse of their composition is the result of
composing their inverses in the opposite order.  For example:

g_q( g_p( f_p( f_q(x) ) ) ) = x

where the adjacent inverses "cancel" one another, which says
that g_q( g_p(x) ) is the inverse of f_p( f_q(x) ).

It follows then that g_q and g_p must commute, because f_p
and f_q commute:

g_q o g_p = (f_p o f_q)^-1 = (f_q o f_p)^-1 = g_p o g_q

We further verify that f_p and g_q commute as well, so that
it really doesn't matter whether we define x^r by composing
them in one order or the other.  Again we use the fact that
f_p and f_q commute (with respect to function composition):

f_q( f_p( g_q(x) ) ) = f_p( f_q( g_q(x) ) ) = f_p(x)

Therefore after applying g_q to both sides:

f_p( g_q(x) ) = g_q( f_p(x) )

which demonstrates f_p commutes with g_q.
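
[Editorial note: a numeric spot-check of these commutation relations, added by the editor. Powers are built by repeated multiplication and roots via exp/log; both helper names are mine.]

```python
import math

def f(p, x):
    """f_p(x) = x^p for positive integer p, by repeated multiplication."""
    result = 1.0
    for _ in range(p):
        result *= x
    return result

def g(q, x):
    """g_q(x) = inverse of f_q, ie. the q-th root, for x > 0."""
    return math.exp(math.log(x) / q)

# f_p and g_q commute: a p-th power of a q-th root equals
# a q-th root of a p-th power.
for x in (0.5, 2.0, 7.3):
    for p in (2, 3):
        for q in (2, 5):
            assert abs(f(p, g(q, x)) - g(q, f(p, x))) < 1e-9
```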

One point that we've been a bit cavalier about is that g_q
is both a left and a right inverse for f_q.  That is:

f_q( g_q(x) ) = x = g_q( f_q(x) )

Whenever a function maps its domain 1-1 and onto itself, an
inverse is two-sided.  This symmetric outcome may easily be
deduced from the symmetry of the relations:

  y = f_q(x)  <==>  x = g_q(y)

where for any x there exists y to satisfy the condition, and
conversely for any y there exists such an x.

In any case these commutativity properties establish that x^r
is well-defined, because if we take r = (cp)/(cq), for c any
positive integer:

f_cp( g_cq(x) ) = f_p( f_c( g_c( g_q(x) ) ) ) = f_p( g_q(x) )

This assures us that in the particular case p is divisible by
q, so that r is an integer, our "new" definition of x^r fully
agrees with the old definition f_r based solely on arithmetic.

The more general point of these observations is that we aren't
misled by using the exponential notation x^r with rational r
and rational s, because the familiar laws of exponents hold:

  (x^r)^s = x^(rs)
  (x^r)*(x^s) = x^(r+s)
  (x^r)*(y^r) = (xy)^r

for any x,y > 0 and rational r,s.  The proofs of all these are
purely algebraic, and for that reason I will not go into more
detail here.

However we will finish this section with a bit of calculus, a
derivation of the power rule for positive rational exponents,
then using the quotient rule to extend it to negative rational
exponents.

Recall we have really only dealt with r > 0 in defining:

  x^r = f_p( g_q(x) )

when r = p/q and p,q > 0 are integers.  The Chain Rule and the
Inverse Function Theorem then give us:

  d(x^r)/dx = f'_p( g_q(x) ) * g'_q(x)

            = p * (g_q(x))^(p-1) * [1/f'_q( g_q(x) )]

            = p * x^((p-1)/q) * (1/q) * (1/x^((q-1)/q))

            = (p/q) * x^((p-q)/q)

            = (p/q) * x^((p/q)-1)

            = r * x^(r-1)

That is, we've shown the Power Rule for rational r > 0.
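
[Editorial note: a finite-difference check of this rational-exponent power rule, added by the editor. Mirroring the derivation above, x^(p/q) is built as a q-th root (via exp/log) of an integer power; the helper name `pow_pq` is mine.]

```python
import math

def pow_pq(x, p, q):
    """x^(p/q) for x > 0, as the q-th root of the integer power x^p."""
    return math.exp(math.log(x ** p) / q)

for p, q in ((3, 2), (5, 3), (1, 4)):
    r = p / q
    for x in (0.7, 1.0, 3.5):
        h = 1e-6
        # central difference quotient vs. the power rule r * x^(r-1)
        numeric = (pow_pq(x + h, p, q) - pow_pq(x - h, p, q)) / (2 * h)
        assert abs(numeric - r * x ** (r - 1)) < 1e-4, (p, q, x)
```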

Now we've already used in the computation above that dividing
by x^r is equivalent to multiplying by x^-r, so it may be 
worth pointing out that the commitment to treat negative
exponents as reciprocals is implied by the second law of
exponents cited above, with s = -r:

  (x^r)*(x^-r) = x^0 = 1

Therefore on the calculus side of things we need only apply
the quotient rule to determine the derivative of x^-r:

  d(x^-r)/dx  =  d(1/(x^r))/dx

                   d(x^r)/dx
              = - -----------
                    (x^r)^2

              = -r * x^(r-1) * x^(-2r)

              = -r * x^(-r-1)

With this we've also shown the Power Rule for rational r < 0.

This is exactly the same calculation as we gave before for the
integer exponents, but as acknowledged above, some algebraic
preliminaries were necessary to assure it is sensible for
the rational exponents.

At last we come to our final step, extending our exponents to
the general real case by taking limits of Cauchy sequences of
rational exponents.  It should not be surprising that we can
show, once power functions x^a are "pinched" between monotone
power functions x^r both above and below, that x^a must also
be monotone, etc.

(to be continued)

Clarification of Answer by mathtalk-ga on 25 Jan 2005 20:21 PST
4.  A "construction" of the real numbers in mathematics is
often based on Cauchy sequences of rational numbers.  For
every real number there is a sequence of rational numbers 
converging to it.  For example, even though SQRT(2) or pi
is irrational, their decimal expansions give us (by way of
truncation) convergent sequences of rational numbers.

If the last section was top-heavy with algebra, this one is
top-heavy with analysis, ie. with fussing over limits and
how to estimate sizes of things.

So far we've defined power functions x^r for all rational
exponents r and determined that their derivatives obey the
Power Rule:

  d(x^r)/dx = r * x^(r-1)

To extend our definition to real, irrational exponents a, we
need to take the limit of x^r as r approaches a.  In doing so
we will make free use of the exponent notation and the usual
"laws of exponents" for rational r, whose justification was
sketched in the previous section.

We state without proof the following:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Thm. (Completeness Property of the Real Numbers)

Let {r_i} be a Cauchy sequence of real numbers.  That is,
for every epsilon > 0, there exists integer M > 0 such that
for all i,j > M,  |r_i - r_j| < epsilon.

Then the sequence {r_i} converges to a real number.

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

The essential reason the real numbers have this property is
because we "bake in" that property with their construction.
The real numbers are the "completion" of the rationals with
respect to the usual notion of distance between two numbers,
the absolute value of the difference.  So at any rate every
Cauchy sequence of rationals has a unique limit in the real
numbers, and the extension of this fact to Cauchy sequences
of real numbers is the "completeness property".

For our purposes we need to show that if {r_i} is a Cauchy
sequence of rational numbers, then for any fixed real x > 0,
{x^r_i} is a Cauchy sequence of real numbers.

It suffices to have an estimate of |x^r_i - x^r_j| in terms
of |r_i - r_j|.  Intuitively, making the exponents close to
one another puts the corresponding powers of x close to one another:

x^r_i - x^r_j = [x^(r_i - r_j) - 1] * x^r_j

Prop. 1  Let r > s be rational numbers and x > 0 be real.  Then:
   i) if x > 1, then x^r > x^s
  ii) if x = 1, then x^r = x^s
 iii) if x < 1, then x^r < x^s

Proof:  This harkens back to something we showed earlier in
the Comment.  Certainly for any positive integer n, x > 1 if
and only if x^n > 1.  Restated in terms of roots, x > 1 if and only
if x^(1/n) > 1.

First we show that if x > 1, then x^(r-s) > 1.  Since r > s,
we can express r - s with a common denominator as p/q where
p,q are each positive integers.  Then as just recalled:

  x > 1 ==> x^p > 1

        ==> (x^p)^(1/q) > 1

        ==> x^(r-s) = x^(p/q) > 1

which suffices upon multiplying both sides by x^s to show:

  x > 1 ==> x^r > x^s

This proves part (i) of the Proposition.  Part (ii) is trivial.
Part (iii) follows from part (i) by applying it to 1/x, since
taking reciprocals of positive numbers reverses the direction
of an inequality.
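
[Editorial note: a numeric spot-check of Prop. 1, added by the editor. For fixed x, x^r is monotone in r, with direction depending on whether x is above or below 1; the helper `powr` (via exp/log) is mine.]

```python
import math

def powr(x, r):
    """x^r for x > 0, computed via exp/log."""
    return math.exp(r * math.log(x))

rs = [0.25, 0.5, 1.0, 1.5, 2.0]
up = [powr(2.0, r) for r in rs]     # x > 1: increasing in r  (part i)
dn = [powr(0.5, r) for r in rs]     # x < 1: decreasing in r  (part iii)

assert all(a < b for a, b in zip(up, up[1:]))
assert all(a > b for a, b in zip(dn, dn[1:]))
assert all(abs(powr(1.0, r) - 1.0) < 1e-12 for r in rs)   # part ii
```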


The result above establishes that for fixed x > 0, the values
x^r vary monotonically with r, a nice counterpart to our earlier
treatment of monotonicity in x for fixed r.  One other result is

Prop. 2  Let x > 1 be a real number.  Then

  i)  {x^n: n = 1,2,3,...} increases without limit.
 ii)  {x^(1/n): n = 1,2,3,...} converges to 1.

Proof:

(i) Clearly x > 1 implies:

  x < x^2 < x^3 < ...

so the sequence in part (i) is strictly increasing.  Therefore
it either increases without limit (tends to +oo), or it must have
as a limit a least upper bound (a fact which can be rigorously
deduced from the Completeness Property of Real Numbers), say u.
Since x > 1, u/x < u and therefore some integer n is such that:

  x^n > u/x

But then x^(n+1) > u, contradicting that u was an upper bound.
Thus the sequence increases without limit (tends to +oo).

(ii) It is a little less obvious, but true, that the sequence:

  x > x^(1/2) > x^(1/3) > ...

is monotone decreasing.  Let m < n be two positive integers,
so that assuming x > 1 still, x^m < x^n.  Now the mn'th root
function is monotone increasing so applying it to both sides
of that inequality gives:

  x^(1/n) = (x^m)^(1/mn) < (x^n)^(1/mn) = x^(1/m)

so x^(1/m) > x^(1/n) when m < n, as desired.

Furthermore since x > 1, we know x^(1/n) > 1^(1/n) = 1, and
thus 1 is a lower bound on the root sequence.  It remains to
show that 1 is a greatest lower bound and therefore the limit
of the monotone decreasing sequence of roots.

Suppose instead that b > 1 is also a lower bound of x^(1/n)
for all positive integers n:

  b < x^(1/n)

Now b^n < x for all integers n.  In other words b > 1, but
the sequence {b^n} has the finite upper bound x, which contradicts
part (i) of this proposition.  So no such lower bound b > 1
exists, and the greatest lower bound of {x^(1/n)} is 1.  As the
sequence is monotonic, once the sequence is within epsilon > 0
of 1, it remains "within epsilon" of 1, so 1 is the limit to
which the sequence converges.
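
[Editorial note: a quick numeric illustration of Prop. 2, added by the editor. For x > 1 the powers x^n increase without bound while the roots x^(1/n) decrease monotonically toward 1.]

```python
import math

x = 3.0
powers = [x ** n for n in range(1, 8)]                    # part (i)
roots = [math.exp(math.log(x) / n) for n in range(1, 8)]  # part (ii)

assert all(a < b for a, b in zip(powers, powers[1:]))   # strictly increasing
assert all(a > b for a, b in zip(roots, roots[1:]))     # strictly decreasing
assert all(r > 1 for r in roots)                        # 1 is a lower bound
assert abs(math.exp(math.log(x) / 1000) - 1) < 0.01     # n = 1000 is near 1
```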


Before we reach for the climactic proof of the power rule for
real exponents, let's first warm up by arguing that the laws
of exponents continue to apply, and for that matter that a
real power of x is well-defined by taking a limit on rational
exponents.

Thm. (Laws of Exponents, Real Powers)

Let r > 0 be a real number, which is the limit of a sequence
of positive rational numbers {r_i}.  Then for any x > 0:

  f(x) = limit   x^r_i
        i --> oo

exists and is the same for any positive rational sequence
{r_i} chosen.  Moreover the laws of exponents hold for real
powers r,s and positive real bases x,y:

  i)  (x^r)^s = x^(rs)

 ii)  (x^r)*(x^s) = x^(r+s)

iii)  (x^r)*(y^r) = (xy)^r

Proof:  It suffices to show the limit f(x) exists, to show
that f(x) = x^r is well-defined, independent of the choice
of rational sequence converging to r.  For if two positive
rational sequences both converge to r, we can combine them,
interlacing them as odd and even entries into one sequence
whose limit must then be common to both subsequences.  In
particular if r is actually a rational number, our "new"
definition must secretly agree with the old one by virtue
of considering the constant sequence r_i = r.

We claim that {x^r_i} is a Cauchy sequence of real numbers,
which is sufficient by the Completeness Property to show
convergence.  The logic is:

(1) Since {r_i} converges to r, {r_i} is a Cauchy sequence.
That is, given any epsilon > 0, there exists M such that
for all i,j > M, |r_i - r_j| is always less than epsilon.

(2) In Prop. 2 (ii) above we showed that for any x > 0, the
sequence {x^(1/n)} converges to 1.  So for fixed x we can
specify N such that by the monotonicity shown in Prop. 1:

  rational s in (0,1/N)  ==>  |x^s - 1| < epsilon

for any desired epsilon > 0.

(3) Putting both facts together, for fixed x > 0, there
exists for any epsilon > 0 an integer M such that for all
i,j > M we have |r_i - r_j| less than some 1/N which ensures:

  |x^r_i - x^r_j| < |x^|r_i - r_j| - 1| * min(x^r_i,x^r_j)

                  <    epsilon  *  C

where C is an upper bound on {x^r_i}, say x^R where R is
an upper bound on {r_i} if x > 1, or simply 1 if x <= 1.

Since epsilon can be as small as we please, this shows
the sequence {x^r_i} is Cauchy, and thus convergent.
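As a numerical aside (no part of the proof), the Cauchy behavior of
{x^r_i} is easy to watch in practice.  The snippet below uses my own
illustrative choices of x = 3 and r = sqrt(2), approximated by
truncated decimal rationals; the successive gaps shrink rapidly.

```python
import math
from fractions import Fraction

x = 3.0
r = math.sqrt(2)  # irrational target exponent

# Truncated-decimal rational approximations r_i --> r.
r_i = [Fraction(int(r * 10**k), 10**k) for k in range(1, 8)]
vals = [x ** float(q) for q in r_i]

# Successive gaps shrink: {x^r_i} behaves as a Cauchy sequence.
gaps = [abs(b - a) for a, b in zip(vals, vals[1:])]
```

Each extra decimal digit in r_i shrinks the gap by roughly a factor
of ten, which is exactly the epsilon/N bookkeeping in the proof.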

Once we have the definition of x^r as a limit from the
rational exponent cases, the laws of exponents (i)-(iii)
follow easily.  Let us show the third of these in some
detail:

   (x^r)*(y^r)  = ( limit  x^r_i ) * ( limit  y^r_i )
                   i --> oo           i --> oo

                =   limit  (x^r_i)(y^r_i)
                   i --> oo

                =   limit  (xy)^r_i
                   i --> oo

                = (xy)^r

where we've used only that a product of two limits which
exists is the limit of corresponding products, together
with the previously established law of exponents for the
rational case.  Parts (i) and (ii) are similar.
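The three laws are also easy to spot-check numerically.  The snippet
below uses arbitrary irrational exponents of my own choosing, sqrt(2)
and pi; it illustrates the statements rather than proving them.

```python
import math

x, y = 2.0, 5.0
r, s = math.sqrt(2), math.pi  # arbitrary irrational exponents

# Laws (i)-(iii) from the Theorem, checked to floating precision:
law_i   = math.isclose((x ** r) ** s, x ** (r * s))
law_ii  = math.isclose((x ** r) * (x ** s), x ** (r + s))
law_iii = math.isclose((x ** r) * (y ** r), (x * y) ** r)
```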


Thm. (Power Rule for Positive Real Exponents)

Let r > 0 be a real number.  Then f(x) = x^r is a continuous,
monotone increasing function from positive real numbers to
positive real numbers with inverse g(x) = x^(1/r).  Also f is
differentiable, and:

  f'(x) = r * x^(r-1)

Proof:  Having developed all the "machinery" above, it is now
straightforward to prove the power rule continues to hold for
positive real exponents.  Of course if r is rational, we are
already done.  So let's assume r is irrational.

One way to show f(x) = x^r is continuous and increasing is to
jump right in and show that it is differentiable with positive
derivative.  For example the laws of exponents allow us to
reduce the question of the derivative f'(x) for general x
to that of the derivative at x = 1:

                   (x + h)^r - x^r
  f'(x) =  limit  -----------------
          h --> 0         h

                   (1 + h/x)^r - 1
        =  limit  ----------------- * x^(r-1)
          h --> 0         h/x

        =  f'(1) * x^(r-1)

This simplification isn't essential, as the way we are about to
show f'(1) = r would really work for any argument x, but it will
make the notation and (hopefully) the presentation clearer.

Recalling the monotonicity properties of Prop. 1, it should be
evident that for rational sequences {r_i} converging to r from
above and {s_i} converging to r from below, we have:

  for all i,  x > 1 implies x^s_i < x^r < x^r_i

              x = 1 implies x^s_i = x^r = x^r_i

              x < 1 implies x^s_i > x^r > x^r_i

In other words the graph of f(x) = x^r is "pinched" between the
family of curves x^s_i and x^r_i.  Since these curves are strictly
monotone increasing, the curve x^r must also be increasing at 1.

In particular since for h > 0:

      (1 + h)^s_i < (1 + h)^r < (1 + h)^r_i

      (1 - h)^s_i > (1 - h)^r > (1 - h)^r_i

we can "squeeze" the limits of the difference quotients:

                     (1+h)^r - 1
  f'(1)  =   limit  -------------  =  r
            h --> 0        h

because both "side" limits as i --> oo agree:

                    (1+h)^r_i - 1
  limit   (  limit  ---------------  ) =  limit   r_i  =  r
 i --> oo   h --> 0         h            i --> oo

                    (1+h)^s_i - 1
  limit   (  limit  ---------------  ) =  limit   s_i  =  r
 i --> oo   h --> 0         h            i --> oo

Therefore in general f'(x) = f'(1) * x^(r-1) = r * x^(r-1).
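A quick numerical sanity check (again no part of the argument): for
an irrational exponent, a difference quotient agrees with the power
rule, and the reduction f'(x) = f'(1) * x^(r-1) checks out too.  The
choices x = 3, r = sqrt(2), and the step size h are mine.

```python
import math

r = math.sqrt(2)  # irrational exponent
x = 3.0
h = 1e-6

# Difference quotient for f(x) = x^r versus the claimed power rule.
numeric = ((x + h) ** r - x ** r) / h
exact = r * x ** (r - 1)

# The reduction to the derivative at 1:  f'(x) = f'(1) * x^(r-1).
f1 = ((1 + h) ** r - 1) / h      # approximates f'(1) = r
via_f1 = f1 * x ** (r - 1)
```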

The demonstration that g(x) = x^(1/r) is the inverse function to
f(x) = x^r is an even more immediate application of the laws of
exponents, namely part (i) of the preceding Theorem:

  g(f(x)) = (x^r)^(1/r) = x^(r * 1/r) = x^1 = x


We finish up by filling in the gap for negative exponents.

Corollary (Power Rule for All Real Exponents)

If we extend the definition f(x) = x^r to r < 0 by allowing
the limit of a general sequence of rational numbers r_i --> r,
then the power rule and other properties continue to hold, the
only difference worth mentioning is that when r < 0, f(x) is
monotone decreasing.

Proof:  Since the limit of reciprocals of a sequence converging
to a nonzero limit is the reciprocal of that limit, the result
of defining:

  f(x) = limit   x^r_i
        i --> oo

for a sequence of negative rational numbers converging to r < 0
is the same as:

   limit   x^r_i  =    limit   x^-|r_i|  
  i --> oo            i --> oo

                  =   1 / limit   x^|r_i|
                         i --> oo

                  =   1 / x^|r|

so as before we can take the derivative of f(x) by applying the
simplified quotient rule:

  f'(x) = -|r| * x^(|r|-1) / x^(2|r|)

        =  r * x^(|r| - 2|r| - 1)

        =  r * x^(-|r| - 1)

        =  r * x^(r-1)
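The chain of algebra above is easy to confirm numerically: the
quotient-rule form and the final power-rule form agree, and both
match a difference quotient.  The test values r = -sqrt(2), x = 3,
and h below are arbitrary choices of mine.

```python
import math

r = -math.sqrt(2)  # negative irrational exponent
x = 3.0

# The quotient-rule form and the final power-rule form agree:
quotient_form = -abs(r) * x ** (abs(r) - 1) / x ** (2 * abs(r))
power_rule    = r * x ** (r - 1)

# And both match a difference quotient for f(x) = x^r:
h = 1e-6
numeric = ((x + h) ** r - x ** r) / h
```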

upstartaudio-ga rated this answer:5 out of 5 stars and gave an additional tip of: $25.00
Thank you for the insightful and thorough answer to my question.

Subject: Re: Chicken and Egg: Calculus and the Binomial Theorem
From: mathtalk-ga on 19 Dec 2004 19:04 PST
To prove the well-known rule of derivatives:

d(x^r)/dx = r * x^(r-1)

we must define a function f(x) = x^r on a suitable domain
and establish certain facts about it.

For example, what do we mean by 4^(1/2)?  While 4 has two
square roots, 2 and -2, we would need to define f(x) in a
way that assigns only one value.  A standard choice would
be to use the positive root, but if we ask about negative
values of x, it's unclear that any "standard choice" can
be made for fractional r.  Also, for negative exponents r,
there is even a problem with x = 0.

So let's define f on the domain of positive real numbers.
Note that x = 0 is excluded.

The outline of our development is to define and prove the
properties of f(x) = x^r for successively larger sets of
exponents r:

Case 1: r an integer
Case 2: r is rational
Case 3: r is real

The properties we want to establish are these:

(i)   The function f(x) = x^r is positive, and:

        strictly monotone increasing if r > 0,
        constant if r = 0, and
        strictly monotone decreasing if r < 0.

(ii)  The function f(x) = x^r is continuous and unless
      r = 0 has a continuous inverse on the same domain.

(iii) The function f(x) = x^r is differentiable, and:

        f'(x) = r * x^(r-1).

We shall not fill in every detail, but aim to give as much
as necessary to make clear that no circular reasoning is
involved.  I'll emphasize details involving derivatives or
proving differentiability.  Basic results about limits and
continuity will be assumed or at the least given a deferred
treatment.  In particular I think it's fairly evident that
typical epsilon/delta arguments do not depend on appeals to
the Power Rule or the Binomial Theorem.

Case 1:  f(x) = x^r for integer exponents r

Our main tool for these exponents is proof by induction
to handle all the nonnegative integers.  Negative values
of r will then be treated as reciprocals of the positive
cases.

Basis cases

For r = 0, define f(x) = x^0 as the constant function 1.
This positive function is continuous, with derivative:

  f'(x) = 0 = 0 * x^(0-1)

For r = 1, we have f(x) = x.  It is positive because of
the domain restriction, strictly increasing, continuous,
equal to its own inverse, and differentiable, with:

  f'(x) = 1 = 1 * x^(1-1)

Now define f(x) for larger exponents r inductively using:

  x^(r+1) = x * (x^r)

Note that the basis steps for (i)-(iii) are dealt with,
so we have only the induction steps to do in each part.

Induction part (i)

For (i) the induction step is to use prior case x^r > 0
and x > 0 to conclude:

  x^(r+1) = x * (x^r) > 0

ie. the product of positive numbers is again positive.

We also prove that f(x) = x^(r+1) is strictly monotone
increasing.  Suppose that x > y > 0.  By the induction
hypothesis x^r > y^r.  Then:

  x^(r+1) = x * (x^r) > x * (y^r) > y * (y^r) = y^(r+1)

Induction part (ii)

For (ii) the induction step requires a lemma that the
product of two continuous real functions is continuous,
which I can supply if desired.  It depends on knowing
that a limit of a product exists when each factor has
a limit, a fact we will also need for the derivatives.

There are some fine details in showing f(x) = x^(r+1)
has a continuous inverse, and if we were not hurrying
on to noninteger exponents, I would linger over them.

The strict monotonicity of f(x) implies that it is 1-1.
We must also show that f(x) is onto the positive real
numbers.  Let z > 0 be a real number; either z < 1, or
z = 1, or z > 1.  Now f(1) = 1 by induction, and the
cases z < 1 and z > 1 are symmetric, so I'll do one of
them and leave the other as an exercise.

Suppose z > 1.  Then z^(r+1) > z > 1 because z^r > 1,
and in other words f(z) > z > f(1).  The Intermediate
Value Theorem then implies there exists x between z
and 1 such that f(x) = z.  The case z < 1 would be
argued similarly, reversing the inequality directions
as necessary.  Together these establish that f(x) is
onto and thus has a functional inverse (f^-1)(x).
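The Intermediate Value Theorem argument is constructive in spirit:
bisection between 1 and z will actually locate the x with x^n = z.
A small sketch along those lines (the function name nth_root and the
tolerance are my own illustrative choices, not from the answer):

```python
def nth_root(z, n, tol=1e-12):
    """Find x > 0 with x^n = z by bisection, using only the facts
    above: f(x) = x^n is continuous and increasing, and f(1) = 1,
    so the root lies between 1 and z."""
    lo, hi = (z, 1.0) if z < 1 else (1.0, z)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** n < z:     # monotonicity: root lies above mid
            lo = mid
        else:                # root lies at or below mid
            hi = mid
    return (lo + hi) / 2
```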

A further observation is that strict monotonicity of
(f^-1) follows from that of f, because f(x) > f(y)
is only consistent with x > y (ie. x = y or x < y lead
to a contradiction).

Then continuity of (f^-1) follows easily enough. Let
epsilon > 0 be given, small enough so that interval
(x - epsilon,x + epsilon) contains only positive real
numbers around some fixed x > 0.  Then:

  I = (f(x - epsilon),f(x + epsilon))

is an open interval containing f(x).  Hence delta > 0
exists such that (f(x) - delta, f(x) + delta) is
contained in I, and by monotonicity:

  0 < | y - f(x) | < delta

implies:

  | (f^-1)(y) - x | < epsilon

This shows that (f^-1) is continuous at f(x), which is
a typical point in its domain (of positive real numbers).

Induction part (iii)

For doing our induction on (iii), we need as a lemma
the product rule of differentiation:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Lemma  Let F(x) = G(x)*H(x) and G,H be differentiable.

Then F is differentiable and:

  F'(x) = G(x)*H'(x) + G'(x)*H(x)


Proof:  By definition of the derivative:

  F'(x) =   lim   (F(x+h) - F(x))/h
           h -> 0

provided this limit can be shown to exist.

A standard trick of adding and subtracting the
same term:

  F(x+h) - F(x) = G(x+h)*H(x+h) - G(x)*H(x)

                =  G(x+h)*H(x+h) - G(x+h)*H(x)
                  + G(x+h)*H(x) - G(x)*H(x)

tells us that:

 F(x+h) - F(x)     G(x+h)*H(x+h) - G(x+h)*H(x)
 -------------  =  ---------------------------
       h                        h

                     G(x+h)*H(x) - G(x)*H(x)
                   + -----------------------
                                h
The left hand side will have a limit as h tends
to zero if both terms on the right have limits
as h tends to zero, and the limit of the left
hand side will be the sum of the two respective
limits of terms on the right.

The limit of the first of these terms is this:

  lim   ( G(x+h)*H(x+h) - G(x+h)*H(x) )/h
 h -> 0

   =  lim    G(x+h)  *  (H(x+h) - H(x))/h
     h -> 0

Here we use that standard result, that the limit of a
product is equal to the product of the limits of its
factors.  The first factor has a limit:

  lim   G(x+h)  =  G(x)
 h -> 0

because G is continuous at x.  The second factor has
a limit because H differentiable means:

  lim   (H(x+h) - H(x))/h  =  H'(x)
 h -> 0 

Therefore the first term on the right hand side above
tends to G(x)*H'(x) as h goes to 0.

The second term on the right hand side is actually
easier, as the common factor H(x) is constant with
respect to the limit on h, and thus it tends to
a limit of G'(x)*H(x) as h goes to 0.

Combining these two limits gives that:

  lim   (F(x+h) - F(x))/h
 h -> 0

exists and equals G(x)*H'(x) + G'(x)*H(x).  QED

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Now we apply the product rule lemma we just proved
to show that derivative exists (and the power rule
holds) for positive integer r by induction:

  f(x) = x^(r+1) = x * x^r

  f'(x) =  x * (r * x^(r-1)) + 1*(x^r)

        =  r * x^r  +  x^r
        = (r+1) * x^r
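The induction here is mechanical enough to mimic in code.  A minimal
"dual number" whose multiplication encodes exactly the product rule
just proved reproduces the power rule by repeated multiplication.
(The Dual class and power helper are illustrative names of my own,
not part of the original answer.)

```python
class Dual:
    """A value paired with its derivative; multiplication applies
    the product rule  F' = G*H' + G'*H  proved in the Lemma."""
    def __init__(self, val, der):
        self.val, self.der = val, der

    def __mul__(self, other):
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

def power(x, n):
    """x^n built only from repeated multiplication, mirroring the
    induction step  x^(r+1) = x * x^r."""
    result = Dual(1.0, 0.0)   # basis case r = 0: x^0 = 1, derivative 0
    xd = Dual(x, 1.0)         # basis case r = 1: f(x) = x, derivative 1
    for _ in range(n):
        result = result * xd
    return result

p = power(2.0, 5)
# value 2^5 = 32, derivative 5 * 2^4 = 80, with no explicit power rule
```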

Negative integer exponents

Having established the cases of r = 0 and positive
exponents r, there remain the cases r < 0:

  x^(-r) =  1/(x^r)

Obviously the reciprocal of a positive number is also
positive.  Furthermore if x > y > 0, where previously
we showed:

  x^|r| > y^|r|

it now follows (for r < 0):

  x^r = 1/x^|r| < 1/y^|r| = y^r

so that f(x) is strictly monotone decreasing.

The reciprocal of a _nonzero_ continuous function is
continuous, so that part of (ii) holds for r < 0 too:
the composition of continuous functions is continuous,
and the function 1/x is continuous on our domain.

Furthermore 1/x is its own (continuous) inverse, so
that the inverse of f(x) = x^r is continuous as well.

Finally we show differentiability of f(x) = x^r and
that it satisfies the power rule.  One can consider
f(x) as the composition of two functions, x^|r| and
1/x, and apply the chain rule. Alternatively we can
prove and apply a simplified quotient rule:

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Lemma  Let F(x) = 1/H(x) with differentiable H > 0.

Then F is differentiable and:

  F'(x)  =  - H'(x)/(H(x))^2


Proof:  We can be a little more concise in presenting this
proof.  Again rewrite the limit of the difference
quotient that defines the derivative F'(x) until we
get the desired result:

                  (1/H(x+h)) - (1/H(x))
  F'(x)  =  lim   ---------------------
           h -> 0           h

                   H(x) - H(x+h)          1
         =  lim   --------------- * -------------
           h -> 0        h           H(x)*H(x+h)

         =     - H'(x) * 1/(H(x))^2     QED

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

[Note that this simplified quotient rule, having only
a constant numerator, will combine with the product rule
proved earlier to give the full quotient rule.]
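As a spot check of that bracketed remark, the snippet below assembles
the full quotient rule from the product rule plus the simplified rule,
(G/H)' = G'*(1/H) + G*(-H'/H^2), and compares the result against a
difference quotient.  G, H, and the test point are arbitrary choices
of mine.

```python
import math

def G(x):  return math.sin(x)
def dG(x): return math.cos(x)
def H(x):  return x ** 2 + 1    # positive, so 1/H is defined
def dH(x): return 2 * x

x, h = 1.3, 1e-6

# Product rule applied to G * (1/H), using the simplified quotient
# rule for the (1/H)' factor:
assembled = dG(x) / H(x) - G(x) * dH(x) / H(x) ** 2

# Direct difference quotient for G/H:
numeric = (G(x + h) / H(x + h) - G(x) / H(x)) / h
```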

Now apply this to a case of negative integer exponent r:

  f(x) = 1/x^|r|,  where r = -|r|
  f'(x) = -|r| * x^(|r|-1) / (x^|r|)^2
        = -|r| * x^(|r| - 1 - 2|r|)
        = -|r| * x^(-|r| - 1)
        = r * x^(r-1)

This completes the treatment of the integer exponents.

regards, mathtalk-ga
