Hi, krackmonk-ga:
I find that I've written a rather lengthy answer, because your
Question encompasses not only the general solution but a request for
the particular solution of two kinds of examples and some suggestions
for software implementation.
Not having worked with you before, I didn't have a terribly clear idea
about the level of writing to aim for, so forgive please any omission
of detail (or inclusion of excessive detail) you may see here.
I'll be glad to Clarify anything that is unclear.
-----------------
Table of Contents
-----------------
I. Introduction
Problem statement, assumptions, and precis of results
II. Theoretical discussion
Proof of existence/uniqueness results and references
III. Examples
Case of two Gaussian functions; case of two piecewise
linear functions; and references
IV. Counterexamples and generalizations
Cases involving weakening of our assumptions
V. Computational Outline
An approach to a software implementation
VI. Summary
Overview of the Answer
I. Introduction
===============
Given a pair of continuous nonnegative functions y(x),g(x) defined
over all real numbers such that for all values of k, the indicated
improper integrals are finite, and assuming that:
(*) there exist real numbers x0 < x1 such that y(x0),g(x1) > 0
then there exists a unique real number k such that:
INTEGRAL x y(x) dx OVER (-oo,k) + INTEGRAL x g(x) dx OVER (k,+oo)
k = -------------------------------------------------------------------
INTEGRAL y(x) dx OVER (-oo,k) + INTEGRAL g(x) dx OVER (k,+oo)
The solution k is shown to be the root of an equation F(k) = 0, where
the function F satisfies the simple second order differential
equation:
F"(k) = y(k) - g(k)
subject to any of a variety of initial value or boundary value
conditions that may be chosen for computational convenience. For
example, the initial conditions at k = 0:
-F(0) = INTEGRAL x y(x) dx OVER (-oo,0) + INTEGRAL x g(x) dx OVER (0,+oo)
F'(0) = INTEGRAL y(x) dx OVER (-oo,0) + INTEGRAL g(x) dx OVER (0,+oo)
Explicitly we then have the general expression, derived from those
initial conditions at k = 0:
F(k) = INTEGRAL ( INTEGRAL (y(x) - g(x)) dx OVER (0,t) ) dt OVER (0,k)
- INTEGRAL x y(x) dx OVER (-oo,0) - INTEGRAL x g(x) dx OVER (0,+oo)
+ k*(INTEGRAL y(x) dx OVER (-oo,0) + INTEGRAL g(x) dx OVER (0,+oo))
The function F is strictly monotone increasing everywhere, so its
root, which exists by a standard continuity arugment (intermediate
value theorem), is unique.
In the particular case that y(x) and g(x) are scaled Gaussian
distributions, F(x) can be expressed in terms of the "error function"
erf(x). However no closed formula for its root is available; one is
still compelled to find the root by a numerical technique.
In the particular case that y(x) and g(x) are piecewise linear
functions, F(x) is a continuous piecewise cubic function. Then root k
is expressible by radicals ("in closed form") in terms of the
coefficients of the cubic polynomial for F on the appropriate
subinterval.
We give a couple of counterexamples to show that assumption (*) cannot
be weakened to allow x0 > x1 or be replaced by an assumption that the
denominator is always positive. However the continuity condition on
y(x) and g(x) could be dropped (retaining nonnegativity and
integrability) if assumption (*) is restated for this purpose.
Finally we outline a computational approach to solving the problem,
blending numerical methods with our analytical insights.
II. Theoretical Discussion
==========================
We begin by rewriting the problem, subject to an assumption that the
denominator is nonzero. We will justify this assumption soon, showing
in particular that assumption (*) on y(x) and g(x) guarantees the
denominator is always positive.
Multiplying both sides by the denominator gives:
k (INTEGRAL y(x) dx OVER (-oo,k) + INTEGRAL g(x) dx OVER (k,+oo))
= INTEGRAL x y(x) dx OVER (-oo,k) + INTEGRAL x g(x) dx OVER (k,+oo)
Any solution k to the original problem will also satisfy this.
Group terms involving y(x) on one side and terms involving g(x) on the other:
k * INTEGRAL y(x) dx OVER (-oo,k) - INTEGRAL x y(x) dx OVER (-oo,k)
= INTEGRAL x g(x) dx OVER (k,+oo) - k * INTEGRAL g(x) dx OVER (k,+oo)
and regard the left-hand side (resp. the right-hand side) as defining
a function Y(k) (resp. G(k)):
Y(k) = k * INTEGRAL y(x) dx OVER (-oo,k) - INTEGRAL x y(x) dx OVER (-oo,k)
G(k) = INTEGRAL x g(x) dx OVER (k,+oo) - k * INTEGRAL g(x) dx OVER (k,+oo)
If we regard the difference of these two functions as defining a third
function, then the value of k which we seek is the root of that:
F(k) = Y(k) - G(k) = 0
It turns out that if y(x) and g(x) are integrable over the indicated
semi-infinite intervals, then F(k) will be a continuous function.
With some assumptions on where y(x), g(x) > 0 it will be possible to
prove that F(k) has a unique root by a continuity argument. For the
moment however let's discuss the differentiability of F(k) under an
assumption of continuity for y(x) and g(x).
Denote differentiation with respect to k by ', and we have:
Y'(k) = INTEGRAL y(x) dx OVER (-oo,k) + ( k * y(k) - k * y(k) )
= INTEGRAL y(x) dx OVER (-oo,k)
G'(k) = ( -k * g(k) + k * g(k) ) - INTEGRAL g(x) dx OVER (k,+oo)
= - INTEGRAL g(x) dx OVER (k,+oo)
For the rule on differentiating with respect to the upper limit of
integration, note that this is just the familiar:
[Second Fundamental Theorem of Calculus]
http://www.math.poly.edu/courses/ma1112/Antidifferentiation.pdf
The rule for differentiating with respect to a lower limit of
integration is then just a change in sign, since switching the limits
of integration reverses the sign of the definite integral.
Let us carry the differentiation one step further:
Y"(k) = y(k)
G"(k) = g(k)
and therefore in terms of the difference F(k) = Y(k) - G(k):
F"(k) = y(k) - g(k)
So the function F(k) is determined by this relationship, together with
a couple of constants of integration. For example, we might specify F
and F' at some convenient point k0 (an initial value problem for F),
or we might instead specify F at two distinct points (a boundary value
problem for F).
Our ability to specify such values for F(k) is the result of being
able to evaluate Y(k) and G(k) and their derivatives at any value of k
we like. For example:
F(k) = Y(k) - G(k)
= k * (INTEGRAL y(x) dx OVER (-oo,k) + INTEGRAL g(x) dx OVER (k,+oo))
- INTEGRAL x y(x) dx OVER (-oo,k) - INTEGRAL x g(x) dx OVER (k,+oo)
Note also that:
F'(k) = Y'(k) - G'(k)
= INTEGRAL y(x) dx OVER (-oo,k) + INTEGRAL g(x) dx OVER (k,+oo)
which is the same as the denominator in your original problem formulation.
Thus when this denominator is positive, F is strictly monotone
increasing. Nonnegativity of y(x),g(x) implies that F is weakly
monotone increasing everywhere. We are now at the point where the
assumption (*) can be exploited to show the existence and uniqueness
of root k.
Recall the individual definitions of Y(k) and G(k):
Y(k) = k * INTEGRAL y(x) dx OVER (-oo,k) - INTEGRAL x y(x) dx OVER (-oo,k)
G(k) = INTEGRAL x g(x) dx OVER (k,+oo) - k * INTEGRAL g(x) dx OVER (k,+oo)
and their derivatives:
Y'(k) = INTEGRAL y(x) dx OVER (-oo,k)
G'(k) = - INTEGRAL g(x) dx OVER (k,+oo)
The differentiability of Y(k) and G(k), together with the
nonnegativity of y(x),g(x), tell us at least that Y(k) is weakly
monotone increasing and G(k) is weakly monotone decreasing.
In order for the improper integrals to exist, it must be the case that:
0 = LIMIT Y(k) as k --> -oo
0 = LIMIT Y'(k) as k --> -oo
0 = LIMIT G(k) as k --> +oo
0 = LIMIT G'(k) as k --> +oo
This can be seen immediately in the special case that y(x),g(x) are
zero outside some bounded interval [-M,+M]. I omit the details of
argument for the general case, but would be happy to supply these on
request.
Now the continuity of y(x) and the assumption y(x0) > 0 imply that
there exists an interval containing x0 on which y(x) is strictly
positive, and so for some e0 > 0 that:
Y'(x0) = INTEGRAL y(x) dx OVER (-oo,x0)
>= e0 > 0
In fact since Y'(k) is itself weakly monotone increasing (as the
integral of y(x)), we have:
0 < Y'(x0) = e0 < Y'(k) for all k > x0
Therefore:
Y'(k) >= 0 everywhere, and
Y'(k) >= Y'(x0) > 0 for k > x0.
A similar argument shows that:
G'(k) <= 0 everywhere, and
G'(k) <= G'(x1) < 0 for k < x1.
Since G(k) is strictly monontone decreasing for k < x1 and Y(k) is
weakly increasing there, we have:
F(k) = Y(k) - G(k)
is strictly monotone increasing for k < x1. But by a symmetric
argument based on Y(k) being strictly monotone increasing for k > x0
and G(k) being weakly descreasing there, F(k) is also strictly
increasing for k > x0.
Thus, as x0 < x1, F(k) is strictly increasing on the whole real
numbers. This also shows that the denominator in the original
problem, F'(k), is always positive.
Finally as Y(k) increases without limit as k tends to +oo, and the
limit of G(k) there is zero:
+oo = LIMIT F(k) as k --> +oo
and on the other hand, as G(k) increases without limit as k tends to
-oo, where the limit of Y(k) is zero:
-oo = LIMIT F(k) as k --> -oo
Thus by a standard argument:
[Intermediate Value Theorem - PlanetMath]
http://planetmath.org/encyclopedia/IntermediateValueTheorem.html
the continuous function F must be equal to zero someplace between its
negative and positive values. Furthermore the strictly increasing
property of F guarantees us this root is unique.
It is a standard result in ordinary differential equations that the
second order ODE:
F"(k) = y(k) - g(k)
requires two additional "scalar" conditions to uniquely determine F,
much as with a single indefinite integration in which there is one
undetermined "constant of integration". See here for discussion:
[Second-Order Linear ODE]
http://oregonstate.edu/dept/math/CalculusQuestStudyGuides/ode/second/so_lin/so_lin.html
The initial conditions for F at k=0 are likely the simplest to write:
F(0) = Y(0) - G(0)
= - INTEGRAL x y(x) dx OVER (-oo,k) - INTEGRAL x g(x) dx OVER (k,+oo)
Note that with k=0, two of the terms in F(k) drop out here.
F'(0) = Y'(0) - G'(0)
= INTEGRAL y(x) dx OVER (-oo,0) + INTEGRAL g(x) dx OVER (0,+oo)
In principle we can determine initial conditions for F at any point k,
and the computational procedure suggested below assumes that it is
expeditious to choose them close to the location of the root. However
F might also be defined, analytically as well as computationally, by
specifying two boundary conditions, eg.
F(k0) = f0 ; F(k1) = f1 for distinct k0,k1
Sucb an approach might be attractive when y(x),g(x) are given
analytically, because of the potential for reusing evaluations of F
done for the purpose of bracketing the root k.
III. Examples
=============
It may be helpful to picture the solution k, not only as the root of
the function F, but rather as the point at which the curve Y, rising
from zero at the left to infinity at the right, intersects the curve
G, falling from infinity at the left to zero at the right.
In any case separately calculating Y(k), which depends only on y(x),
from G(k), which depends only on g(x), makes for a more orderly
presentation.
Let's first consider a case of two "Gaussian" functions. Because
continuous functions:
y(x) = a * e^(0.5*((x-m1)/s1)^2)
g(x) = b * e^(0.5*((x-m2)/s2)^2)
are everywhere positive on the real line, assumption (*) is
automatically satisfied.
It is well-known that the first integral of a Gaussian function, which
would be the cumulative distribution function in the case of a
probability distribution, cannot be expressed in terms of elementary
transcendental functions. Instead a "new" function erf(x), the "error
function", is introduced to express the resulting indefinite integral.
Fortunately the integral of the error function erf(x) can be
expressed rather easily in terms of itself and elementary
transcendental functions:
[Erf - Mathworld]
http://mathworld.wolfram.com/Erf.html
erf(k) = (2/sqrt(pi)) * INTEGRAL e^(-x^2) dx OVER (0,k)
INTEGRAL erf(k) dk = k erf(k) + (1/sqrt(pi)) e^(-k^2)
Note also that erf(k) is an odd function, erf(-k) = -erf(k), and that in the limit:
erf(+oo) = 1
erf(-oo) = -1
which is the rationale for introducing the funny factor of 2/sqrt(pi).
For simplicity's sake let's take function y(x) = e^(-x^2), and take
g(x) = c y(x) to be a scaled version of the same, for 0 < c < 1.
Then:
Y'(k) = INTEGRAL y(x) dx OVER (-oo,k)
= (sqrt(pi)/2) * ( erf(k) - erf(-oo) )
= (sqrt(pi)/2) * ( erf(k) + 1 )
Y(k) = k * INTEGRAL y(x) dx OVER (-oo,k) - INTEGRAL x y(x) dx OVER (-oo,k)
= k * (sqrt(pi)/2) * ( erf(k) + 1 ) - INTEGRAL x e^(-x^2) dx OVER (-oo,k)
= k * (sqrt(pi)/2) * ( erf(k) + 1 ) + (1/2) * e^(-k^2)
Futhermore:
G'(k) = - INTEGRAL g(x) dx OVER (k,+oo)
= -c * INTEGRAL y(x) dx OVER (k,+oo)
= -c * (sqrt(pi)/2) * ( erf(+oo) - erf(k) )
= -c * (sqrt(pi)/2) * ( 1 + erf(-k) )
= -c * Y'(-k)
In hindsight the above is just what we should have expected, given
g(x) = c y(x) and the "backward" nature of the range of integration
for G(k). Therefore we state without proof (details on request) the
corollary:
G(k) = c * Y(-k)
Thus the equation we have to solve for k is this:
Y(k) = G(k) = c * Y(-k)
Therefore:
k * (sqrt(pi)/2) * ( erf(k) + 1 ) + (1/2) * e^(-k^2)
= c * [ -k * (sqrt(pi)/2) * ( erf(-k) + 1 ) + (1/2) * e^(-k^2) ]
k * sqrt(pi) * [(1-c)*erf(k) + 1 + c] + (1-c) * e^(-k^2) = 0
Except for the special value c = 1, which yields immediately k = 0, no
closed form solution of this equation can be given, although its
dependence on 0 < c < 1 can be separated out as follows:
(1-c) * [k * sqrt(pi) * erf(k) + e^(-k^2)] = -k * sqrt(pi) * (1+c)
e^(-k^2) 1+c
erf(k) + ---------------- = - ----- (k < 0)
k * sqrt(pi) 1-c
Although it's fairly easy to see that the right-hand side here ranges
monotonically from -1 to -oo as c rises from 0 toward 1, it's not
quite as easy to figure out that the left-hand side ranges in the same
monotonic manner as k rises from -oo toward 0. But differentiation of
the left-hand side with respect to k yields (after simplification):
e^(-k^2)
- ----------------
k^2 * sqrt(pi)
This expression is negative everywhere it is defined, in particular
for k < 0, which implies the monotonicity hidden above. Thus
implicitly every value of c in (0,1) corresponds to a unique value of
k.
Taking c = 1/2 for the sake of definiteness, we should solve for k < 0:
e^(-k^2)
erf(k) + ---------------- = - 3
k * sqrt(pi)
Using Excel spreadsheet functions I found approximately k = -0.1951825.
What about the limiting values of c = 1 and c = 0? We already noted
above that for c = 1, which is the case that y(x) = g(x), a solution k
= 0 is obtained. This solution is valid, as we see by inspection from
the symmetry in the original problem formulation. The other limiting
case c = 0 might formally be said to have limiting solution k = -oo,
but this is not actually a valid solution because it produces division
by zero in the original problem. [See more about the lack of a
solution when g(x) identically zero in the next section.]
Let's now turn to the case when y(x) and g(x) are defined as piecewise
linear functions. If these functions are to be continuous and
integrable per the original problem statement, then they will be zero
outside some bounded interval. Within this interval the definitions
may be stated, using the terminology of interpolation theory, by
choosing distinct points {x_i} there (called "knots"):
x_0 < x_1 < . . . < x_n
and assigning values {y(x_i)}, {g(x_i)} with the convention that:
y(x_0) = g(x_0) = y(x_n) = g(x_n) = 0
and that for x_{i-1} < x < x_i, the values of y(x) and g(x) are given
by linear interpolation between their respective values at the knots
x_{i-1} and x_i.
[Note that allowing for y(x) and g(x) to be defined in terms of
separate knots doesn't actually produce any greater generality. Two
sequences of knots can always be merged ("refined") into a common
set.]
Computing a value of Y(k) or G(k) then becomes an exercise in summing
definite integrals over consecutive intervals. In the case of Y(k) we
would begin the intervals of integration with left-hand knot x_0 and
proceed by steps up to k. The case of G(k) is similar except that
we'd proceed downward from x_n to k.
It should be evident from the previous section that Y'(k) is the
indefinite integral of y(x) subject to:
Y'(x_0) = 0
and likewise that G'(k) is the indefinite integral of g(x) subject to:
G'(x_n) = 0
The process of summing consecutive integrals implicitly gives us
Y'(k),G'(k) as piecewise quadratic functions (because they are
indefinite integrals of linear functions over each required interval).
Thus Y(k) and G(k) are each going to be piecewise cubic functions.
We illustrate with a special case chosen not only for simplicity but
for ease of comparison with the results in the Gaussian example. Let
y(x) be defined over the knots
x_0 = -1 ; x_1 = 0 ; x_2 = +1
with y(0) = 1. This function is commonly referred to in numerical
methods as the "hat" or chapeau function.
Explicitly one has:
/ 0 for k <= -1
|
| 1 + x for -1 < k <= 0
y(x) = <
| 1 - x for 0 < k <= +1
|
\ 0 for k > +1
We again define g(x) = c y(x) for values 0 < c < 1.
Integrating y(x) to get Y'(k) produces this continuous piecewise quadratic:
/ 0 for k <= -1
|
| (1 + 2k + k^2)/2 for -1 < k <= 0
Y'(k) = <
| (1 + 2k - k^2)/2 for 0 < k <= +1
|
\ 1 for k > +1
and integrating once more yields our continuously differentiable piecewise cubic:
/ 0 for k <= -1
|
| (1 + 3k + 3k^2 + k^3)/6 for -1 < k <= 0
Y(k) = <
| (1 + 3k + 3k^2 - k^3)/6 for 0 < k <= +1
|
\ k for k > +1
Again the even symmetry of the functions y(x) and g(x) and the scale factor c give:
G'(k) = -c * Y'(-k)
G(k) = c * Y(-k)
Now when we are to solve:
Y(k) = G(k) = c * Y(-k)
with 0 < c < 1, the root may be seen immediately to lie -1 < k < 0.
Therefore we apply the appropriate branches of the definitions for
Y(k). Note that as 0 < -k < +1, it is necessary for G(k) to use the
right definition of Y(k), then substitute -k for k.
(1 + 3k + 3k^2 + k^3)/6 = c * (1 - 3k + 3k^2 + k^3)/6
(1-c) + 3(1+c)k + 3(1-c)k^2 + (1-c)k^3 = 0
Note that if c = 1, the equation degenerates to a linear degree and k = 0.
Taking parameter c = 1/2, we get:
k^3 + 3k^2 + 9k + 1 = 0
This cubic equation clearly has no positive real roots, and any
negative real roots must be irrational, e.g. by the rational roots
theorem:
[Rational Roots Theorem - mathwords]
http://www.mathwords.com/r/rational_root_theorem.htm
Yet because this cubic function goes from being +1 at k=0 to being -6
at k=-1, there must be (at least one) negative real root in between.
To solve this we begin by making the substitution k = t-1:
(t-1)^3 + 3(t-1)^2 + 9(t-1) + 1 = 0
t^3 + 6t - 6 = 0
which serves to clarify (since there's only one sign change) that only
one root lies in -1 < k < 0 (0 < t < 1):
[Descartes' Rule of Signs - Cut The Knot!]
http://www.cut-the-knot.org/fta/ROS2.shtml
Now Cardan's formula for this special form of a cubic (coefficient of
t^2 is zero) yields:
t = cuberoot( 3 + sqrt(17) ) + cuberoot( 3 - sqrt(17) )
k = cuberoot( 3 + sqrt(17) ) + cuberoot( 3 - sqrt(17) ) - 1
Now t is approximately 0.8846222 and k approximately -0.1153778.
Cardan (or Cardano) published a method for solving the special form of cubic:
t^3 + Mt = N
which he learned from Tartaglia ("the stammerer", ne Nicolo of
Brescia), despite promising not to publish it:
[Quadratic, cubic and quartic equations]
http://www-gap.dcs.st-and.ac.uk/~history/HistTopics/Quadratic_etc_equations.html#51
The idea is to treat t = a - b, whence:
(a - b)^3 + 3ab(a - b) = a^3 + b^3
So the problem is reduced to finding a,b such that:
3ab = M and a^3 + b^3 = N
But this can be done just by taking the cube root of a solution to a
quadratic equation, since:
b = M/(3a) implies a^3 + (M/(3a))^3 = N
implies a^6 - Na^3 + (M/3)^3 = 0
after we multiply through by a^3, and thus the usual quadratic formula
can be applied to get a^3.
In this problem it was necessary to solve:
a^6 - 6a^2 + 8 = 0, ie. N = M = 6
a^3 = 3 +/- sqrt(17)
It doesn't matter here which quadratic root is chosen, as both lead to:
t = a - b
= cuberoot( 3 + sqrt(17) ) + cuberoot( 3 - sqrt(17) )
in the end. [NB: (3 + sqrt(17))(3 - sqrt(17)) = -8]
We close this section with an observation that the limiting case c = 0
would lead to:
(1 + k)^3 = 0
which has a triple root k = -1. However this is an invalid solution
in relation to the original problem, for the denominator there would
be Y(-1) = 0, which again illustrates that we cannot allow g(x) = 0
everywhere.
IV. Counterexamples and Generalizations
=======================================
In our first counterexample consider the situation in which all the
positive support for g(x) lies "below" all the positive support for
y(x). That is, if for some z0:
y(x) = 0 for all x <= z0
g(x) = 0 for all x >= z0
then the denominator in the original problem will be zero for k = z0:
INTEGRAL y(x) dx OVER (-oo,z0) + INTEGRAL g(x) dx OVER (z0,+oo) = 0
Thus we cannot weaken our assumption (*) to allow k0 > z0 > k1,
because the "solution" k produced as a root might not only fail to be
unique (because F is no long strictly increasing), but also because
the division inherent in the original problem formulation may now be
undefined (F'(k) may be zero).
For our second counterexample we present nonnegative functions
y(x),g(x) such that the denominator is always positive, yet no
solution exists even to the rewritten version of the problem. Take
y(x) = e^x and g(x) = 0.
The right-hand side of the original problem then evaluates to:
(k-1)e^k / e^k = (k-1)
and clearly no value of k can satisfy k = k-1.
This is just an easily verified case of a systematic problem when g(x)
= 0 for all x. For then the ratio:
INTEGRAL x y(x) dx OVER (-oo,k) + INTEGRAL x g(x) dx OVER (k,+oo)
-------------------------------------------------------------------
INTEGRAL y(x) dx OVER (-oo,k) + INTEGRAL g(x) dx OVER (k,+oo)
simplifies to the expected value of x weighted by a probability
distribution derived from the restriction of y to the range (-oo,k).
Unless all of the "weight" were concentrated at x = k (which cannot
occur with a "continuous" measure), the expected value of x must fall
below the upper threshold k. Thus equality of the ratio with k is
never possible in this case.
Finally we note that continuity of y(x) and g(x) is not essential to
the analysis, although we made expedient use of this condition in
simplifying assumption (*) to be a statement about particular function
values y(x0), g(x1) > 0. When there is no continuity assumption, it
would be necessary to restate the condition as involving a positive
integral for y on some interval lying below one where g has a positive
integral. Despite the fact that y,g are not continuous, their
integrability implies continuity of F' and hence of F.
V. Computational Outline
========================
For the purpose of discussing a software implementation, we'll assume
that y(x) and g(x) are input as continuous piecewise linear functions,
using three related arrays: one for the "knots" (interpolation points)
in the sense discussed under the Examples section, and one apiece for
the respective values of functions y and g at those knots.
An efficient implementation to find k such that Y(k) = G(k) must avoid
evaluating the integrals that define these functions unnecessarily.
To reach Y(k) we must evaluate at least once the integrals over all
the intervals that lie to the left (below) k, and similarly for G(k)
all the intervals to the right (above) k.
Yet when k is unknown, can we guarantee to do this and no more? A
strategy that accomplishes this is to evaluate the function Y starting
at Y(x_0) = 0 and function G starting at G(x_n) = 0. We then proceed
to extend the evaluation of Y one interval from the left or G one
interval from the right depending on whether at the current "knots":
Y(x_i) < G(x_j) OR Y(x_i) > G(x_j)
If the former is the case, then we should extend the definition of Y
over interval [x_i,x_{i+1}]. If the latter is the case, then we
should extend the definition of G over interval [x_{j-1},x_j].
In the case of a tie, which will of course occur at least once (at the
outset), we could impose an arbitrary rule for which side to extend
from. Since the functions are both strictly monotone increasing once
we reach their "support" regions, the tie cannot be sustained
indefinitely. It might be attractive to state the rule for ties as
extending from both sides, as this would retain a symmetric
disposition.
In any case the strategy efficiently narrows down the interval which
contains root k, for in that interval we will have G greater than Y at
the left-hand endpoint and Y greater than G at the right-hand
endpoint.
Once this single interval is determined, and the local polynomials for
Y and G determined there through their coefficients, it will be
simplest to code a Newton iteration to identify the root k. For one
thing the evaluation of the derivatives will already have been
"subsidized" by the computation of function values for Y and G. Also
the derivatives cannot vanish (the functions are strictly monotonic in
the region of interest), but higher order coefficients may. So a
symbolic solution would need to account for degenerate polynomial
degrees in a way that the Newton method would not.
Given the non-vanishing (nonzero) derivatives, the accuracy of a
Newton approximation (number of places of precision) will roughly
double with each iteration (quadratic convergence). So there seems to
be every advantage to hanging our hats on this one approach to root
finding in the present case.
A word or two is in order to clarify what is meant by extending the
definition of Y (or G) across an interval. It is a matter of a double
integration, ie. a simple sort of second-order ODE, given the
equation:
Y"(k) = y(k)
over the interval [x_i,x_{i+1}], and the inital conditions at the
left-hand endpoint known by agreement at the "knot" from the previous
interval:
Y(x_i), Y'(x_i)
obtained by evaluating the previous cubic polynomial at this "new"
boundary. These values then enter the computation as "constants of
integration" for the two integrations of y(k):
Y(k) = INTEGRAL [ INTEGRAL y(x) dx OVER (x_i,t) ] dt OVER (x_i,k)
+ Y'(x_i) * (k - x_i) + Y(x_i)
A similar expression could be stated for G(k), emphasizing that the
only essential difference is the direction in which the intervals of
integration proceed (from right to left instead of left to right).
VI. Summary
===========
In the Cartesian method one assumes the existence of a solution to the
problem and follows the implications of that assumption to see where
it leads. If at the end an explicit form of solution is obtained, it
is necessary to go back and verify that the putative solution is
actually one.
Such an approach underlies the analysis above. The relationship F"(x)
= y(x) - g(x) can be justified in a generalized sense even when
continuity of y and g is replaced by continuity of the respective
integrals. What is essential to the existence of a solution is
showing that F is continuous and has both negative and positive
values. Uniqueness of the solution is shown from the strict
monotonicity of F. We were able to deduce existence and uniqueness
from the additional assumption:
(*) there exist real numbers x0 < x1 such that y(x0),g(x1) > 0
When y and g are given in symbolic form, it is natural to attempt to
express F in symbolic form as well. Unfortunately, as our discussion
of the gaussian case showed, not every elementary transcendental
function has an integral that can be expressed in terms of elementary
transcendental functions. Even if a function can be stated in terms of
elementary transcendental functions, there is not usually any way to
express its roots "in closed form".
For a simple example, consider x + e^x = 0.
Computational methods are not necessarily to be rejected even when
theoretical analysis can lead to closed form solution, as with the
case of piecewise linear y(x),g(x). Cardan's formula for the roots of
a cubic equation, though it meets the requirement of an expression in
terms of radicals based on the coefficients of the polynomial, is more
difficult to program and not more accurate than a Newton iteration.
When functions y(x) and g(x) are given in "sampled" form, or otherwise
in piecewise fashion, it is necessary to carry out integrations (to
find initial conditions) in "steps" by the intervals over which their
definitions are revised.
It is proposed to do this for y(x) from the left, ie. on (-oo,k0), and
for g(x) from the right, ie. on (k1,+oo), to obtain values Y(k0) and
G(k1) respectively. Since we are interested in the situation Y(k) =
G(k), and since Y is increasing indefinitely from 0 as k increases
while G does the same as k decreases, we suggest the rule:
if Y(k0) < G(k1), then increase k0 over the next y-subinterval
if Y(k0) > G(k1), then decrease k1 over the next g-subinterval
Such an approach avoids unneeded "step" integrations required to
isolate the intersection of Y(k) and G(k).
regards, mathtalk-ga |