Google Answers Logo
View Question
 
Q: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f ( Answered,   15 Comments )
Question  
Subject: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
Category: Science > Math
Asked by: johnswanstone-ga
List Price: $100.00
Posted: 31 Dec 2004 22:30 PST
Expires: 30 Jan 2005 22:30 PST
Question ID: 449910
An algebraic solution is preferred as I'm going to have to solve this
problem hundreds of thousands of times with different coefficients.

Here are the coefficients:

a = -4*n^4*(p^2+q^2)
b = 4*n^2*p*(m^2*p^2+n^2*p^2+m^2*q^2+n^2*q^2+n^2*r^2)
c = -m^4*p^4-n^4*p^4-2*m^2*n^2*p^4-m^4*q^4-n^4*q^4+
     2*m^2*n^2*q^4-2*m^4*p^2*q^2-2*n^4*p^2*q^2-n^4*r^4-
     2*n^4*p^2*r^2-2*m^2*n^2*p^2*r^2+2*n^4*q^2*r^2+
     2*m^2*n^2*q^2*r^2
d = -4*m^2*n^2*p*r^2*(p^2+2*q^2)
e = 2*m^2*p^2*r^2*(m^2*p^2+n^2*p^2+m^2*q^2+n^2*q^2+n^2*r^2)
f = -m^4*p^4*r^4

where m,n,p,q,r are arbitrary real number constants subject to the
following constraints:

n>0
m>0
r>0
-r<p<r
-r<q<r
p^2+q^2<r^2

Each time I solve this p and q will be different.  r,n,m will change
only every hundred thousand solutions or so.  This is the reason I'm looking for
an exact solution.  Numerical methods will be too slow for the number
of times I need to solve this.

Alternately, each n,m,p,q,r I choose will reduce the coefficients to
specific numbers.  Then in my symbolic math software, I can use a
"factor" command to solve each particular polynomial.  I don't know
what's going on behind the scenes, or if it is fast enough to do a
hundred thousand of them in say 3 to 5 seconds on a modern PC, but if
it is, then a description and implementation of the ?factor? algorithm
in C or C++ would be acceptable as a solution to this problem.

For those of you who are curious I will describe the origin of this problem:

The problem I'm working on involves performing refraction distortion
correction of the path of a laser beam as it passes through a human
eye. The air/eye boundary is modeled in my problem as a hemisphere. 
In actuality it will be an arc of some radius and some extent which
will be determined by the data. see:
http://answers.google.com/answers/threadview?id=449830 The device can
sample data at 256 different points (give or take)
in time for each beam shot into the eye (using optical coherence
tomography).  This process is repeated 512 times (give or take), each
time with the laser beam translated by a few 10's of micro-meters
across the eye. So I have this grid of data say 256 points deep and
512 points across.  My job is create an image of the interior of the
eye from this data.  To do this I place a rectangular grid across the
eye and then for each point <p,q> in the grid I ask what is the <x,y>
value on the surface of the eye that would cause the laser beam to
reflect into <p,q>.  If I get lucky this will correspond to a single
column in my data grid.  Now using the distance between <p,q> and
<x,y> I can find which data sample that corresponds to that <p,q> in
the eye.  If I'm unlucky (which I will be most of the time) I can find
the pair of data columns that bracket the <x,y> point and interpolate
the number I'm looking for from there.

Ok, now here?s the analysis I?ve done to date:

Let?s start with a hemisphere of radius r centered at the origin and
extending from ?r to r on the x-axis and from 0 to r on the y-axis. 
Let?s assume that everything outside the hemisphere has an index of
refraction n and everything inside has an index of refraction m. 
Let?s pick any point <x,y> on the hemisphere and any point <p,q>
inside the hemisphere.

The time it takes light to travel from the point <x,r> to <x,y> is: n/c * (r-y)  
The time it takes light to travel from <x,y> to <p,q> is: m/c *
sqrt((x-p)^2 + (y-q)^2))
And the total time is the sum of these two terms.
Knowing that x^2+y^2=r^2 let?s us substitute and eliminate x in the
prior equation.  Now applying Fermat?s Principle, which is dt/dy = 0
let?s us differentiate the substituted equation of time with respect
to y and set it?s value to 0.  A bit of algebra leads us to the
following equation:

m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) 

Squaring both sides and a bit of algebra leads to:

[(m^2*p^2)/n^2]* y^2/x^2 ? 
[2*(m^2*p*q)/n^2] * y/x +
[2*p] * x + 
[2*q] * y + 
[(m^2*q^2)/n^2] - p^2 - q^2 - r^2 = 0

We still know that:

x^2 + y^2 = r^2

So we have 2 equation and 2 unknowns.  Substituting y = sqrt(r^2 ?
x^2) and squaring both sides again and quite a bit of algebra leads to
the original 6th degree polynomial.

For those you who are adventurous: Continuing with the original
problem description ?

Next I have to repeat this process for each boundary within the eye. 
The problem is that all boundaries other than the first need to be
represented by splines (which are piece-wise polynomials with the
property of being differential in the 2nd degree at the piece-wise
boundaries).  My thoughts are that I could solve for <x,y> on each
polynomial similarly to what I?ve describe above for the hemisphere
and exclude the solution if it is not in the "piece" of the spline
that polynomial defines.  Any thoughts?  Suggestions?

Ok.  Thanks for getting this far. I hope to hear from you soon.

J.

Clarification of Question by johnswanstone-ga on 31 Dec 2004 22:33 PST
Where I say:  

"I ask what is the <x,y> value on the surface of the eye that would
cause the laser beam to reflect into <p,q>."

I really mean:

"I ask what is the <x,y> value on the surface of the eye that would
cause the laser beam to refract into <p,q>."

Request for Question Clarification by mathtalk-ga on 01 Jan 2005 14:05 PST
Hi, johnswanstone-ga:

A sixth degree polynomial has six roots in the complex plane counting
multiplicity.  However your application would suggest you are
interested in only one (real) solution for x.  If this is the case,
how do you distinguish the one real root which is of interest from
those which are "artifacts"?

From your description the parameters p,q vary constantly and
parameters r,m,n vary seldom.  If the parameters p,q vary by slight
amounts from a previous "setting", then it is especially attractive to
use a numerical solution using the prior solution as an initial
approximation.  I believe that if you are only interested in a single
real solution which can be succinctly identified (per my first
paragraph) then a numerical solution will likely outperform any
"exact" symbolic solution in terms of speed.  Indeed, because your
problem appears to have five independent parameters, it is unlikely
that your general case is more special than the general sixth-order
polynomial, whose "symbolic" solution requires operations that provide
solutions to these problems:

  x^6 + x = A
  x^5 + x = B

along with solutions to dozens of polynomial "subproblems".

Again, your description of the underlying application makes it sound
as though for given values of r,m,n, you will vary p,q within the
circular domain:

  p^2 + q^2 < r^2

to solve for corresponding x,y values.  The physical interpretation
suggests that x,y will vary continuously with p,q (while r,m,n are
fixed), and thus that with the possible exception of the first
solution, an excellent initial "guess" for the desired solution is
available as the result of a previous problem.  In the absence of a
"multiple" root, a numerical method such as Newton's will converge
very rapidly starting from a sufficiently accurate initial
approximation.

One further request for clarification:  Is it possible that degree of
the polynomial formulation would be lower if both x,y were retained in
the equation(s)?

However if you are truly interested in knowing just what an "exact"
solution to the stated problem looks like, then I will be happy to
stick with that approach.

regards, mathtalk-ga

Clarification of Question by johnswanstone-ga on 01 Jan 2005 15:38 PST
mathtalk,

Thanks for your reply.  

It is true that I'm only looking for a single real solution to this
problem.  The rest of the solutions are artifacts caused by the
"squaring of both sides" operations I did (at least this is what I
think it is caused by).

I agree that <x,y> will be continious with changing <p,q> and I agree
with your analysis that a numerical solution could be faster for
finding <p+delta, q+delta> from a given <p,q> <x,y> solution.  The
problem is ... how do you find the initial solution?  And I'd really
like to be able to compare the numerical to the algebriac solution.  I
also would like to know how to find the algebriac solution for my own
education.

Discovering which root is the "correct" one for my problem is simple
enough.  Just see which root value makes following equation true:

m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2)


For example, setting:

n=1
m=1.5
r=1
q=0.1
p=0.5

yields the following four real roots:

x=-0.55208096577064258 and y=0.83379050560302892
x=-0.44209591096428252 and y=0.89696778398594745
x=0.68024533659318476 and y=0.73298450327631404
x=0.75808380947988874 and y=0.65215714195618513

plugging them into the above equation shows that x=0.68024533659318476
is the one that I'm really looking for.  All the others don't balance
the equation (which I find to be pretty strange.  Maybe you can expain
it).

John.

Clarification of Question by johnswanstone-ga on 03 Jan 2005 06:17 PST
mathtalk,

Are you working on this?  Is there any progress to report?  Should I
look elsewhere for an answer? hello, is there anybody out there?

john.

Request for Question Clarification by mathtalk-ga on 03 Jan 2005 08:59 PST
Hi, johnswanstone-ga:

I am working on your problem, specifically to find solutions of the
nonpolynomial system directly.

Looking at the "original" formulation:

m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) 

  x^2 + y^2 = r^2

I'm wondering what "solution" you desired for p = q = 0 ?

Is this an exceptional case, for which no solution is possible?   If
I'm parsing the equation correctly, it amounts to 0 = nr.  However for
certain values of m,n there could be a "limiting" solution x = 0, y =
r.

regards, mathtalk-ga

Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:22 PST
Hi, johnswanstone-ga:

When you wrote, "The equation of a circle in terms of an angle, a:

  x = r*sin(a)
  y = r*sin(b)"

did you mean x = r*cos(a), y = r*sin(a), so that x^2 + y^2 = r^2 ?

regards, mathtalk-ga

Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:27 PST
Never mind, I see from the substitution step that you are taking:

x = r*sin(a)

y = r*cos(a)

which makes your angle a the "complement" of my angle t.

regards, mathtalk-ga

Request for Question Clarification by mathtalk-ga on 04 Jan 2005 19:50 PST
Hi, John:

I'm having some doubt about how you've applied Fermat's principle.  We
can agree that to find the path between two fixed points, Fermat's
principle selects the minimum travel time.

However in your application the path from (x,r) to (p,q) only involves
one fixed point, namely (p,q).  The point (x,r) depends on the
variable y.

I'm trying to reconcile the version obtained from Snell's Law with the
"original" formulation.  I'll feel more confident in coding the
solution if I know these are equivalent.

regards, mathtalk-ga

Clarification of Question by johnswanstone-ga on 05 Jan 2005 12:08 PST
Mathtalk,

Given a point <p,q> and applying Fermat's Principle leads to
minimization of a function in time which leads to the roots of a 6th
order polynomial expressed in either x or y (each having different
coefficients of course).

Once either x or y is found the other can be immediatly calculated by
x^2 + y^2 = r^2 and once y is known <r-y> is also known since r is a
constant (i.e. input variable).

In my example:

n=1
m=1.5
r=1
q=0.1
p=0.5

the solution is:

x=0.68024533659318476 and y=0.73298450327631404

which can be verified by:

solving for a using:

x = r * sin(a) 

which gives:

a = 0.74809729181091122

and b using:

b = arcsin( (n/m) * sin(a) )

which gives:

b = 0.47068498620621618

and finally verifying this equation is true:

(x-p) = tan(a - b) * (y-q)

which gives:

0.18024533659318487 = 0.18024533659318503

John.

Request for Question Clarification by mathtalk-ga on 07 Jan 2005 06:24 PST
I see that for your sample problem the Fermat principle and Snell's
Law approach agree on the solution, so I'm assuming that there's an
equivalence there.  But I've been unable to prove it.

On another practical note, what is the angular range of the set of
(p,q) points you will need to solve "at one blow"?  I'm guessing that
a fully dilated pupil/iris will be less than a 30 degree aperture, but
that from your description your field of illumination is intended to
be quite a bit smaller than this.

So the question is whether we are trying to quickly find 256x512
solutions whose angular range is more like 10 degrees or like 1
degree?  Or even less?

regards, mathtalk-ga

Clarification of Question by johnswanstone-ga on 07 Jan 2005 13:28 PST
Mathtalk,

I'm not really sure of what angular field my data encompasses since I
haven't actually fit a circle to it and measured the angle.  But if
you're curious, I can show you a raw data image.

http://www.frontiernet.net/~jfss/JAOSnormal995bk.%20%20%202.BMP

John.

Request for Question Clarification by mathtalk-ga on 08 Jan 2005 08:25 PST
Hi, John:

In the image shown the angular measurement of the outer circular edge
is about 67.5 degrees.  My question has to do with the "patch" of p,q
coordinates for which you want a split-second computation.  Since you
described these points as being a 256x512 point array inside the
radius of the outer circle, I thought they were perhaps a localized
part of the interior, not extending over the entire image.

regards, mathtalk-ga

Clarification of Question by johnswanstone-ga on 09 Jan 2005 21:12 PST
Mathtalk,

Well, My job is to perform refraction correction on the source image.
So the first thing I need to do is correct the refraction distortion
caused by the initial air/eye boundary.  This will correct all the
data inside the cornea (assuming the eye has a uniform index of
refraction) and then perform the correction again for the back surface
of the cornea using the results of the first correction as input.

The 256x512 "patch" was a rough approximation for the number of pixels
in the image I need to correct.

I understand that since the refraction of the light rays will vary
continuously with the distance from the center of the "circle" that
solutions for <p,q> points that are "close" to each other will also be
"close" to each other and that provides for very fast convergence of a
numerical root finding method.  What I don't know how to do is find
the initial solution to start the whole thing off.  How do you plan on
doing that?

John.

Request for Question Clarification by mathtalk-ga on 10 Jan 2005 20:32 PST
Hi, John:

In my experiments convergence was quick even if x = 0 was used as the
starting approximation.  However I don't recommend that because it is
possible, especially if q < 0, for the first step of the method to
"overshoot" the boundary x = 1, so it's prudent to be a little more
careful in arranging the computation.

Here's one approach that I think will work nicely.  Take a closely
spaced "horizontal" line of points (p,q) sharing a common value of q. 
Start with the point where |p| is closest to 0 and work outward, using
x = p for the first point's initial guess and the converged result of
each point as the starting value for the next point.  Since the exact
solution is x = 0 when p = 0, x = p is an excellent starting value
close to the y-axis.

Alternatively one could take a "vertical" line of points and start
from the top (closest to the circle), also using x = p as the first
point's initial guess.  Since the refraction near the "lens surface"
is slight, again this is an excellent approximation.

From what I've tried I'd say the Fermat principle iteration works best
when your intial guesses underestimate x, and the Snell's Law
iteration works best when the initial guesses overestimate x.  The
suggestions above for x = p are of course underestimates (assuming n/m
< 1).  The only thing I'd be worried about in recommending solely
relying on the Fermat formulation is if for some cases you need to
work with |x| close to 1.  However, given the application parameters
discussed earlier (an angular range probably within +45 to -45
degrees), I don't think this is really an issue.

My rough operation count for each Newton iteration is:

  10 multiplications
  11 additions
   3 divisions
   2 square roots

With a bit of thought one or two of the multiplications per iteration
can probably be "amortized" over the convergence sequence, but since
this is at worst two or three iterations with starting values of
modest accuracy, there's not a whole lot to save.

If you need to solve this for every pixel more or less, even a single
Newton iteration at each point is probably suboptimal.  The Newton
iterations can be done at more widely spaced points, filling in
between them with either a linear or an inverse quadratic
interpolation.

If your hardware platform has a floating point unit, the ultimate
performance will be eked out by "coding to the metal" and assigning
floating point registers to hold specific intermediate computations,
so that all or most of them involve only register operands.

regards, mathtalk-ga

Request for Question Clarification by mathtalk-ga on 21 Jan 2005 20:35 PST
Hi, johnswanstone-ga:

Using MAXIMA and PARI/GP, I calculated the Galois group of the sixth
degree polynomial for x which results from the parameters:

  p = 0.5, q = 0.1, n = 1, m = 1.5, r = 1

as in the example you worked with previously.

After scaling one obtains a monic irreducible polynomial:

x^6 - 738*x^5 + 135226*x^4 + 10513152*x^3 - 3735673344*x^2 + 24637221273600

whose roots are an integer multiple 208 times the roots of your original problem.

The Galois group is S_6, the full symmetric group on the six roots. 
Consequently the root x cannot be expressed in terms of only roots and
rational expressions based on the integers.

On a more positive note, I can provide a numerical method which
converges quickly and robustly for every (p,q) in the upper
semi-circle:

  p^2 + q^2 < r^2

          q > 0

implemented in C (for example), subject to the assumption n/m < 1.

regards, mathtalk-ga
Answer  
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
Answered By: mathtalk-ga on 30 Jan 2005 22:10 PST
 
Hi, John:

As you've outlined, a vertical beam of light strikes a point on an
upper semi-circle:

  x + y = r, y > 0

where the index of refraction changes from n above the semi-circle
to m below.  According to Snell's Law, the light will deflect from
a straight path toward the normal line when n < m and away from it
when n > m.

Given your application of correcting images of the eye, as light
passes from air (index of refraction ~ 1) to the cornea, lens, and
aqueous media beyond (index of refraction > 1), we will primarily
treat the case where n < m.

We will refer to ratio k = n/m rather than to the separate indices.

By choosing units of length the radius of the semi-circle can be
assumed r = 1.  Applying Fermat's principle you obtained for point
(p,q) within the upper semi-circle:

  py - qx              
  -------  =  k SQRT((p-x) + (q-y))
     x

which we wish to solve for (x,y).

Polynomial equation and extra roots
-----------------------------------

In deriving a polynomial from this and the equation x + y = 1, a
number of "artifact" roots are introduced, ie. values x satisfying
that polynomial formulation but not the original one.  These extra
roots are added by steps that multiply both sides by nonconstant
polynomials or that square both sides of an equation.  For example
the equation x = 1 becomes x = 1 when both sides are squared and
acquires a "false root" x = -1 in the process.

Nonetheless the desired root x is among those which satisfy:

  ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + f = 0

where:

  a = 4(p + q)(k)

  b = -4((k + 1)(p + q + 1) - 1)kp

  c = (k - 1)(q)
    + 2(((k) + 1)*(p - 1) - (k - 1))q
    + ((k + 1)p) + 2(k + 1)kp + (k)

  d = 4kp(p + 2q)

  e = -2((k + 1)(p + q + 1) - 1)p

  f = (p)

Note that while both odd and even powers of p appear among these
coefficients, only even powers of q are present.  This explains some
of the artifact roots.  Where a solution x for (p,q) in the upper
semi-circle exists, another solution x for mirror-image point (p,-q)
in the lower semi-circle may also exist.

This accounts for an extra root which exists for the sample data:

  p = 0.5
  q = 0.1
  k = (1.0/1.5) = 2/3

While the refracted path through the point on the unit circle:

  x = 0.68024533659318476
  y = 0.73298450327631404

passes through the point (p,q), this artifact solution:

  x = 0.75808380947988874
  y = 0.65215714195618513

passes through (p,-q).

Similarly k only appears in even powers in the polynomial, and thus
the other two real "solutions" actually satisfy for (p,q) or (p,-q):

  py - qx              
  -------  =  -k SQRT((p-x) + (q-y))
     x

It is easy to see that these last artifact roots were introduced by
squaring both sides of the Fermat's principle formulation.

Impossibility of solution by radicals
-------------------------------------

Earlier in an RFC I reported the result of using MAXIMA and PARI/GP to
determine if the particular sixth-degree polynomials produced in this
application can be solved by radicals, ie. by algebraic formulas that
require roots and fields operations acting on the coefficients of the
polynomial.

The result was that it is impossible to do so for the particular case
p = 0.5, q = 0.1, k = 2/3 which you had used, and therefore impossible
in general for these sixth-degree polynomials.

However we did note that for the case q = 0, the resulting polynomial
for x is only cubic and thus can be solved by radicals.

Snell's Law formulation
-----------------------

A third formulation of the problem is possible using Snell's Law.
By a bit of trigonometry:

   p-x
  ----- = tan(a - b)
   q-y

where a is the angle of vertical to the normal on the circle at (x,y)
and b is the angle of refracted path to the same normal.

Snell's Law says that:

  n * sin(a) = m * sin(b)

and from the relations sin(a) = x, cos(a) = y, plus:

  sin(b) = kx

  cos(b) = SQRT(1 - kx)

one can rewrite tan(a - b) = sin(a - b)/cos(a - b) in terms of (x,y):

  p - x         (1 - k)x
  ----- = ---------------------
  q - y   y + SQRT(k - (kx))

If we think about both sides as a difference quotient (slope formula)
involving (x,y) and one other point, this can be interpreted as saying
that (p,q) lies on the straight line connecting the two points:

  (x,y) and (kx, -SQRT(k - (kx)))

Note that while (x,y) lies on the upper semi-circle, the latter point
falls into the lower half-plane because of the negative y-coordinate.

It's possible then to determine for (x,y) on the upper semi-circle the
y-intercept of the refracted path is (0,-f), where:

  f = ( ky + SQRT(k - (kx)) )/(1 - k)

This "focal" f has the limit of k/(1-k) as points (x,y) tend to (0,1)
and another limit k/SQRT(1 - k) as (x,y) tends to (1,0) or (-1,0).
Any point (x,y) on the upper semi-circle will have its refracted path
through (0,-f) for some f intermediate between these limits.

Newton iterations
-----------------

Given a differentiable function F(x) and an initial guess x for a root
of F(x) = 0, we compute successive Newton approximations:

  x  :=   x  -  F(x)/F'(x)

Any of the three formulations could be used as the basis for a Newton
iteration.  The polynomial form would make it easy in some respects to
code the computation of the polynomial and its derivative together in a
common loop over the coefficients.  However the polynomial form has two
drawbacks in this application.  It has artifact roots that we'd need to
be on guard against, and its coefficients require a greater amount of
calculation from (p,q) each time they change than other formulations.

The subject of starting values will be taken up in the next section.

For the moment we will concern ourselves with the calculations needed
for the Newton iteration itself.

The Fermat principle can be reformulated in a variety of ways, even
into the Snell's law form (and vice versa).  So we don't claim that
the particular choice of equation made here is a necessary one.

Clearing denominators and moving all terms to one side produces:

  F(x) = py - (q + kd)x = 0

where d = SQRT((p-x) + (q-y)).  Differentiating with respect to x:

  F'(x) = py' - (q + kd) - kxd'

where y' = -x/y and d' = -(p - x + (q - y)y')/d.

Alternatively we can use the Snell's law formulation in a similar way,
clearing the denominators and grouping all terms on one side:

  G(x) = (p - x)(y + z) - (1 - k)(q - y)x = 0

where z = SQRT(k - (kx)).  Differentiating with respect to x:

  G'(x) = (p - x)(y' + z') - (y + z) - (1 - k)(q - y - y'x)

where y' is as above and z' = -(k)x/z.

Starting values
---------------

Once a Newton iteration is available for a problem that requires modest
computation for an exact derivative, as here, the main difficulty left
is to settle upon initial guesses that are good enough to allow the 
asymptotic quadratic-convergence to quickly appear.

Here we have a couple of reasonable starting values that bracket the
true root which are nearly "free" in terms of computation required.

The point (p,SQRT(1 - p)) would lie on the straight path through (p,q)
if k were 1, ie. if there were no refraction.

On the other hand ( p/SQRT(p+q), q/SQRT(p+q) ) would lie on the line
through (p,q) and the origin, which is the limiting path (the normal line)
if k were 0.

It is easy to show that the functions F(x) and G(x) both have a change
in sign between these two points on the upper semi-circle, thus proving
that a root exists between them in each case.

A simple linear interpolation between these values may be the best choice,
certainly in the case that (p,q) is close to either the semi-circle or to
the y-axis.

One can construct narrower bounds using a different pair of lines passing
through (p,q), at a cost of some additional computation.  For example, if
we take the limiting points (0,-f) found earlier:

  f = k/(1 - k) and k/SQRT(1 - k)

then lines from these two y-intercepts through (p,q) will intersect the
unit semi-circle in a pair of points that more tightly bracket the root
we want.

Another thought here is that instead of using linear interpolation between
the respective values of F(x) or G(x) at such a bracketing pair to obtain
an initial guess for the Newton iteration, we can fairly easily find the
intersection of the two refracted paths from those two points in the lower
half-plane, then project from that "false focus" through (p,q) to a "guess"
on the unit semi-circle.


regards, mathtalk-ga

Request for Answer Clarification by johnswanstone-ga on 01 Mar 2005 14:24 PST
mathtalk,

You said:

On a more positive note, I can provide a numerical method which
converges quickly and robustly for every (p,q) in the upper
semi-circle:

  p^2 + q^2 < r^2

          q > 0

implemented in C (for example), subject to the assumption n/m < 1.

regards, mathtalk-ga

So where is this code?

John.

Clarification of Answer by mathtalk-ga on 02 Mar 2005 20:46 PST
Hi, John:

Quite right, I promised C code and I'll deliver it.  

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 02 Mar 2005 22:27 PST
For starters here's some basic C code to run the Newton iterations for
both the Fermat and Snell's Law formulations.  I picked a fixed error
tolerance, but if it were desirable we could make that a parameter for
both routines.  When p = 0, the explicit answer x = 0 is known, so we
simply return that.

For both routines p,q are the coordinates of the point inside the
circle, normalized by dividing them by the circle's radius as
discussed previously.  In the case of Fermat Principle formulation, k
= n/m.  For Snell's Law we supply k2 as k for convenience.  In both
cases parameter x is a suitable initial guess.

Next I provide a test harness that sets initial values and determines
the speed with which each formulation can solve the equation.

#include <math.h>

double newton_fermat( double p, double q, double k, double x)
{
	double y, dydx, d, dddx, f, dfdx, xd;
	double tol = 1.e-7;   /* tolerance could be a parameter */

	if (p == 0.) return 0.0;

	/*
	    F(x) = py - (q + kd)x = 0
	    where y = SQRT(1 - x) and d = SQRT((p-x) + (q-y))

	    F'(x) = py' - (q + kd) - kxd'
	    where y' = -x/y and d' = -(p - x + (q-y)y')/d
	*/
	while(1)
	{
		y = sqrt(1. - x*x);
		d = sqrt((p-x)*(p-x) + (q-y)*(q-y));
		f = p*y - (q + k*d)*x;

		dydx = -x/y;
		dddx = -(p - x + (q-y)*dydx)/d;
		dfdx = p*dydx - (q + k*d) - k*x*dddx;

		xd = -f/dfdx;
		if (fabs(xd) < tol) return x + xd;
		x += xd;
	}
}

double newton_snell( double p, double q, double k2, double x)
{
	double y, dydx, z, dzdx, g, dgdx, xd;
	double tol = 1.e-7;   /* tolerance could be a parameter */

	if (p == 0.) return 0.0;

	/*
	    G(x) = (p-x)(y+z) - (1-k)(q-y)x = 0
	    where y = SQRT(1 - x) and z = SQRT(k - (kx))

	    G'(x) = (p-x)(y'+z') - (y+z) - (1-k)(q - y - y'x)
	    where y' = -x/y and z' = -(k)x/z
	*/
	while(1)
	{
		y = sqrt(1. - x*x);
		z = sqrt(k2 - (k2*x)*(k2*x));
		g = (p-x)*(y+z) - (1-k2)*(q-y)*x;

		dydx = -x/y;
		dzdx = -k2*k2*x/z;
		dgdx = (p-x)*(dydx + dzdx) - (y+z) - (1-k2)*(q - y - dydx*x);

		xd = -g/dgdx;
		if (fabs(xd) < tol) return x + xd;
		x += xd;
	}
}

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:37 PST
Before writing the C code to construct an initial guess for Newton's
method, let me present the formulas for some of the underlying
concepts.

We kicked around the idea before of an "approximate" focus as light
paths refracted by the upper semi-circle intersect the y-axis.  Using
the Snell's Law formulation, it is possible to express this
y-intercept (0,-f) either in terms of x or y:

  (p - x)(y + SQRT(k - (kx))) - (1 - k)(q - y)x = 0

Since this equation holds for every point (p,q) on the refracted path
from (x,y), we can let p approach zero and q approach -f, and after
factoring out x:

  (y + SQRT(k - (kx))) + (1 - k)(f + y) = 0

A bit of algebra yields:

           ky + SQRT(k - (kx))
    f  =  ------------------------
                 1 - k

It is convenient to introduce constant C = 1/k - 1, allowing us to write:

    f  =  (y + SQRT(y + C))/C

Furthermore this relation can be inverted, giving y in terms of f:

    y  =  (C*f - 1/f)/2

We should bear in mind that y,f > 0 in the context of our problem
(having chosen (0,-f) as the notation for the y-intercept because it
falls below the x-axis).

Also the sign of x =  SQRT(1 - y) is chosen to agree with the sign of p.

Given that we can easily translate between locating a point (x,y) on
the upper semi-circle and the corresponding "focus" (0,-f) on the
lower y-axis, a more flexible approach to finding an initial guess may
be sought.  First note that if we have points (x_a,y_a) and (x_b,y_b)
that "bracket" the solution (x,y) corresponding to (p,q) on the same
side of the y-axis as (p,q), then their "foci" (0,-f_a) & (0,-f_b)
will bracket the "true" intercept (0,-f).  That is:

  if p > 0,   0 < x_a < x < x_b  <==>  0 < f_b < f < f_a

Earlier we discussed one choice of "bracketing" points:

     (x_a,y_a)  =  (p, SQRT(1 - p))

     (x_b,y_b)  =  (p/SQRT(p+q),q/SQRT(p+q))

These are respectively the results of drawing a vertical line and a
radial line through (p,q) and finding the intersection with the upper
semi-circle.

Alternatively we can find where the line from some point (0,-s) on the
lower y-axis through (p,q) intersects the upper semi-circle.  If s =
f, we get the exact solution (x,y) in this way.  Again some formulas
are needed.

Since the slope of this line is m = (q+s)/p, we have to solve:

  y = mx - s  and x + y = 1

being careful to distinguish which quadratic root is on the upper semi-circle.

The answer turns out to be:

  x = (ms  SQRT(m - s + 1))/(m + 1)

with the sign  chosen to agree with p.

Instead of the vertical and radial lines, which would correspond to
choosing s to be +oo or zero, we might use:

  s_0 = k/(1-k)  and s_1 = k/SQRT(1 - k)

which represent the limiting extremes of the y-intercepts.  I.e.,
refracted paths which start close to the y-axis, x = 0, will intersect
the y-axis slightly above (0,-s_0), while paths that start nearly at
the endpoints of the upper semi-circle are refracted more sharply and
intersect slightly below (0,-s_1).

For the sake of illustration, here is how one might construct an
improved initial guess using these ideas.  The Snell's Law formulation
can be restated:

  G(x) = g(s) = p(s+y) - (s+q)x = 0

where y = (C*s - 1/s) and x = sign(p) SQRT(1 - y).  Interpolating
between the values (s_0,g(s_0)) and (s_1,g(s_1)) gives an improved
approximation s for "exact" intercept f, and locating the intersection
of the line (0,-s) through (p,q) with the upper semi-circle gives an
improved initial guess for (x,y).

My goal to calculate an improved initial guess without expending more
effort than roughly that for one Newton iteration, so that we jump
into the mode of its quadratic convergence rate as soon as practical.

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:57 PST
Oops, I left out a division by 2 in the penultimate paragraph of my last note.

A line there should have read:

  where y = (C*s - 1/s)/2 and x = sign(p) SQRT(1 - y)...

instead of:

  where y = (C*s - 1/s) and x = sign(p) SQRT(1 - y).

-- mathtalk-ga
Comments  
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 09:14 PST
 
Progress Report

1. The intuitive notion that (x,y) on the circle of radius r centered
at the origin will depend continuously on the parameters (p,q), a
point in the interior of the disk of that radius, leads to a mapping
of the open disk into the circle.

Such a mapping cannot be both continuous and onto because of
topological considerations.

2. Two simplifications that can be readily introduced are to eliminate
parameter r (by scaling x,y,p,q accordingly) and to combine m,n into a
ratio:

  P = p/r,  Q = q/r,  X = x/r,  Y = y/r

Then X = cos(t), Y = sin(t) and:

  P*cot(t) - Q  = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 )

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 13:11 PST
 
Ooops, I got the tangent confused with the cotangent in that last
equation,  and it should read:

  P*tan(t) - Q  = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 )

since y/x = sin(t)/cos(t) = tan(t).

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 03 Jan 2005 17:00 PST
 
Mathtalk,

The analysis I've posted here deals with minimizing the time it takes
a ray of light to travel from (x,r) to (p,q) according to Fermat's
Principle and it leads to 6th order polynomials no matter which
variable you solve for; x or y.

Another analysis would be a geometrical one as follows:

The equation of a circle in terms of an angle, a:

x = r*sin(a)
y = r*sin(b)

Using the rule of similar angles of a parallel lines we know that a is
also the angle that a ray of light traveling parallel to the y-axis
will strike the surface of the semicircle.

Using Snell's law: n*sin(a) = m*sin(b), where b will be the angle the
light will refract at the boundary of the semicircle.

Using the two point equation of a line through <x,y> and <p,q>: p-x = s * (q-y)
we need to find the slope s which in this case will be express as
run/rise.  Using geometrical reasoning it is known that s = tan(a -
b).

So dividing the line equation by (q-y) gives:

p-x
--- = s
q-y

Substituting for 'x', 'y', and 's' gives:

p-r*sin(a)
---------- = tan(a - b)
q-r*cos(a)

solving Snell's law for b:

b = arcsin(sin(a)/n)

a final substitution gives:

p-r*sin(a)
---------- = tan(a - arcsin(sin(a)/n)
q-r*cos(a)

Which if solved for 'a' would also give me the solution I'm looking for.

If it is impossible to solve this equation or the polynomial
analytically then I'd accept a solution which describes and implements
the "solve by factoring" algorithm used in symbolic algebra packages
such as Mathmatica, Maple, Maxima, or MacSym.  When I execute this
function the computer comes back with all the roots of the polynomial
including the complex roots.  What I'd do then is use this to find the
root that is the solution for my initial <p,q> point and then use the
computed <x,y> as the initial guess for the next <p+delta, q+delta> as
per your earlier suggestion.

I would also accept an analysis of this equation or the polynomial
similar to the one in this paper on page 19
(http://www.math.uab.edu/cl/cl1/details.ps) which guarantees that a
numerical method will always converge to the right solution given a
known starting value.

Let me know if there's anything more I can do to help you.

Thanks,

John.
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 03 Jan 2005 17:19 PST
 
Mathtalk,

Addressing your question about what happens when <p,q> = <0,0>

Under this condition, the time it takes light to travel that path is reduced to:

t = n/c * (r-y) + m/c * r

My original analysis minimized this function by taking the derivitave
with respect to y and setting it to zero, then solving for y.  In this
case that approach doesn't work since taking the derivitive leads to
dt/dy = -n/c which is a constant.  (Remember c here represents the
speed of light in air).  I don't know how to minimize this function
... maybe you do.

John.
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 18:16 PST
 
To minimize:

  t = n/c * (r-y) + m/c * r

with respect to y on the interval [-r,+r], take y = r.

Here the minimum occurs at a boundary point, rather than at an
interior "critical" point such as you might find by setting the
derivative to zero.

I have a minor correction to your alternative derivation; regarding Snell's Law:

  n*sin(a) = m*sin(b)

we obtain:

  b = arcsin( (n/m) * sin(a) )

This suggests a natural limitation on angle a (equivalent to my angle t), that:

  | sin(a) | <= m/n

in order that angle b is defined by this relation.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 04 Jan 2005 05:09 PST
 
Hi, John:

It appears that your problem has a left/right symmetry, so that:

  (x,y) the solution for (p,q)

implies:

  (-x,y) the solution for (-p,q)

but is not up/down symmetric in the corresponding sense.

However the formulation:

  m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2)

implies that p*(y/x) - q is nonnegative, so y/x >= q/p when p > 0.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 04 Jan 2005 09:46 PST
 
Mathtalk,

Going back to the original problem:

t = n/c * (r-y) + m/c * sqrt((x-p)^2+(y-q)^2)

and

x^2 + y^2 = r^2

you can either substitute for y or x.  

Substituting for x gives an equation in y:

t = n/c * (r-y) + m/c * sqrt((sqrt(-y^2+r^2)-p)^2+(-q+y)^2)

Substituing for y gives and equation in x:

t = n/c * (r-sqrt(-x^2+r^2)) + m/c * sqrt((x-p)^2+(sqrt(r^2-x^2)-q)^2)

My problem is to minimize either one (or both) of these equations. 
i.e. Take the derivitive with respect to the remaining variable, set
it equal to zero, and solve for the roots.  Doing it either way leads
to a 6th order polynomial.  Solving the equation for x leads to a
polynomial without the x^1 term.  Solving the equation for y leads to
a polynomial with all the terms.

This equation: 

m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2)

is simply the derivitive of the equation in x set to 0 with y
substituted back in.  This was my first attempt and it turned out to
be quite messy.  I mention this because it seems like your focusing in
on this equation and while solving it would answer my question, I
wanted to point out the other equations as well since they might be
easier to work with.

John.
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 04 Jan 2005 10:21 PST
 
Thanks, John.  I'm posting a mixture of Comments here, both for the
sake of giving a "progress" indicator and for the sake of your
feedback.

My understanding of the physics of the application is that, given a
point (p,q) inside a "crystal ball" you want to know where to shine a
light to illuminate that point.  Outside the ball and inside the ball
the light will travel in a straight line, but at the boundary (x,y)
the light beam is refracted due to the changing speed of light.

I can show a solution exists for p,q > 0, and by left/right symmetry
we can drop the restriction that p > 0.  However I'm expecting that
there are values q < 0 which have no solution.  The physical intuition
is that as a light beam approach the vertical edge x = r, there may be
points (p,q) "below" the path inside the ball that cannot be attained.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 05 Jan 2005 07:33 PST
 
I haven't been able to show directly the equivalence of the two
formulations, but they share a characteristic which I think can be
used to show the equivalence indirectly.  This is that (p,q) may be
replaced in either equation by any point inside the circle on a line
between (x,y) and (p,q).

The Snell's Law formulation:

  (p-x)/(q-y) = tan(a - b)

isolates all the dependence on (p,q) on one side of the equation, as
the angles a,b depend only on (x,y) and the ratio of refraction
indexes.

So adding a multiple of (p-x) to p and the same multiple of (q-y) to q
will leave the equation unchanged.  The situation is only a little
more complicated with the "original" formulation:

p*(y/x) - q = (n/m)*SQRT( (p-x)^2 + (q-y)^2 )

but the corresponding adjustments to (p,q) would proportionally change
both sides of the equation.

An indirect verification of equivalence can therefore be derived by
showing that the slopes of the lines as (p,q) approaches (x,y) are the
same for both equations, e.g. dp/dq = tan(a - b).

*   *   *   *   *   *   *   *   *   *   *   *   *   *   * 

Closer to your original request for an exact solution, I observe that
for the special case q = 0, a cubic equation can be formulated for x
with coefficients in p, r, and k = (n/m):

  p*(y/x) = k*SQRT( p^2 - 2px + r^2 )

After multiplying both sides by x and squaring:

  (p^2)*(r^2 - x^2) = (k^2)*(x^2)*(p^2 - 2px + r^2)

Therefore an "exact solution" for x can be given in this case using
Cardan's formula for roots of a cubic equation.  See for example:

http://answers.google.com/answers/threadview?id=433886

http://answers.google.com/answers/threadview?id=441538

*   *   *   *   *   *   *   *   *   *   *   *   *   *   * 

There would be a very simple exact solution if all of the lines (p,q)
to (x,y) described in the first section of this Comment were to
intersect at a common focus, say (0,f).  For then one has only to
intersect a line through (p,q) and (0,f) with the upper semi-circle
x^2 + y^2 = r^2 to find (x,y).

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 05 Jan 2005 11:17 PST
 
Mathtalk,

There would be a very simple exact solution if all of the lines (p,q)
to (x,y) described in the first section of this Comment were to
intersect at a common focus, say (0,f).  For then one has only to
intersect a line through (p,q) and (0,f) with the upper semi-circle
x^2 + y^2 = r^2 to find (x,y).

I agree that if all the lines (p,q) to (x,y) were to intersect at a
common point then the solution would be easy.  Unfortunatly this is
not the case for general values of n and m.

Your observation that "(p,q) may be replaced in either equation by any
point inside the circle on a line between (x,y) and (p,q)." could be
expoited to reduce the dimensionality of the problem if it could be
reformulated using this knowledge (something like a coordinate
transform or an equation using this property as the basis functions). 
This reduction in dimensionality may then
enable an algebriac solution.  

What do you think?

John.
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 05 Jan 2005 11:56 PST
 
Hi, John:

Yes, in a special case one can reduce the dimensionality to one (e.g q
= 0) and obtain a reduced degree polynomial problem with a "closed
form" solution.

However I still expect that with good initial approximations, the
numerical method will converge just as fast as any "exact" solution. 
I'm thinking of "projecting" points onto the x-axis (q=0) using an
approximate focal point (p=0) mostly as a technique for generating the
initial approximations.  I believe that for a "spherical" lens,
paraxial rays do have a fairly tight "focus"; it's the rays which fall
near |x| = r which are responsible for the majority of "spherical
aberration".  I think there is a critical angle:

  |y/x| = n/m

at which refraction changes to reflection that the math "mirrors" the
physical limitation in your model.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 07 Jan 2005 07:27 PST
 
Hi, John:

Here's my approach, still to be coded and tested.  I'm assuming n/m <
1 is the case of real interest.

Given an array of (p,q) points:

   . . . . . .
   . . . . . .
   . . . . . .

we solve (assuming these lie on one side of the y-axis, since
solutions are symmetric about that line) for the extreme corners and
midpoint, finding an approximate common focal point where the three
lines thru respective (x,y) and (p,q) nearly intersect.

Then use this approximate focal point (p',q') to produce initial
approximations (x',y') for each additional solution desired.  Two
Newton iterations should suffice to give all necessary accuracy; three
should converge to machine precision (digits of precision will double
with each iteration), assuming a fairly tight angular distribution
reasonably close to the central axis.

See my Request for Clarification above about the realistic size of the
angular range for the block of (p,q) points.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 09 Jan 2005 16:21 PST
 
Hi, John:

A couple of quick notes:

I found a way to show that the Snell's Law and Fermat's principle
formulations are equivalent for q < y.

I coded and tested Newton iterations for both versions.  They have
about the same operation count, and for n/m < 1, both seem to converge
satisfactorily within 2 or 3 iterations if given starting values of
only modest accuracy.

Based on the limited amount of testing done so far, the method based
on Fermat's principle seems to converge a little faster.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 20 Jan 2005 14:20 PST
 
The polynomial for x or y turns out to be sextic (of sixth degree),
but it seems reasonable that by choosing a different unknown the
problem may be restated in a form of lesser degree.  One suggestion of
this has already been seen when, by restricting q = 0, one obtains a
formulation in which x satisfies a cubic polynomial (whose
coefficients depend algebraically on p) and in which p satisfies a
quadratic polynomial (whose coefficients depend algebraically on x).

I'm especially interested in a formulation in terms of the y-intercept
(0,-f) of the line passing through (p,q), rather than in terms of
point (x,y) where the line intercepts the (unit) circle.  It seems to
me that the left/right symmetry of the problem may produce a reduction
in the degree of the problem.

For example, suppose that the coordinate -f of the y-intercept
satisfies an even polynomial of degree 6 (whose coefficients depend
algebraically on p,q).  The square of f must then satisfy a
corresponding cubic equation, so an explicit formula for f (in terms
of root extractions and field operations) could in principle be
written down.

If k = n/m, the dependence of f (which I've written with a negative
sign above because the light path incident on the upper semi-circle
will have y-intercept in the lower half-plane) on x,y can be stated:
                __________
        ky + k? 1 - kx
  f  = --------------------
             (1 - k)

This gives values of f that range from:
         ____
  f = k/?1-k   when (x,y) = (1,0)

to:

  f = k/(1-k)   when (x,y) = (0,1)

Even if this line of investigation doesn't produce the sought after
closed-form solution, it may still be useful in providing an
"approximate focal point".  That is, if we take an approximate
y-intercept (0,-f) and project through (p,q) to the point on the unit
circle (x,y), we may get a very satisfactory starting value for a
Newton iteration.

regards, mathtalk-ga
Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 27 Jan 2005 22:00 PST
 
Mathtalk,

I agree with your "gut" feeling that if I were just smart enough to
know how to reformulate the problem, I could get an algebriac
solution.

I'm thinking that your "y-intercept" idea is a good one.  A coordinate
system based on a point on the y-axis and an angle from vertical would
seem to exactly encapsulate the free variables in this problem
assuming of course that n/m < 1 (i.e. all things bend to the y-axis).

John.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  


Google Home - Answers FAQ - Terms of Service - Privacy Policy