Google Answers: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f

View Question

Q: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f ( Answered, 15 Comments )

Question

Subject: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
Category: Science > Math
Asked by: johnswanstone-ga
List Price: $100.00

Posted: 31 Dec 2004 22:30 PST
Expires: 30 Jan 2005 22:30 PST
Question ID: 449910

An algebraic solution is preferred as I'm going to have to solve this problem hundreds of thousands of times with different coefficients. Here are the coefficients: a = -4n^4(p^2+q^2) b = 4n^2p(m^2p^2+n^2p^2+m^2q^2+n^2q^2+n^2r^2) c = -m^4p^4-n^4p^4-2m^2n^2p^4-m^4q^4-n^4q^4+ 2m^2n^2q^4-2m^4p^2q^2-2n^4p^2q^2-n^4r^4- 2n^4p^2r^2-2m^2n^2p^2r^2+2n^4q^2r^2+ 2m^2n^2q^2r^2 d = -4m^2n^2pr^2(p^2+2q^2) e = 2m^2p^2r^2(m^2p^2+n^2p^2+m^2q^2+n^2q^2+n^2r^2) f = -m^4p^4r^4 where m,n,p,q,r are arbitrary real number constants subject to the following constraints: n>0 m>0 r>0 -r<p<r -r<q<r p^2+q^2<r^2 Each time I solve this p and q will be different. r,n,m will change only every hundred thousand solutions or so. This is the reason I'm looking for an exact solution. Numerical methods will be too slow for the number of times I need to solve this. Alternately, each n,m,p,q,r I choose will reduce the coefficients to specific numbers. Then in my symbolic math software, I can use a "factor" command to solve each particular polynomial. I don't know what's going on behind the scenes, or if it is fast enough to do a hundred thousand of them in say 3 to 5 seconds on a modern PC, but if it is, then a description and implementation of the ?factor? algorithm in C or C++ would be acceptable as a solution to this problem. For those of you who are curious I will describe the origin of this problem: The problem I'm working on involves performing refraction distortion correction of the path of a laser beam as it passes through a human eye. The air/eye boundary is modeled in my problem as a hemisphere. In actuality it will be an arc of some radius and some extent which will be determined by the data. see: http://answers.google.com/answers/threadview?id=449830 The device can sample data at 256 different points (give or take) in time for each beam shot into the eye (using optical coherence tomography). This process is repeated 512 times (give or take), each time with the laser beam translated by a few 10's of micro-meters across the eye. So I have this grid of data say 256 points deep and 512 points across. My job is create an image of the interior of the eye from this data. To do this I place a rectangular grid across the eye and then for each point <p,q> in the grid I ask what is the <x,y> value on the surface of the eye that would cause the laser beam to reflect into <p,q>. If I get lucky this will correspond to a single column in my data grid. Now using the distance between <p,q> and <x,y> I can find which data sample that corresponds to that <p,q> in the eye. If I'm unlucky (which I will be most of the time) I can find the pair of data columns that bracket the <x,y> point and interpolate the number I'm looking for from there. Ok, now here?s the analysis I?ve done to date: Let?s start with a hemisphere of radius r centered at the origin and extending from ?r to r on the x-axis and from 0 to r on the y-axis. Let?s assume that everything outside the hemisphere has an index of refraction n and everything inside has an index of refraction m. Let?s pick any point <x,y> on the hemisphere and any point <p,q> inside the hemisphere. The time it takes light to travel from the point <x,r> to <x,y> is: n/c * (r-y) The time it takes light to travel from <x,y> to <p,q> is: m/c * sqrt((x-p)^2 + (y-q)^2)) And the total time is the sum of these two terms. Knowing that x^2+y^2=r^2 let?s us substitute and eliminate x in the prior equation. Now applying Fermat?s Principle, which is dt/dy = 0 let?s us differentiate the substituted equation of time with respect to y and set it?s value to 0. A bit of algebra leads us to the following equation: m(py/x-q) = nsqrt(r^2-2px+p^2-2qy+q^2) Squaring both sides and a bit of algebra leads to: [(m^2p^2)/n^2]* y^2/x^2 ? [2(m^2pq)/n^2] y/x + [2p] x + [2q] y + [(m^2*q^2)/n^2] - p^2 - q^2 - r^2 = 0 We still know that: x^2 + y^2 = r^2 So we have 2 equation and 2 unknowns. Substituting y = sqrt(r^2 ? x^2) and squaring both sides again and quite a bit of algebra leads to the original 6th degree polynomial. For those you who are adventurous: Continuing with the original problem description ? Next I have to repeat this process for each boundary within the eye. The problem is that all boundaries other than the first need to be represented by splines (which are piece-wise polynomials with the property of being differential in the 2nd degree at the piece-wise boundaries). My thoughts are that I could solve for <x,y> on each polynomial similarly to what I?ve describe above for the hemisphere and exclude the solution if it is not in the "piece" of the spline that polynomial defines. Any thoughts? Suggestions? Ok. Thanks for getting this far. I hope to hear from you soon. J.
Clarification of Question by johnswanstone-ga on 31 Dec 2004 22:33 PST Where I say: "I ask what is the <x,y> value on the surface of the eye that would cause the laser beam to reflect into <p,q>." I really mean: "I ask what is the <x,y> value on the surface of the eye that would cause the laser beam to refract into <p,q>."
Request for Question Clarification by mathtalk-ga on 01 Jan 2005 14:05 PST Hi, johnswanstone-ga: A sixth degree polynomial has six roots in the complex plane counting multiplicity. However your application would suggest you are interested in only one (real) solution for x. If this is the case, how do you distinguish the one real root which is of interest from those which are "artifacts"? From your description the parameters p,q vary constantly and parameters r,m,n vary seldom. If the parameters p,q vary by slight amounts from a previous "setting", then it is especially attractive to use a numerical solution using the prior solution as an initial approximation. I believe that if you are only interested in a single real solution which can be succinctly identified (per my first paragraph) then a numerical solution will likely outperform any "exact" symbolic solution in terms of speed. Indeed, because your problem appears to have five independent parameters, it is unlikely that your general case is more special than the general sixth-order polynomial, whose "symbolic" solution requires operations that provide solutions to these problems: x^6 + x = A x^5 + x = B along with solutions to dozens of polynomial "subproblems". Again, your description of the underlying application makes it sound as though for given values of r,m,n, you will vary p,q within the circular domain: p^2 + q^2 < r^2 to solve for corresponding x,y values. The physical interpretation suggests that x,y will vary continuously with p,q (while r,m,n are fixed), and thus that with the possible exception of the first solution, an excellent initial "guess" for the desired solution is available as the result of a previous problem. In the absence of a "multiple" root, a numerical method such as Newton's will converge very rapidly starting from a sufficiently accurate initial approximation. One further request for clarification: Is it possible that degree of the polynomial formulation would be lower if both x,y were retained in the equation(s)? However if you are truly interested in knowing just what an "exact" solution to the stated problem looks like, then I will be happy to stick with that approach. regards, mathtalk-ga
Clarification of Question by johnswanstone-ga on 01 Jan 2005 15:38 PST mathtalk, Thanks for your reply. It is true that I'm only looking for a single real solution to this problem. The rest of the solutions are artifacts caused by the "squaring of both sides" operations I did (at least this is what I think it is caused by). I agree that <x,y> will be continious with changing <p,q> and I agree with your analysis that a numerical solution could be faster for finding <p+delta, q+delta> from a given <p,q> <x,y> solution. The problem is ... how do you find the initial solution? And I'd really like to be able to compare the numerical to the algebriac solution. I also would like to know how to find the algebriac solution for my own education. Discovering which root is the "correct" one for my problem is simple enough. Just see which root value makes following equation true: m(py/x-q) = nsqrt(r^2-2px+p^2-2q*y+q^2) For example, setting: n=1 m=1.5 r=1 q=0.1 p=0.5 yields the following four real roots: x=-0.55208096577064258 and y=0.83379050560302892 x=-0.44209591096428252 and y=0.89696778398594745 x=0.68024533659318476 and y=0.73298450327631404 x=0.75808380947988874 and y=0.65215714195618513 plugging them into the above equation shows that x=0.68024533659318476 is the one that I'm really looking for. All the others don't balance the equation (which I find to be pretty strange. Maybe you can expain it). John.
Clarification of Question by johnswanstone-ga on 03 Jan 2005 06:17 PST mathtalk, Are you working on this? Is there any progress to report? Should I look elsewhere for an answer? hello, is there anybody out there? john.
Request for Question Clarification by mathtalk-ga on 03 Jan 2005 08:59 PST Hi, johnswanstone-ga: I am working on your problem, specifically to find solutions of the nonpolynomial system directly. Looking at the "original" formulation: m(py/x-q) = nsqrt(r^2-2px+p^2-2q*y+q^2) x^2 + y^2 = r^2 I'm wondering what "solution" you desired for p = q = 0 ? Is this an exceptional case, for which no solution is possible? If I'm parsing the equation correctly, it amounts to 0 = nr. However for certain values of m,n there could be a "limiting" solution x = 0, y = r. regards, mathtalk-ga
Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:22 PST Hi, johnswanstone-ga: When you wrote, "The equation of a circle in terms of an angle, a: x = rsin(a) y = rsin(b)" did you mean x = rcos(a), y = rsin(a), so that x^2 + y^2 = r^2 ? regards, mathtalk-ga
Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:27 PST Never mind, I see from the substitution step that you are taking: x = rsin(a) y = rcos(a) which makes your angle a the "complement" of my angle t. regards, mathtalk-ga
Request for Question Clarification by mathtalk-ga on 04 Jan 2005 19:50 PST Hi, John: I'm having some doubt about how you've applied Fermat's principle. We can agree that to find the path between two fixed points, Fermat's principle selects the minimum travel time. However in your application the path from (x,r) to (p,q) only involves one fixed point, namely (p,q). The point (x,r) depends on the variable y. I'm trying to reconcile the version obtained from Snell's Law with the "original" formulation. I'll feel more confident in coding the solution if I know these are equivalent. regards, mathtalk-ga
Clarification of Question by johnswanstone-ga on 05 Jan 2005 12:08 PST Mathtalk, Given a point <p,q> and applying Fermat's Principle leads to minimization of a function in time which leads to the roots of a 6th order polynomial expressed in either x or y (each having different coefficients of course). Once either x or y is found the other can be immediatly calculated by x^2 + y^2 = r^2 and once y is known <r-y> is also known since r is a constant (i.e. input variable). In my example: n=1 m=1.5 r=1 q=0.1 p=0.5 the solution is: x=0.68024533659318476 and y=0.73298450327631404 which can be verified by: solving for a using: x = r * sin(a) which gives: a = 0.74809729181091122 and b using: b = arcsin( (n/m) * sin(a) ) which gives: b = 0.47068498620621618 and finally verifying this equation is true: (x-p) = tan(a - b) * (y-q) which gives: 0.18024533659318487 = 0.18024533659318503 John.
Request for Question Clarification by mathtalk-ga on 07 Jan 2005 06:24 PST I see that for your sample problem the Fermat principle and Snell's Law approach agree on the solution, so I'm assuming that there's an equivalence there. But I've been unable to prove it. On another practical note, what is the angular range of the set of (p,q) points you will need to solve "at one blow"? I'm guessing that a fully dilated pupil/iris will be less than a 30 degree aperture, but that from your description your field of illumination is intended to be quite a bit smaller than this. So the question is whether we are trying to quickly find 256x512 solutions whose angular range is more like 10 degrees or like 1 degree? Or even less? regards, mathtalk-ga
Clarification of Question by johnswanstone-ga on 07 Jan 2005 13:28 PST Mathtalk, I'm not really sure of what angular field my data encompasses since I haven't actually fit a circle to it and measured the angle. But if you're curious, I can show you a raw data image. http://www.frontiernet.net/~jfss/JAOSnormal995bk.%20%20%202.BMP John.
Request for Question Clarification by mathtalk-ga on 08 Jan 2005 08:25 PST Hi, John: In the image shown the angular measurement of the outer circular edge is about 67.5 degrees. My question has to do with the "patch" of p,q coordinates for which you want a split-second computation. Since you described these points as being a 256x512 point array inside the radius of the outer circle, I thought they were perhaps a localized part of the interior, not extending over the entire image. regards, mathtalk-ga
Clarification of Question by johnswanstone-ga on 09 Jan 2005 21:12 PST Mathtalk, Well, My job is to perform refraction correction on the source image. So the first thing I need to do is correct the refraction distortion caused by the initial air/eye boundary. This will correct all the data inside the cornea (assuming the eye has a uniform index of refraction) and then perform the correction again for the back surface of the cornea using the results of the first correction as input. The 256x512 "patch" was a rough approximation for the number of pixels in the image I need to correct. I understand that since the refraction of the light rays will vary continuously with the distance from the center of the "circle" that solutions for <p,q> points that are "close" to each other will also be "close" to each other and that provides for very fast convergence of a numerical root finding method. What I don't know how to do is find the initial solution to start the whole thing off. How do you plan on doing that? John.
Request for Question Clarification by mathtalk-ga on 10 Jan 2005 20:32 PST Hi, John: In my experiments convergence was quick even if x = 0 was used as the starting approximation. However I don't recommend that because it is possible, especially if q < 0, for the first step of the method to "overshoot" the boundary x = 1, so it's prudent to be a little more careful in arranging the computation. Here's one approach that I think will work nicely. Take a closely spaced "horizontal" line of points (p,q) sharing a common value of q. Start with the point where \|p\| is closest to 0 and work outward, using x = p for the first point's initial guess and the converged result of each point as the starting value for the next point. Since the exact solution is x = 0 when p = 0, x = p is an excellent starting value close to the y-axis. Alternatively one could take a "vertical" line of points and start from the top (closest to the circle), also using x = p as the first point's initial guess. Since the refraction near the "lens surface" is slight, again this is an excellent approximation. From what I've tried I'd say the Fermat principle iteration works best when your intial guesses underestimate x, and the Snell's Law iteration works best when the initial guesses overestimate x. The suggestions above for x = p are of course underestimates (assuming n/m < 1). The only thing I'd be worried about in recommending solely relying on the Fermat formulation is if for some cases you need to work with \|x\| close to 1. However, given the application parameters discussed earlier (an angular range probably within +45 to -45 degrees), I don't think this is really an issue. My rough operation count for each Newton iteration is: 10 multiplications 11 additions 3 divisions 2 square roots With a bit of thought one or two of the multiplications per iteration can probably be "amortized" over the convergence sequence, but since this is at worst two or three iterations with starting values of modest accuracy, there's not a whole lot to save. If you need to solve this for every pixel more or less, even a single Newton iteration at each point is probably suboptimal. The Newton iterations can be done at more widely spaced points, filling in between them with either a linear or an inverse quadratic interpolation. If your hardware platform has a floating point unit, the ultimate performance will be eked out by "coding to the metal" and assigning floating point registers to hold specific intermediate computations, so that all or most of them involve only register operands. regards, mathtalk-ga
Request for Question Clarification by mathtalk-ga on 21 Jan 2005 20:35 PST Hi, johnswanstone-ga: Using MAXIMA and PARI/GP, I calculated the Galois group of the sixth degree polynomial for x which results from the parameters: p = 0.5, q = 0.1, n = 1, m = 1.5, r = 1 as in the example you worked with previously. After scaling one obtains a monic irreducible polynomial: x^6 - 738x^5 + 135226x^4 + 10513152x^3 - 3735673344x^2 + 24637221273600 whose roots are an integer multiple 208 times the roots of your original problem. The Galois group is S_6, the full symmetric group on the six roots. Consequently the root x cannot be expressed in terms of only roots and rational expressions based on the integers. On a more positive note, I can provide a numerical method which converges quickly and robustly for every (p,q) in the upper semi-circle: p^2 + q^2 < r^2 q > 0 implemented in C (for example), subject to the assumption n/m < 1. regards, mathtalk-ga

Answer

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
Answered By: mathtalk-ga on 30 Jan 2005 22:10 PST

Hi, John: As you've outlined, a vertical beam of light strikes a point on an upper semi-circle: x� + y� = r�, y > 0 where the index of refraction changes from n above the semi-circle to m below. According to Snell's Law, the light will deflect from a straight path toward the normal line when n < m and away from it when n > m. Given your application of correcting images of the eye, as light passes from air (index of refraction ~ 1) to the cornea, lens, and aqueous media beyond (index of refraction > 1), we will primarily treat the case where n < m. We will refer to ratio k = n/m rather than to the separate indices. By choosing units of length the radius of the semi-circle can be assumed r = 1. Applying Fermat's principle you obtained for point (p,q) within the upper semi-circle: py - qx ------- = k SQRT((p-x)� + (q-y)�) x which we wish to solve for (x,y). Polynomial equation and extra roots ----------------------------------- In deriving a polynomial from this and the equation x� + y� = 1, a number of "artifact" roots are introduced, ie. values x satisfying that polynomial formulation but not the original one. These extra roots are added by steps that multiply both sides by nonconstant polynomials or that square both sides of an equation. For example the equation x = 1 becomes x� = 1 when both sides are squared and acquires a "false root" x = -1 in the process. Nonetheless the desired root x is among those which satisfy: ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + f = 0 where: a = 4(p� + q�)(k�)� b = -4((k� + 1)(p� + q� + 1) - 1)k�p c = (k� - 1)(q�)� + 2(((k�)� + 1)(p� - 1) - (k� - 1))q� + ((k� + 1)p�)� + 2(k� + 1)k�p� + (k�)� d = 4k�p(p� + 2q�) e = -2((k� + 1)(p� + q� + 1) - 1)p� f = (p�)� Note that while both odd and even powers of p appear among these coefficients, only even powers of q are present. This explains some of the artifact roots. Where a solution x for (p,q) in the upper semi-circle exists, another solution x for mirror-image point (p,-q) in the lower semi-circle may also exist. This accounts for an extra root which exists for the sample data: p = 0.5 q = 0.1 k = (1.0/1.5) = 2/3 While the refracted path through the point on the unit circle: x = 0.68024533659318476 y = 0.73298450327631404 passes through the point (p,q), this artifact solution: x = 0.75808380947988874 y = 0.65215714195618513 passes through (p,-q). Similarly k only appears in even powers in the polynomial, and thus the other two real "solutions" actually satisfy for (p,q) or (p,-q): py - qx ------- = -k SQRT((p-x)� + (q-y)�) x It is easy to see that these last artifact roots were introduced by squaring both sides of the Fermat's principle formulation. Impossibility of solution by radicals ------------------------------------- Earlier in an RFC I reported the result of using MAXIMA and PARI/GP to determine if the particular sixth-degree polynomials produced in this application can be solved by radicals, ie. by algebraic formulas that require roots and fields operations acting on the coefficients of the polynomial. The result was that it is impossible to do so for the particular case p = 0.5, q = 0.1, k = 2/3 which you had used, and therefore impossible in general for these sixth-degree polynomials. However we did note that for the case q = 0, the resulting polynomial for x is only cubic and thus can be solved by radicals. Snell's Law formulation ----------------------- A third formulation of the problem is possible using Snell's Law. By a bit of trigonometry: p-x ----- = tan(a - b) q-y where a is the angle of vertical to the normal on the circle at (x,y) and b is the angle of refracted path to the same normal. Snell's Law says that: n sin(a) = m * sin(b) and from the relations sin(a) = x, cos(a) = y, plus: sin(b) = kx cos(b) = SQRT(1 - k�x�) one can rewrite tan(a - b) = sin(a - b)/cos(a - b) in terms of (x,y): p - x (1 - k�)x ----- = --------------------- q - y y + SQRT(k� - (k�x)�) If we think about both sides as a difference quotient (slope formula) involving (x,y) and one other point, this can be interpreted as saying that (p,q) lies on the straight line connecting the two points: (x,y) and (k�x, -SQRT(k� - (k�x)�)) Note that while (x,y) lies on the upper semi-circle, the latter point falls into the lower half-plane because of the negative y-coordinate. It's possible then to determine for (x,y) on the upper semi-circle the y-intercept of the refracted path is (0,-f), where: f = ( k�y + SQRT(k� - (k�x)�) )/(1 - k�) This "focal" f has the limit of k/(1-k) as points (x,y) tend to (0,1) and another limit k/SQRT(1 - k�) as (x,y) tends to (1,0) or (-1,0). Any point (x,y) on the upper semi-circle will have its refracted path through (0,-f) for some f intermediate between these limits. Newton iterations ----------------- Given a differentiable function F(x) and an initial guess x for a root of F(x) = 0, we compute successive Newton approximations: x := x - F(x)/F'(x) Any of the three formulations could be used as the basis for a Newton iteration. The polynomial form would make it easy in some respects to code the computation of the polynomial and its derivative together in a common loop over the coefficients. However the polynomial form has two drawbacks in this application. It has artifact roots that we'd need to be on guard against, and its coefficients require a greater amount of calculation from (p,q) each time they change than other formulations. The subject of starting values will be taken up in the next section. For the moment we will concern ourselves with the calculations needed for the Newton iteration itself. The Fermat principle can be reformulated in a variety of ways, even into the Snell's law form (and vice versa). So we don't claim that the particular choice of equation made here is a necessary one. Clearing denominators and moving all terms to one side produces: F(x) = py - (q + kd)x = 0 where d = SQRT((p-x)� + (q-y)�). Differentiating with respect to x: F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q - y)y')/d. Alternatively we can use the Snell's law formulation in a similar way, clearing the denominators and grouping all terms on one side: G(x) = (p - x)(y + z) - (1 - k�)(q - y)x = 0 where z = SQRT(k� - (k�x)�). Differentiating with respect to x: G'(x) = (p - x)(y' + z') - (y + z) - (1 - k�)(q - y - y'x) where y' is as above and z' = -(k�)�x/z. Starting values --------------- Once a Newton iteration is available for a problem that requires modest computation for an exact derivative, as here, the main difficulty left is to settle upon initial guesses that are good enough to allow the asymptotic quadratic-convergence to quickly appear. Here we have a couple of reasonable starting values that bracket the true root which are nearly "free" in terms of computation required. The point (p,SQRT(1 - p�)) would lie on the straight path through (p,q) if k were 1, ie. if there were no refraction. On the other hand ( p/SQRT(p�+q�), q/SQRT(p�+q�) ) would lie on the line through (p,q) and the origin, which is the limiting path (the normal line) if k were 0. It is easy to show that the functions F(x) and G(x) both have a change in sign between these two points on the upper semi-circle, thus proving that a root exists between them in each case. A simple linear interpolation between these values may be the best choice, certainly in the case that (p,q) is close to either the semi-circle or to the y-axis. One can construct narrower bounds using a different pair of lines passing through (p,q), at a cost of some additional computation. For example, if we take the limiting points (0,-f) found earlier: f = k/(1 - k) and k/SQRT(1 - k�) then lines from these two y-intercepts through (p,q) will intersect the unit semi-circle in a pair of points that more tightly bracket the root we want. Another thought here is that instead of using linear interpolation between the respective values of F(x) or G(x) at such a bracketing pair to obtain an initial guess for the Newton iteration, we can fairly easily find the intersection of the two refracted paths from those two points in the lower half-plane, then project from that "false focus" through (p,q) to a "guess" on the unit semi-circle. regards, mathtalk-ga
Request for Answer Clarification by johnswanstone-ga on 01 Mar 2005 14:24 PST mathtalk, You said: On a more positive note, I can provide a numerical method which converges quickly and robustly for every (p,q) in the upper semi-circle: p^2 + q^2 < r^2 q > 0 implemented in C (for example), subject to the assumption n/m < 1. regards, mathtalk-ga So where is this code? John.
Clarification of Answer by mathtalk-ga on 02 Mar 2005 20:46 PST Hi, John: Quite right, I promised C code and I'll deliver it. regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 02 Mar 2005 22:27 PST For starters here's some basic C code to run the Newton iterations for both the Fermat and Snell's Law formulations. I picked a fixed error tolerance, but if it were desirable we could make that a parameter for both routines. When p = 0, the explicit answer x = 0 is known, so we simply return that. For both routines p,q are the coordinates of the point inside the circle, normalized by dividing them by the circle's radius as discussed previously. In the case of Fermat Principle formulation, k = n/m. For Snell's Law we supply k2 as k� for convenience. In both cases parameter x is a suitable initial guess. Next I provide a test harness that sets initial values and determines the speed with which each formulation can solve the equation. #include <math.h> double newton_fermat( double p, double q, double k, double x) { double y, dydx, d, dddx, f, dfdx, xd; double tol = 1.e-7; /* tolerance could be a parameter / if (p == 0.) return 0.0; / F(x) = py - (q + kd)x = 0 where y = SQRT(1 - x�) and d = SQRT((p-x)� + (q-y)�) F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q-y)y')/d / while(1) { y = sqrt(1. - xx); d = sqrt((p-x)(p-x) + (q-y)(q-y)); f = py - (q + kd)x; dydx = -x/y; dddx = -(p - x + (q-y)dydx)/d; dfdx = pdydx - (q + kd) - kxdddx; xd = -f/dfdx; if (fabs(xd) < tol) return x + xd; x += xd; } } double newton_snell( double p, double q, double k2, double x) { double y, dydx, z, dzdx, g, dgdx, xd; double tol = 1.e-7; /* tolerance could be a parameter / if (p == 0.) return 0.0; / G(x) = (p-x)(y+z) - (1-k�)(q-y)x = 0 where y = SQRT(1 - x�) and z = SQRT(k� - (k�x)�) G'(x) = (p-x)(y'+z') - (y+z) - (1-k�)(q - y - y'x) where y' = -x/y and z' = -(k�)�x/z / while(1) { y = sqrt(1. - xx); z = sqrt(k2 - (k2x)(k2x)); g = (p-x)(y+z) - (1-k2)(q-y)x; dydx = -x/y; dzdx = -k2k2x/z; dgdx = (p-x)(dydx + dzdx) - (y+z) - (1-k2)(q - y - dydx*x); xd = -g/dgdx; if (fabs(xd) < tol) return x + xd; x += xd; } } regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:37 PST Before writing the C code to construct an initial guess for Newton's method, let me present the formulas for some of the underlying concepts. We kicked around the idea before of an "approximate" focus as light paths refracted by the upper semi-circle intersect the y-axis. Using the Snell's Law formulation, it is possible to express this y-intercept (0,-f) either in terms of x or y: (p - x)(y + SQRT(k� - (k�x)�)) - (1 - k�)(q - y)x = 0 Since this equation holds for every point (p,q) on the refracted path from (x,y), we can let p approach zero and q approach -f, and after factoring out x: (y + SQRT(k� - (k�x)�)) + (1 - k�)(f + y) = 0 A bit of algebra yields: k�y + SQRT(k� - (k�x)�) f = ------------------------ 1 - k� It is convenient to introduce constant C = 1/k� - 1, allowing us to write: f = (y + SQRT(y� + C))/C Furthermore this relation can be inverted, giving y in terms of f: y = (Cf - 1/f)/2 We should bear in mind that y,f > 0 in the context of our problem (having chosen (0,-f) as the notation for the y-intercept because it falls below the x-axis). Also the sign of x = � SQRT(1 - y�) is chosen to agree with the sign of p. Given that we can easily translate between locating a point (x,y) on the upper semi-circle and the corresponding "focus" (0,-f) on the lower y-axis, a more flexible approach to finding an initial guess may be sought. First note that if we have points (x_a,y_a) and (x_b,y_b) that "bracket" the solution (x,y) corresponding to (p,q) on the same side of the y-axis as (p,q), then their "foci" (0,-f_a) & (0,-f_b) will bracket the "true" intercept (0,-f). That is: if p > 0, 0 < x_a < x < x_b <==> 0 < f_b < f < f_a Earlier we discussed one choice of "bracketing" points: (x_a,y_a) = (p, SQRT(1 - p�)) (x_b,y_b) = (p/SQRT(p�+q�),q/SQRT(p�+q�)) These are respectively the results of drawing a vertical line and a radial line through (p,q) and finding the intersection with the upper semi-circle. Alternatively we can find where the line from some point (0,-s) on the lower y-axis through (p,q) intersects the upper semi-circle. If s = f, we get the exact solution (x,y) in this way. Again some formulas are needed. Since the slope of this line is m = (q+s)/p, we have to solve: y = mx - s and x� + y� = 1 being careful to distinguish which quadratic root is on the upper semi-circle. The answer turns out to be: x = (ms � SQRT(m� - s� + 1))/(m� + 1) with the sign � chosen to agree with p. Instead of the vertical and radial lines, which would correspond to choosing s to be +oo or zero, we might use: s_0 = k/(1-k) and s_1 = k/SQRT(1 - k�) which represent the limiting extremes of the y-intercepts. I.e., refracted paths which start close to the y-axis, x = 0, will intersect the y-axis slightly above (0,-s_0), while paths that start nearly at the endpoints of the upper semi-circle are refracted more sharply and intersect slightly below (0,-s_1). For the sake of illustration, here is how one might construct an improved initial guess using these ideas. The Snell's Law formulation can be restated: G(x) = g(s) = p(s+y) - (s+q)x = 0 where y = (Cs - 1/s) and x = sign(p) SQRT(1 - y�). Interpolating between the values (s_0,g(s_0)) and (s_1,g(s_1)) gives an improved approximation s for "exact" intercept f, and locating the intersection of the line (0,-s) through (p,q) with the upper semi-circle gives an improved initial guess for (x,y). My goal to calculate an improved initial guess without expending more effort than roughly that for one Newton iteration, so that we jump into the mode of its quadratic convergence rate as soon as practical. regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:57 PST Oops, I left out a division by 2 in the penultimate paragraph of my last note. A line there should have read: where y = (Cs - 1/s)/2 and x = sign(p) SQRT(1 - y�)... instead of: where y = (Cs - 1/s) and x = sign(p) SQRT(1 - y�). -- mathtalk-ga

Comments

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 09:14 PST

Progress Report

1. The intuitive notion that (x,y) on the circle of radius r centered
at the origin will depend continuously on the parameters (p,q), a
point in the interior of the disk of that radius, leads to a mapping
of the open disk into the circle.

Such a mapping cannot be both continuous and onto because of
topological considerations.

2. Two simplifications that can be readily introduced are to eliminate
parameter r (by scaling x,y,p,q accordingly) and to combine m,n into a
ratio:

  P = p/r,  Q = q/r,  X = x/r,  Y = y/r

Then X = cos(t), Y = sin(t) and:

  P*cot(t) - Q  = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 )

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 13:11 PST

Ooops, I got the tangent confused with the cotangent in that last
equation,  and it should read:

  P*tan(t) - Q  = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 )

since y/x = sin(t)/cos(t) = tan(t).

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 03 Jan 2005 17:00 PST

Mathtalk,

The analysis I've posted here deals with minimizing the time it takes
a ray of light to travel from (x,r) to (p,q) according to Fermat's
Principle and it leads to 6th order polynomials no matter which
variable you solve for; x or y.

Another analysis would be a geometrical one as follows:

The equation of a circle in terms of an angle, a:

x = r*sin(a)
y = r*sin(b)

Using the rule of similar angles of a parallel lines we know that a is
also the angle that a ray of light traveling parallel to the y-axis
will strike the surface of the semicircle.

Using Snell's law: n*sin(a) = m*sin(b), where b will be the angle the
light will refract at the boundary of the semicircle.

Using the two point equation of a line through <x,y> and <p,q>: p-x = s * (q-y)
we need to find the slope s which in this case will be express as
run/rise.  Using geometrical reasoning it is known that s = tan(a -
b).

So dividing the line equation by (q-y) gives:

p-x
--- = s
q-y

Substituting for 'x', 'y', and 's' gives:

p-r*sin(a)
---------- = tan(a - b)
q-r*cos(a)

solving Snell's law for b:

b = arcsin(sin(a)/n)

a final substitution gives:

p-r*sin(a)
---------- = tan(a - arcsin(sin(a)/n)
q-r*cos(a)

Which if solved for 'a' would also give me the solution I'm looking for.

If it is impossible to solve this equation or the polynomial
analytically then I'd accept a solution which describes and implements
the "solve by factoring" algorithm used in symbolic algebra packages
such as Mathmatica, Maple, Maxima, or MacSym.  When I execute this
function the computer comes back with all the roots of the polynomial
including the complex roots.  What I'd do then is use this to find the
root that is the solution for my initial <p,q> point and then use the
computed <x,y> as the initial guess for the next <p+delta, q+delta> as
per your earlier suggestion.

I would also accept an analysis of this equation or the polynomial
similar to the one in this paper on page 19
(http://www.math.uab.edu/cl/cl1/details.ps) which guarantees that a
numerical method will always converge to the right solution given a
known starting value.

Let me know if there's anything more I can do to help you.

Thanks,

John.

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 03 Jan 2005 17:19 PST

Mathtalk,

Addressing your question about what happens when <p,q> = <0,0>

Under this condition, the time it takes light to travel that path is reduced to:

t = n/c * (r-y) + m/c * r

My original analysis minimized this function by taking the derivitave
with respect to y and setting it to zero, then solving for y.  In this
case that approach doesn't work since taking the derivitive leads to
dt/dy = -n/c which is a constant.  (Remember c here represents the
speed of light in air).  I don't know how to minimize this function
... maybe you do.

John.

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 03 Jan 2005 18:16 PST

To minimize:

  t = n/c * (r-y) + m/c * r

with respect to y on the interval [-r,+r], take y = r.

Here the minimum occurs at a boundary point, rather than at an
interior "critical" point such as you might find by setting the
derivative to zero.

I have a minor correction to your alternative derivation; regarding Snell's Law:

  n*sin(a) = m*sin(b)

we obtain:

  b = arcsin( (n/m) * sin(a) )

This suggests a natural limitation on angle a (equivalent to my angle t), that:

  | sin(a) | <= m/n

in order that angle b is defined by this relation.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 04 Jan 2005 05:09 PST

Hi, John:

It appears that your problem has a left/right symmetry, so that:

  (x,y) the solution for (p,q)

implies:

  (-x,y) the solution for (-p,q)

but is not up/down symmetric in the corresponding sense.

However the formulation:

  m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2)

implies that p*(y/x) - q is nonnegative, so y/x >= q/p when p > 0.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 04 Jan 2005 09:46 PST

Mathtalk,

Going back to the original problem:

t = n/c * (r-y) + m/c * sqrt((x-p)^2+(y-q)^2)

and

x^2 + y^2 = r^2

you can either substitute for y or x.  

Substituting for x gives an equation in y:

t = n/c * (r-y) + m/c * sqrt((sqrt(-y^2+r^2)-p)^2+(-q+y)^2)

Substituing for y gives and equation in x:

t = n/c * (r-sqrt(-x^2+r^2)) + m/c * sqrt((x-p)^2+(sqrt(r^2-x^2)-q)^2)

My problem is to minimize either one (or both) of these equations. 
i.e. Take the derivitive with respect to the remaining variable, set
it equal to zero, and solve for the roots.  Doing it either way leads
to a 6th order polynomial.  Solving the equation for x leads to a
polynomial without the x^1 term.  Solving the equation for y leads to
a polynomial with all the terms.

This equation: 

m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2)

is simply the derivitive of the equation in x set to 0 with y
substituted back in.  This was my first attempt and it turned out to
be quite messy.  I mention this because it seems like your focusing in
on this equation and while solving it would answer my question, I
wanted to point out the other equations as well since they might be
easier to work with.

John.

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 04 Jan 2005 10:21 PST

Thanks, John.  I'm posting a mixture of Comments here, both for the
sake of giving a "progress" indicator and for the sake of your
feedback.

My understanding of the physics of the application is that, given a
point (p,q) inside a "crystal ball" you want to know where to shine a
light to illuminate that point.  Outside the ball and inside the ball
the light will travel in a straight line, but at the boundary (x,y)
the light beam is refracted due to the changing speed of light.

I can show a solution exists for p,q > 0, and by left/right symmetry
we can drop the restriction that p > 0.  However I'm expecting that
there are values q < 0 which have no solution.  The physical intuition
is that as a light beam approach the vertical edge x = r, there may be
points (p,q) "below" the path inside the ball that cannot be attained.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 05 Jan 2005 07:33 PST

I haven't been able to show directly the equivalence of the two
formulations, but they share a characteristic which I think can be
used to show the equivalence indirectly.  This is that (p,q) may be
replaced in either equation by any point inside the circle on a line
between (x,y) and (p,q).

The Snell's Law formulation:

  (p-x)/(q-y) = tan(a - b)

isolates all the dependence on (p,q) on one side of the equation, as
the angles a,b depend only on (x,y) and the ratio of refraction
indexes.

So adding a multiple of (p-x) to p and the same multiple of (q-y) to q
will leave the equation unchanged.  The situation is only a little
more complicated with the "original" formulation:

p*(y/x) - q = (n/m)*SQRT( (p-x)^2 + (q-y)^2 )

but the corresponding adjustments to (p,q) would proportionally change
both sides of the equation.

An indirect verification of equivalence can therefore be derived by
showing that the slopes of the lines as (p,q) approaches (x,y) are the
same for both equations, e.g. dp/dq = tan(a - b).

*   *   *   *   *   *   *   *   *   *   *   *   *   *   * 

Closer to your original request for an exact solution, I observe that
for the special case q = 0, a cubic equation can be formulated for x
with coefficients in p, r, and k = (n/m):

  p*(y/x) = k*SQRT( p^2 - 2px + r^2 )

After multiplying both sides by x and squaring:

  (p^2)*(r^2 - x^2) = (k^2)*(x^2)*(p^2 - 2px + r^2)

Therefore an "exact solution" for x can be given in this case using
Cardan's formula for roots of a cubic equation.  See for example:

http://answers.google.com/answers/threadview?id=433886

http://answers.google.com/answers/threadview?id=441538

*   *   *   *   *   *   *   *   *   *   *   *   *   *   * 

There would be a very simple exact solution if all of the lines (p,q)
to (x,y) described in the first section of this Comment were to
intersect at a common focus, say (0,f).  For then one has only to
intersect a line through (p,q) and (0,f) with the upper semi-circle
x^2 + y^2 = r^2 to find (x,y).

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 05 Jan 2005 11:17 PST

Mathtalk,

There would be a very simple exact solution if all of the lines (p,q)
to (x,y) described in the first section of this Comment were to
intersect at a common focus, say (0,f).  For then one has only to
intersect a line through (p,q) and (0,f) with the upper semi-circle
x^2 + y^2 = r^2 to find (x,y).

I agree that if all the lines (p,q) to (x,y) were to intersect at a
common point then the solution would be easy.  Unfortunatly this is
not the case for general values of n and m.

Your observation that "(p,q) may be replaced in either equation by any
point inside the circle on a line between (x,y) and (p,q)." could be
expoited to reduce the dimensionality of the problem if it could be
reformulated using this knowledge (something like a coordinate
transform or an equation using this property as the basis functions). 
This reduction in dimensionality may then
enable an algebriac solution.  

What do you think?

John.

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 05 Jan 2005 11:56 PST

Hi, John:

Yes, in a special case one can reduce the dimensionality to one (e.g q
= 0) and obtain a reduced degree polynomial problem with a "closed
form" solution.

However I still expect that with good initial approximations, the
numerical method will converge just as fast as any "exact" solution. 
I'm thinking of "projecting" points onto the x-axis (q=0) using an
approximate focal point (p=0) mostly as a technique for generating the
initial approximations.  I believe that for a "spherical" lens,
paraxial rays do have a fairly tight "focus"; it's the rays which fall
near |x| = r which are responsible for the majority of "spherical
aberration".  I think there is a critical angle:

  |y/x| = n/m

at which refraction changes to reflection that the math "mirrors" the
physical limitation in your model.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 07 Jan 2005 07:27 PST

Hi, John:

Here's my approach, still to be coded and tested.  I'm assuming n/m <
1 is the case of real interest.

Given an array of (p,q) points:

   . . . . . .
   . . . . . .
   . . . . . .

we solve (assuming these lie on one side of the y-axis, since
solutions are symmetric about that line) for the extreme corners and
midpoint, finding an approximate common focal point where the three
lines thru respective (x,y) and (p,q) nearly intersect.

Then use this approximate focal point (p',q') to produce initial
approximations (x',y') for each additional solution desired.  Two
Newton iterations should suffice to give all necessary accuracy; three
should converge to machine precision (digits of precision will double
with each iteration), assuming a fairly tight angular distribution
reasonably close to the central axis.

See my Request for Clarification above about the realistic size of the
angular range for the block of (p,q) points.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 09 Jan 2005 16:21 PST

Hi, John:

A couple of quick notes:

I found a way to show that the Snell's Law and Fermat's principle
formulations are equivalent for q < y.

I coded and tested Newton iterations for both versions.  They have
about the same operation count, and for n/m < 1, both seem to converge
satisfactorily within 2 or 3 iterations if given starting values of
only modest accuracy.

Based on the limited amount of testing done so far, the method based
on Fermat's principle seems to converge a little faster.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: mathtalk-ga on 20 Jan 2005 14:20 PST

The polynomial for x or y turns out to be sextic (of sixth degree),
but it seems reasonable that by choosing a different unknown the
problem may be restated in a form of lesser degree.  One suggestion of
this has already been seen when, by restricting q = 0, one obtains a
formulation in which x satisfies a cubic polynomial (whose
coefficients depend algebraically on p) and in which p satisfies a
quadratic polynomial (whose coefficients depend algebraically on x).

I'm especially interested in a formulation in terms of the y-intercept
(0,-f) of the line passing through (p,q), rather than in terms of
point (x,y) where the line intercepts the (unit) circle.  It seems to
me that the left/right symmetry of the problem may produce a reduction
in the degree of the problem.

For example, suppose that the coordinate -f of the y-intercept
satisfies an even polynomial of degree 6 (whose coefficients depend
algebraically on p,q).  The square of f must then satisfy a
corresponding cubic equation, so an explicit formula for f (in terms
of root extractions and field operations) could in principle be
written down.

If k = n/m, the dependence of f (which I've written with a negative
sign above because the light path incident on the upper semi-circle
will have y-intercept in the lower half-plane) on x,y can be stated:
                __________
        k�y + k? 1 - k�x�
  f  = --------------------
             (1 - k�)

This gives values of f that range from:
         ____
  f = k/?1-k�   when (x,y) = (1,0)

to:

  f = k/(1-k)   when (x,y) = (0,1)

Even if this line of investigation doesn't produce the sought after
closed-form solution, it may still be useful in providing an
"approximate focal point".  That is, if we take an approximate
y-intercept (0,-f) and project through (p,q) to the point on the unit
circle (x,y), we may get a very satisfactory starting value for a
Newton iteration.

regards, mathtalk-ga

Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f
From: johnswanstone-ga on 27 Jan 2005 22:00 PST

Mathtalk,

I agree with your "gut" feeling that if I were just smart enough to
know how to reformulate the problem, I could get an algebriac
solution.

I'm thinking that your "y-intercept" idea is a good one.  A coordinate
system based on a point on the y-axis and an angle from vertical would
seem to exactly encapsulate the free variables in this problem
assuming of course that n/m < 1 (i.e. all things bend to the y-axis).

John.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at answers-support@google.com with the question ID listed above. Thank you.

Search Google Answers for

Google Home - Answers FAQ - Terms of Service - Privacy Policy

Hi, John: As you've outlined, a vertical beam of light strikes a point on an upper semi-circle: x� + y� = r�, y > 0 where the index of refraction changes from n above the semi-circle to m below. According to Snell's Law, the light will deflect from a straight path toward the normal line when n < m and away from it when n > m. Given your application of correcting images of the eye, as light passes from air (index of refraction ~ 1) to the cornea, lens, and aqueous media beyond (index of refraction > 1), we will primarily treat the case where n < m. We will refer to ratio k = n/m rather than to the separate indices. By choosing units of length the radius of the semi-circle can be assumed r = 1. Applying Fermat's principle you obtained for point (p,q) within the upper semi-circle: py - qx ------- = k SQRT((p-x)� + (q-y)�) x which we wish to solve for (x,y). Polynomial equation and extra roots ----------------------------------- In deriving a polynomial from this and the equation x� + y� = 1, a number of "artifact" roots are introduced, ie. values x satisfying that polynomial formulation but not the original one. These extra roots are added by steps that multiply both sides by nonconstant polynomials or that square both sides of an equation. For example the equation x = 1 becomes x� = 1 when both sides are squared and acquires a "false root" x = -1 in the process. Nonetheless the desired root x is among those which satisfy: ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + f = 0 where: a = 4(p� + q�)(k�)� b = -4((k� + 1)(p� + q� + 1) - 1)k�p c = (k� - 1)(q�)� + 2(((k�)� + 1)(p� - 1) - (k� - 1))q� + ((k� + 1)p�)� + 2(k� + 1)k�p� + (k�)� d = 4k�p(p� + 2q�) e = -2((k� + 1)(p� + q� + 1) - 1)p� f = (p�)� Note that while both odd and even powers of p appear among these coefficients, only even powers of q are present. This explains some of the artifact roots. Where a solution x for (p,q) in the upper semi-circle exists, another solution x for mirror-image point (p,-q) in the lower semi-circle may also exist. This accounts for an extra root which exists for the sample data: p = 0.5 q = 0.1 k = (1.0/1.5) = 2/3 While the refracted path through the point on the unit circle: x = 0.68024533659318476 y = 0.73298450327631404 passes through the point (p,q), this artifact solution: x = 0.75808380947988874 y = 0.65215714195618513 passes through (p,-q). Similarly k only appears in even powers in the polynomial, and thus the other two real "solutions" actually satisfy for (p,q) or (p,-q): py - qx ------- = -k SQRT((p-x)� + (q-y)�) x It is easy to see that these last artifact roots were introduced by squaring both sides of the Fermat's principle formulation. Impossibility of solution by radicals ------------------------------------- Earlier in an RFC I reported the result of using MAXIMA and PARI/GP to determine if the particular sixth-degree polynomials produced in this application can be solved by radicals, ie. by algebraic formulas that require roots and fields operations acting on the coefficients of the polynomial. The result was that it is impossible to do so for the particular case p = 0.5, q = 0.1, k = 2/3 which you had used, and therefore impossible in general for these sixth-degree polynomials. However we did note that for the case q = 0, the resulting polynomial for x is only cubic and thus can be solved by radicals. Snell's Law formulation ----------------------- A third formulation of the problem is possible using Snell's Law. By a bit of trigonometry: p-x ----- = tan(a - b) q-y where a is the angle of vertical to the normal on the circle at (x,y) and b is the angle of refracted path to the same normal. Snell's Law says that: n sin(a) = m * sin(b) and from the relations sin(a) = x, cos(a) = y, plus: sin(b) = kx cos(b) = SQRT(1 - k�x�) one can rewrite tan(a - b) = sin(a - b)/cos(a - b) in terms of (x,y): p - x (1 - k�)x ----- = --------------------- q - y y + SQRT(k� - (k�x)�) If we think about both sides as a difference quotient (slope formula) involving (x,y) and one other point, this can be interpreted as saying that (p,q) lies on the straight line connecting the two points: (x,y) and (k�x, -SQRT(k� - (k�x)�)) Note that while (x,y) lies on the upper semi-circle, the latter point falls into the lower half-plane because of the negative y-coordinate. It's possible then to determine for (x,y) on the upper semi-circle the y-intercept of the refracted path is (0,-f), where: f = ( k�y + SQRT(k� - (k�x)�) )/(1 - k�) This "focal" f has the limit of k/(1-k) as points (x,y) tend to (0,1) and another limit k/SQRT(1 - k�) as (x,y) tends to (1,0) or (-1,0). Any point (x,y) on the upper semi-circle will have its refracted path through (0,-f) for some f intermediate between these limits. Newton iterations ----------------- Given a differentiable function F(x) and an initial guess x for a root of F(x) = 0, we compute successive Newton approximations: x := x - F(x)/F'(x) Any of the three formulations could be used as the basis for a Newton iteration. The polynomial form would make it easy in some respects to code the computation of the polynomial and its derivative together in a common loop over the coefficients. However the polynomial form has two drawbacks in this application. It has artifact roots that we'd need to be on guard against, and its coefficients require a greater amount of calculation from (p,q) each time they change than other formulations. The subject of starting values will be taken up in the next section. For the moment we will concern ourselves with the calculations needed for the Newton iteration itself. The Fermat principle can be reformulated in a variety of ways, even into the Snell's law form (and vice versa). So we don't claim that the particular choice of equation made here is a necessary one. Clearing denominators and moving all terms to one side produces: F(x) = py - (q + kd)x = 0 where d = SQRT((p-x)� + (q-y)�). Differentiating with respect to x: F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q - y)y')/d. Alternatively we can use the Snell's law formulation in a similar way, clearing the denominators and grouping all terms on one side: G(x) = (p - x)(y + z) - (1 - k�)(q - y)x = 0 where z = SQRT(k� - (k�x)�). Differentiating with respect to x: G'(x) = (p - x)(y' + z') - (y + z) - (1 - k�)(q - y - y'x) where y' is as above and z' = -(k�)�x/z. Starting values --------------- Once a Newton iteration is available for a problem that requires modest computation for an exact derivative, as here, the main difficulty left is to settle upon initial guesses that are good enough to allow the asymptotic quadratic-convergence to quickly appear. Here we have a couple of reasonable starting values that bracket the true root which are nearly "free" in terms of computation required. The point (p,SQRT(1 - p�)) would lie on the straight path through (p,q) if k were 1, ie. if there were no refraction. On the other hand ( p/SQRT(p�+q�), q/SQRT(p�+q�) ) would lie on the line through (p,q) and the origin, which is the limiting path (the normal line) if k were 0. It is easy to show that the functions F(x) and G(x) both have a change in sign between these two points on the upper semi-circle, thus proving that a root exists between them in each case. A simple linear interpolation between these values may be the best choice, certainly in the case that (p,q) is close to either the semi-circle or to the y-axis. One can construct narrower bounds using a different pair of lines passing through (p,q), at a cost of some additional computation. For example, if we take the limiting points (0,-f) found earlier: f = k/(1 - k) and k/SQRT(1 - k�) then lines from these two y-intercepts through (p,q) will intersect the unit semi-circle in a pair of points that more tightly bracket the root we want. Another thought here is that instead of using linear interpolation between the respective values of F(x) or G(x) at such a bracketing pair to obtain an initial guess for the Newton iteration, we can fairly easily find the intersection of the two refracted paths from those two points in the lower half-plane, then project from that "false focus" through (p,q) to a "guess" on the unit semi-circle. regards, mathtalk-ga
Request for Answer Clarification by johnswanstone-ga on 01 Mar 2005 14:24 PST mathtalk, You said: On a more positive note, I can provide a numerical method which converges quickly and robustly for every (p,q) in the upper semi-circle: p^2 + q^2 < r^2 q > 0 implemented in C (for example), subject to the assumption n/m < 1. regards, mathtalk-ga So where is this code? John.
Clarification of Answer by mathtalk-ga on 02 Mar 2005 20:46 PST Hi, John: Quite right, I promised C code and I'll deliver it. regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 02 Mar 2005 22:27 PST For starters here's some basic C code to run the Newton iterations for both the Fermat and Snell's Law formulations. I picked a fixed error tolerance, but if it were desirable we could make that a parameter for both routines. When p = 0, the explicit answer x = 0 is known, so we simply return that. For both routines p,q are the coordinates of the point inside the circle, normalized by dividing them by the circle's radius as discussed previously. In the case of Fermat Principle formulation, k = n/m. For Snell's Law we supply k2 as k� for convenience. In both cases parameter x is a suitable initial guess. Next I provide a test harness that sets initial values and determines the speed with which each formulation can solve the equation. #include <math.h> double newton_fermat( double p, double q, double k, double x) { double y, dydx, d, dddx, f, dfdx, xd; double tol = 1.e-7; /* tolerance could be a parameter / if (p == 0.) return 0.0; / F(x) = py - (q + kd)x = 0 where y = SQRT(1 - x�) and d = SQRT((p-x)� + (q-y)�) F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q-y)y')/d / while(1) { y = sqrt(1. - xx); d = sqrt((p-x)(p-x) + (q-y)(q-y)); f = py - (q + kd)x; dydx = -x/y; dddx = -(p - x + (q-y)dydx)/d; dfdx = pdydx - (q + kd) - kxdddx; xd = -f/dfdx; if (fabs(xd) < tol) return x + xd; x += xd; } } double newton_snell( double p, double q, double k2, double x) { double y, dydx, z, dzdx, g, dgdx, xd; double tol = 1.e-7; /* tolerance could be a parameter / if (p == 0.) return 0.0; / G(x) = (p-x)(y+z) - (1-k�)(q-y)x = 0 where y = SQRT(1 - x�) and z = SQRT(k� - (k�x)�) G'(x) = (p-x)(y'+z') - (y+z) - (1-k�)(q - y - y'x) where y' = -x/y and z' = -(k�)�x/z / while(1) { y = sqrt(1. - xx); z = sqrt(k2 - (k2x)(k2x)); g = (p-x)(y+z) - (1-k2)(q-y)x; dydx = -x/y; dzdx = -k2k2x/z; dgdx = (p-x)(dydx + dzdx) - (y+z) - (1-k2)(q - y - dydx*x); xd = -g/dgdx; if (fabs(xd) < tol) return x + xd; x += xd; } } regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:37 PST Before writing the C code to construct an initial guess for Newton's method, let me present the formulas for some of the underlying concepts. We kicked around the idea before of an "approximate" focus as light paths refracted by the upper semi-circle intersect the y-axis. Using the Snell's Law formulation, it is possible to express this y-intercept (0,-f) either in terms of x or y: (p - x)(y + SQRT(k� - (k�x)�)) - (1 - k�)(q - y)x = 0 Since this equation holds for every point (p,q) on the refracted path from (x,y), we can let p approach zero and q approach -f, and after factoring out x: (y + SQRT(k� - (k�x)�)) + (1 - k�)(f + y) = 0 A bit of algebra yields: k�y + SQRT(k� - (k�x)�) f = ------------------------ 1 - k� It is convenient to introduce constant C = 1/k� - 1, allowing us to write: f = (y + SQRT(y� + C))/C Furthermore this relation can be inverted, giving y in terms of f: y = (Cf - 1/f)/2 We should bear in mind that y,f > 0 in the context of our problem (having chosen (0,-f) as the notation for the y-intercept because it falls below the x-axis). Also the sign of x = � SQRT(1 - y�) is chosen to agree with the sign of p. Given that we can easily translate between locating a point (x,y) on the upper semi-circle and the corresponding "focus" (0,-f) on the lower y-axis, a more flexible approach to finding an initial guess may be sought. First note that if we have points (x_a,y_a) and (x_b,y_b) that "bracket" the solution (x,y) corresponding to (p,q) on the same side of the y-axis as (p,q), then their "foci" (0,-f_a) & (0,-f_b) will bracket the "true" intercept (0,-f). That is: if p > 0, 0 < x_a < x < x_b <==> 0 < f_b < f < f_a Earlier we discussed one choice of "bracketing" points: (x_a,y_a) = (p, SQRT(1 - p�)) (x_b,y_b) = (p/SQRT(p�+q�),q/SQRT(p�+q�)) These are respectively the results of drawing a vertical line and a radial line through (p,q) and finding the intersection with the upper semi-circle. Alternatively we can find where the line from some point (0,-s) on the lower y-axis through (p,q) intersects the upper semi-circle. If s = f, we get the exact solution (x,y) in this way. Again some formulas are needed. Since the slope of this line is m = (q+s)/p, we have to solve: y = mx - s and x� + y� = 1 being careful to distinguish which quadratic root is on the upper semi-circle. The answer turns out to be: x = (ms � SQRT(m� - s� + 1))/(m� + 1) with the sign � chosen to agree with p. Instead of the vertical and radial lines, which would correspond to choosing s to be +oo or zero, we might use: s_0 = k/(1-k) and s_1 = k/SQRT(1 - k�) which represent the limiting extremes of the y-intercepts. I.e., refracted paths which start close to the y-axis, x = 0, will intersect the y-axis slightly above (0,-s_0), while paths that start nearly at the endpoints of the upper semi-circle are refracted more sharply and intersect slightly below (0,-s_1). For the sake of illustration, here is how one might construct an improved initial guess using these ideas. The Snell's Law formulation can be restated: G(x) = g(s) = p(s+y) - (s+q)x = 0 where y = (Cs - 1/s) and x = sign(p) SQRT(1 - y�). Interpolating between the values (s_0,g(s_0)) and (s_1,g(s_1)) gives an improved approximation s for "exact" intercept f, and locating the intersection of the line (0,-s) through (p,q) with the upper semi-circle gives an improved initial guess for (x,y). My goal to calculate an improved initial guess without expending more effort than roughly that for one Newton iteration, so that we jump into the mode of its quadratic convergence rate as soon as practical. regards, mathtalk-ga
Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:57 PST Oops, I left out a division by 2 in the penultimate paragraph of my last note. A line there should have read: where y = (Cs - 1/s)/2 and x = sign(p) SQRT(1 - y�)... instead of: where y = (Cs - 1/s) and x = sign(p) SQRT(1 - y�). -- mathtalk-ga