View Question
Q: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f ( Answered,   15 Comments )
 Question
 Subject: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f Category: Science > Math Asked by: johnswanstone-ga List Price: \$100.00 Posted: 31 Dec 2004 22:30 PST Expires: 30 Jan 2005 22:30 PST Question ID: 449910
 ```An algebraic solution is preferred as I'm going to have to solve this problem hundreds of thousands of times with different coefficients. Here are the coefficients: a = -4*n^4*(p^2+q^2) b = 4*n^2*p*(m^2*p^2+n^2*p^2+m^2*q^2+n^2*q^2+n^2*r^2) c = -m^4*p^4-n^4*p^4-2*m^2*n^2*p^4-m^4*q^4-n^4*q^4+ 2*m^2*n^2*q^4-2*m^4*p^2*q^2-2*n^4*p^2*q^2-n^4*r^4- 2*n^4*p^2*r^2-2*m^2*n^2*p^2*r^2+2*n^4*q^2*r^2+ 2*m^2*n^2*q^2*r^2 d = -4*m^2*n^2*p*r^2*(p^2+2*q^2) e = 2*m^2*p^2*r^2*(m^2*p^2+n^2*p^2+m^2*q^2+n^2*q^2+n^2*r^2) f = -m^4*p^4*r^4 where m,n,p,q,r are arbitrary real number constants subject to the following constraints: n>0 m>0 r>0 -r in the grid I ask what is the value on the surface of the eye that would cause the laser beam to reflect into . If I get lucky this will correspond to a single column in my data grid. Now using the distance between and I can find which data sample that corresponds to that in the eye. If I'm unlucky (which I will be most of the time) I can find the pair of data columns that bracket the point and interpolate the number I'm looking for from there. Ok, now here?s the analysis I?ve done to date: Let?s start with a hemisphere of radius r centered at the origin and extending from ?r to r on the x-axis and from 0 to r on the y-axis. Let?s assume that everything outside the hemisphere has an index of refraction n and everything inside has an index of refraction m. Let?s pick any point on the hemisphere and any point inside the hemisphere. The time it takes light to travel from the point to is: n/c * (r-y) The time it takes light to travel from to is: m/c * sqrt((x-p)^2 + (y-q)^2)) And the total time is the sum of these two terms. Knowing that x^2+y^2=r^2 let?s us substitute and eliminate x in the prior equation. Now applying Fermat?s Principle, which is dt/dy = 0 let?s us differentiate the substituted equation of time with respect to y and set it?s value to 0. A bit of algebra leads us to the following equation: m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) Squaring both sides and a bit of algebra leads to: [(m^2*p^2)/n^2]* y^2/x^2 ? [2*(m^2*p*q)/n^2] * y/x + [2*p] * x + [2*q] * y + [(m^2*q^2)/n^2] - p^2 - q^2 - r^2 = 0 We still know that: x^2 + y^2 = r^2 So we have 2 equation and 2 unknowns. Substituting y = sqrt(r^2 ? x^2) and squaring both sides again and quite a bit of algebra leads to the original 6th degree polynomial. For those you who are adventurous: Continuing with the original problem description ? Next I have to repeat this process for each boundary within the eye. The problem is that all boundaries other than the first need to be represented by splines (which are piece-wise polynomials with the property of being differential in the 2nd degree at the piece-wise boundaries). My thoughts are that I could solve for on each polynomial similarly to what I?ve describe above for the hemisphere and exclude the solution if it is not in the "piece" of the spline that polynomial defines. Any thoughts? Suggestions? Ok. Thanks for getting this far. I hope to hear from you soon. J.``` Clarification of Question by johnswanstone-ga on 31 Dec 2004 22:33 PST ```Where I say: "I ask what is the value on the surface of the eye that would cause the laser beam to reflect into ." I really mean: "I ask what is the value on the surface of the eye that would cause the laser beam to refract into ."``` Request for Question Clarification by mathtalk-ga on 01 Jan 2005 14:05 PST ```Hi, johnswanstone-ga: A sixth degree polynomial has six roots in the complex plane counting multiplicity. However your application would suggest you are interested in only one (real) solution for x. If this is the case, how do you distinguish the one real root which is of interest from those which are "artifacts"? From your description the parameters p,q vary constantly and parameters r,m,n vary seldom. If the parameters p,q vary by slight amounts from a previous "setting", then it is especially attractive to use a numerical solution using the prior solution as an initial approximation. I believe that if you are only interested in a single real solution which can be succinctly identified (per my first paragraph) then a numerical solution will likely outperform any "exact" symbolic solution in terms of speed. Indeed, because your problem appears to have five independent parameters, it is unlikely that your general case is more special than the general sixth-order polynomial, whose "symbolic" solution requires operations that provide solutions to these problems: x^6 + x = A x^5 + x = B along with solutions to dozens of polynomial "subproblems". Again, your description of the underlying application makes it sound as though for given values of r,m,n, you will vary p,q within the circular domain: p^2 + q^2 < r^2 to solve for corresponding x,y values. The physical interpretation suggests that x,y will vary continuously with p,q (while r,m,n are fixed), and thus that with the possible exception of the first solution, an excellent initial "guess" for the desired solution is available as the result of a previous problem. In the absence of a "multiple" root, a numerical method such as Newton's will converge very rapidly starting from a sufficiently accurate initial approximation. One further request for clarification: Is it possible that degree of the polynomial formulation would be lower if both x,y were retained in the equation(s)? However if you are truly interested in knowing just what an "exact" solution to the stated problem looks like, then I will be happy to stick with that approach. regards, mathtalk-ga``` Clarification of Question by johnswanstone-ga on 01 Jan 2005 15:38 PST ```mathtalk, Thanks for your reply. It is true that I'm only looking for a single real solution to this problem. The rest of the solutions are artifacts caused by the "squaring of both sides" operations I did (at least this is what I think it is caused by). I agree that will be continious with changing and I agree with your analysis that a numerical solution could be faster for finding from a given solution. The problem is ... how do you find the initial solution? And I'd really like to be able to compare the numerical to the algebriac solution. I also would like to know how to find the algebriac solution for my own education. Discovering which root is the "correct" one for my problem is simple enough. Just see which root value makes following equation true: m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) For example, setting: n=1 m=1.5 r=1 q=0.1 p=0.5 yields the following four real roots: x=-0.55208096577064258 and y=0.83379050560302892 x=-0.44209591096428252 and y=0.89696778398594745 x=0.68024533659318476 and y=0.73298450327631404 x=0.75808380947988874 and y=0.65215714195618513 plugging them into the above equation shows that x=0.68024533659318476 is the one that I'm really looking for. All the others don't balance the equation (which I find to be pretty strange. Maybe you can expain it). John.``` Clarification of Question by johnswanstone-ga on 03 Jan 2005 06:17 PST ```mathtalk, Are you working on this? Is there any progress to report? Should I look elsewhere for an answer? hello, is there anybody out there? john.``` Request for Question Clarification by mathtalk-ga on 03 Jan 2005 08:59 PST ```Hi, johnswanstone-ga: I am working on your problem, specifically to find solutions of the nonpolynomial system directly. Looking at the "original" formulation: m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) x^2 + y^2 = r^2 I'm wondering what "solution" you desired for p = q = 0 ? Is this an exceptional case, for which no solution is possible? If I'm parsing the equation correctly, it amounts to 0 = nr. However for certain values of m,n there could be a "limiting" solution x = 0, y = r. regards, mathtalk-ga``` Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:22 PST ```Hi, johnswanstone-ga: When you wrote, "The equation of a circle in terms of an angle, a: x = r*sin(a) y = r*sin(b)" did you mean x = r*cos(a), y = r*sin(a), so that x^2 + y^2 = r^2 ? regards, mathtalk-ga``` Request for Question Clarification by mathtalk-ga on 03 Jan 2005 18:27 PST ```Never mind, I see from the substitution step that you are taking: x = r*sin(a) y = r*cos(a) which makes your angle a the "complement" of my angle t. regards, mathtalk-ga``` Request for Question Clarification by mathtalk-ga on 04 Jan 2005 19:50 PST ```Hi, John: I'm having some doubt about how you've applied Fermat's principle. We can agree that to find the path between two fixed points, Fermat's principle selects the minimum travel time. However in your application the path from (x,r) to (p,q) only involves one fixed point, namely (p,q). The point (x,r) depends on the variable y. I'm trying to reconcile the version obtained from Snell's Law with the "original" formulation. I'll feel more confident in coding the solution if I know these are equivalent. regards, mathtalk-ga``` Clarification of Question by johnswanstone-ga on 05 Jan 2005 12:08 PST ```Mathtalk, Given a point and applying Fermat's Principle leads to minimization of a function in time which leads to the roots of a 6th order polynomial expressed in either x or y (each having different coefficients of course). Once either x or y is found the other can be immediatly calculated by x^2 + y^2 = r^2 and once y is known is also known since r is a constant (i.e. input variable). In my example: n=1 m=1.5 r=1 q=0.1 p=0.5 the solution is: x=0.68024533659318476 and y=0.73298450327631404 which can be verified by: solving for a using: x = r * sin(a) which gives: a = 0.74809729181091122 and b using: b = arcsin( (n/m) * sin(a) ) which gives: b = 0.47068498620621618 and finally verifying this equation is true: (x-p) = tan(a - b) * (y-q) which gives: 0.18024533659318487 = 0.18024533659318503 John.``` Request for Question Clarification by mathtalk-ga on 07 Jan 2005 06:24 PST ```I see that for your sample problem the Fermat principle and Snell's Law approach agree on the solution, so I'm assuming that there's an equivalence there. But I've been unable to prove it. On another practical note, what is the angular range of the set of (p,q) points you will need to solve "at one blow"? I'm guessing that a fully dilated pupil/iris will be less than a 30 degree aperture, but that from your description your field of illumination is intended to be quite a bit smaller than this. So the question is whether we are trying to quickly find 256x512 solutions whose angular range is more like 10 degrees or like 1 degree? Or even less? regards, mathtalk-ga``` Clarification of Question by johnswanstone-ga on 07 Jan 2005 13:28 PST ```Mathtalk, I'm not really sure of what angular field my data encompasses since I haven't actually fit a circle to it and measured the angle. But if you're curious, I can show you a raw data image. http://www.frontiernet.net/~jfss/JAOSnormal995bk.%20%20%202.BMP John.``` Request for Question Clarification by mathtalk-ga on 08 Jan 2005 08:25 PST ```Hi, John: In the image shown the angular measurement of the outer circular edge is about 67.5 degrees. My question has to do with the "patch" of p,q coordinates for which you want a split-second computation. Since you described these points as being a 256x512 point array inside the radius of the outer circle, I thought they were perhaps a localized part of the interior, not extending over the entire image. regards, mathtalk-ga``` Clarification of Question by johnswanstone-ga on 09 Jan 2005 21:12 PST ```Mathtalk, Well, My job is to perform refraction correction on the source image. So the first thing I need to do is correct the refraction distortion caused by the initial air/eye boundary. This will correct all the data inside the cornea (assuming the eye has a uniform index of refraction) and then perform the correction again for the back surface of the cornea using the results of the first correction as input. The 256x512 "patch" was a rough approximation for the number of pixels in the image I need to correct. I understand that since the refraction of the light rays will vary continuously with the distance from the center of the "circle" that solutions for points that are "close" to each other will also be "close" to each other and that provides for very fast convergence of a numerical root finding method. What I don't know how to do is find the initial solution to start the whole thing off. How do you plan on doing that? John.``` Request for Question Clarification by mathtalk-ga on 10 Jan 2005 20:32 PST ```Hi, John: In my experiments convergence was quick even if x = 0 was used as the starting approximation. However I don't recommend that because it is possible, especially if q < 0, for the first step of the method to "overshoot" the boundary x = 1, so it's prudent to be a little more careful in arranging the computation. Here's one approach that I think will work nicely. Take a closely spaced "horizontal" line of points (p,q) sharing a common value of q. Start with the point where |p| is closest to 0 and work outward, using x = p for the first point's initial guess and the converged result of each point as the starting value for the next point. Since the exact solution is x = 0 when p = 0, x = p is an excellent starting value close to the y-axis. Alternatively one could take a "vertical" line of points and start from the top (closest to the circle), also using x = p as the first point's initial guess. Since the refraction near the "lens surface" is slight, again this is an excellent approximation. From what I've tried I'd say the Fermat principle iteration works best when your intial guesses underestimate x, and the Snell's Law iteration works best when the initial guesses overestimate x. The suggestions above for x = p are of course underestimates (assuming n/m < 1). The only thing I'd be worried about in recommending solely relying on the Fermat formulation is if for some cases you need to work with |x| close to 1. However, given the application parameters discussed earlier (an angular range probably within +45 to -45 degrees), I don't think this is really an issue. My rough operation count for each Newton iteration is: 10 multiplications 11 additions 3 divisions 2 square roots With a bit of thought one or two of the multiplications per iteration can probably be "amortized" over the convergence sequence, but since this is at worst two or three iterations with starting values of modest accuracy, there's not a whole lot to save. If you need to solve this for every pixel more or less, even a single Newton iteration at each point is probably suboptimal. The Newton iterations can be done at more widely spaced points, filling in between them with either a linear or an inverse quadratic interpolation. If your hardware platform has a floating point unit, the ultimate performance will be eked out by "coding to the metal" and assigning floating point registers to hold specific intermediate computations, so that all or most of them involve only register operands. regards, mathtalk-ga``` Request for Question Clarification by mathtalk-ga on 21 Jan 2005 20:35 PST ```Hi, johnswanstone-ga: Using MAXIMA and PARI/GP, I calculated the Galois group of the sixth degree polynomial for x which results from the parameters: p = 0.5, q = 0.1, n = 1, m = 1.5, r = 1 as in the example you worked with previously. After scaling one obtains a monic irreducible polynomial: x^6 - 738*x^5 + 135226*x^4 + 10513152*x^3 - 3735673344*x^2 + 24637221273600 whose roots are an integer multiple 208 times the roots of your original problem. The Galois group is S_6, the full symmetric group on the six roots. Consequently the root x cannot be expressed in terms of only roots and rational expressions based on the integers. On a more positive note, I can provide a numerical method which converges quickly and robustly for every (p,q) in the upper semi-circle: p^2 + q^2 < r^2 q > 0 implemented in C (for example), subject to the assumption n/m < 1. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f Answered By: mathtalk-ga on 30 Jan 2005 22:10 PST
 ```Hi, John: As you've outlined, a vertical beam of light strikes a point on an upper semi-circle: x² + y² = r², y > 0 where the index of refraction changes from n above the semi-circle to m below. According to Snell's Law, the light will deflect from a straight path toward the normal line when n < m and away from it when n > m. Given your application of correcting images of the eye, as light passes from air (index of refraction ~ 1) to the cornea, lens, and aqueous media beyond (index of refraction > 1), we will primarily treat the case where n < m. We will refer to ratio k = n/m rather than to the separate indices. By choosing units of length the radius of the semi-circle can be assumed r = 1. Applying Fermat's principle you obtained for point (p,q) within the upper semi-circle: py - qx ------- = k SQRT((p-x)² + (q-y)²) x which we wish to solve for (x,y). Polynomial equation and extra roots ----------------------------------- In deriving a polynomial from this and the equation x² + y² = 1, a number of "artifact" roots are introduced, ie. values x satisfying that polynomial formulation but not the original one. These extra roots are added by steps that multiply both sides by nonconstant polynomials or that square both sides of an equation. For example the equation x = 1 becomes x² = 1 when both sides are squared and acquires a "false root" x = -1 in the process. Nonetheless the desired root x is among those which satisfy: ax^6 + bx^5 + cx^4 + dx^3 + ex^2 + f = 0 where: a = 4(p² + q²)(k²)² b = -4((k² + 1)(p² + q² + 1) - 1)k²p c = (k² - 1)(q²)² + 2(((k²)² + 1)*(p² - 1) - (k² - 1))q² + ((k² + 1)p²)² + 2(k² + 1)k²p² + (k²)² d = 4k²p(p² + 2q²) e = -2((k² + 1)(p² + q² + 1) - 1)p² f = (p²)² Note that while both odd and even powers of p appear among these coefficients, only even powers of q are present. This explains some of the artifact roots. Where a solution x for (p,q) in the upper semi-circle exists, another solution x for mirror-image point (p,-q) in the lower semi-circle may also exist. This accounts for an extra root which exists for the sample data: p = 0.5 q = 0.1 k = (1.0/1.5) = 2/3 While the refracted path through the point on the unit circle: x = 0.68024533659318476 y = 0.73298450327631404 passes through the point (p,q), this artifact solution: x = 0.75808380947988874 y = 0.65215714195618513 passes through (p,-q). Similarly k only appears in even powers in the polynomial, and thus the other two real "solutions" actually satisfy for (p,q) or (p,-q): py - qx ------- = -k SQRT((p-x)² + (q-y)²) x It is easy to see that these last artifact roots were introduced by squaring both sides of the Fermat's principle formulation. Impossibility of solution by radicals ------------------------------------- Earlier in an RFC I reported the result of using MAXIMA and PARI/GP to determine if the particular sixth-degree polynomials produced in this application can be solved by radicals, ie. by algebraic formulas that require roots and fields operations acting on the coefficients of the polynomial. The result was that it is impossible to do so for the particular case p = 0.5, q = 0.1, k = 2/3 which you had used, and therefore impossible in general for these sixth-degree polynomials. However we did note that for the case q = 0, the resulting polynomial for x is only cubic and thus can be solved by radicals. Snell's Law formulation ----------------------- A third formulation of the problem is possible using Snell's Law. By a bit of trigonometry: p-x ----- = tan(a - b) q-y where a is the angle of vertical to the normal on the circle at (x,y) and b is the angle of refracted path to the same normal. Snell's Law says that: n * sin(a) = m * sin(b) and from the relations sin(a) = x, cos(a) = y, plus: sin(b) = kx cos(b) = SQRT(1 - k²x²) one can rewrite tan(a - b) = sin(a - b)/cos(a - b) in terms of (x,y): p - x (1 - k²)x ----- = --------------------- q - y y + SQRT(k² - (k²x)²) If we think about both sides as a difference quotient (slope formula) involving (x,y) and one other point, this can be interpreted as saying that (p,q) lies on the straight line connecting the two points: (x,y) and (k²x, -SQRT(k² - (k²x)²)) Note that while (x,y) lies on the upper semi-circle, the latter point falls into the lower half-plane because of the negative y-coordinate. It's possible then to determine for (x,y) on the upper semi-circle the y-intercept of the refracted path is (0,-f), where: f = ( k²y + SQRT(k² - (k²x)²) )/(1 - k²) This "focal" f has the limit of k/(1-k) as points (x,y) tend to (0,1) and another limit k/SQRT(1 - k²) as (x,y) tends to (1,0) or (-1,0). Any point (x,y) on the upper semi-circle will have its refracted path through (0,-f) for some f intermediate between these limits. Newton iterations ----------------- Given a differentiable function F(x) and an initial guess x for a root of F(x) = 0, we compute successive Newton approximations: x := x - F(x)/F'(x) Any of the three formulations could be used as the basis for a Newton iteration. The polynomial form would make it easy in some respects to code the computation of the polynomial and its derivative together in a common loop over the coefficients. However the polynomial form has two drawbacks in this application. It has artifact roots that we'd need to be on guard against, and its coefficients require a greater amount of calculation from (p,q) each time they change than other formulations. The subject of starting values will be taken up in the next section. For the moment we will concern ourselves with the calculations needed for the Newton iteration itself. The Fermat principle can be reformulated in a variety of ways, even into the Snell's law form (and vice versa). So we don't claim that the particular choice of equation made here is a necessary one. Clearing denominators and moving all terms to one side produces: F(x) = py - (q + kd)x = 0 where d = SQRT((p-x)² + (q-y)²). Differentiating with respect to x: F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q - y)y')/d. Alternatively we can use the Snell's law formulation in a similar way, clearing the denominators and grouping all terms on one side: G(x) = (p - x)(y + z) - (1 - k²)(q - y)x = 0 where z = SQRT(k² - (k²x)²). Differentiating with respect to x: G'(x) = (p - x)(y' + z') - (y + z) - (1 - k²)(q - y - y'x) where y' is as above and z' = -(k²)²x/z. Starting values --------------- Once a Newton iteration is available for a problem that requires modest computation for an exact derivative, as here, the main difficulty left is to settle upon initial guesses that are good enough to allow the asymptotic quadratic-convergence to quickly appear. Here we have a couple of reasonable starting values that bracket the true root which are nearly "free" in terms of computation required. The point (p,SQRT(1 - p²)) would lie on the straight path through (p,q) if k were 1, ie. if there were no refraction. On the other hand ( p/SQRT(p²+q²), q/SQRT(p²+q²) ) would lie on the line through (p,q) and the origin, which is the limiting path (the normal line) if k were 0. It is easy to show that the functions F(x) and G(x) both have a change in sign between these two points on the upper semi-circle, thus proving that a root exists between them in each case. A simple linear interpolation between these values may be the best choice, certainly in the case that (p,q) is close to either the semi-circle or to the y-axis. One can construct narrower bounds using a different pair of lines passing through (p,q), at a cost of some additional computation. For example, if we take the limiting points (0,-f) found earlier: f = k/(1 - k) and k/SQRT(1 - k²) then lines from these two y-intercepts through (p,q) will intersect the unit semi-circle in a pair of points that more tightly bracket the root we want. Another thought here is that instead of using linear interpolation between the respective values of F(x) or G(x) at such a bracketing pair to obtain an initial guess for the Newton iteration, we can fairly easily find the intersection of the two refracted paths from those two points in the lower half-plane, then project from that "false focus" through (p,q) to a "guess" on the unit semi-circle. regards, mathtalk-ga``` Request for Answer Clarification by johnswanstone-ga on 01 Mar 2005 14:24 PST ```mathtalk, You said: On a more positive note, I can provide a numerical method which converges quickly and robustly for every (p,q) in the upper semi-circle: p^2 + q^2 < r^2 q > 0 implemented in C (for example), subject to the assumption n/m < 1. regards, mathtalk-ga So where is this code? John.``` Clarification of Answer by mathtalk-ga on 02 Mar 2005 20:46 PST ```Hi, John: Quite right, I promised C code and I'll deliver it. regards, mathtalk-ga``` Clarification of Answer by mathtalk-ga on 02 Mar 2005 22:27 PST ```For starters here's some basic C code to run the Newton iterations for both the Fermat and Snell's Law formulations. I picked a fixed error tolerance, but if it were desirable we could make that a parameter for both routines. When p = 0, the explicit answer x = 0 is known, so we simply return that. For both routines p,q are the coordinates of the point inside the circle, normalized by dividing them by the circle's radius as discussed previously. In the case of Fermat Principle formulation, k = n/m. For Snell's Law we supply k2 as k² for convenience. In both cases parameter x is a suitable initial guess. Next I provide a test harness that sets initial values and determines the speed with which each formulation can solve the equation. #include double newton_fermat( double p, double q, double k, double x) { double y, dydx, d, dddx, f, dfdx, xd; double tol = 1.e-7; /* tolerance could be a parameter */ if (p == 0.) return 0.0; /* F(x) = py - (q + kd)x = 0 where y = SQRT(1 - x²) and d = SQRT((p-x)² + (q-y)²) F'(x) = py' - (q + kd) - kxd' where y' = -x/y and d' = -(p - x + (q-y)y')/d */ while(1) { y = sqrt(1. - x*x); d = sqrt((p-x)*(p-x) + (q-y)*(q-y)); f = p*y - (q + k*d)*x; dydx = -x/y; dddx = -(p - x + (q-y)*dydx)/d; dfdx = p*dydx - (q + k*d) - k*x*dddx; xd = -f/dfdx; if (fabs(xd) < tol) return x + xd; x += xd; } } double newton_snell( double p, double q, double k2, double x) { double y, dydx, z, dzdx, g, dgdx, xd; double tol = 1.e-7; /* tolerance could be a parameter */ if (p == 0.) return 0.0; /* G(x) = (p-x)(y+z) - (1-k²)(q-y)x = 0 where y = SQRT(1 - x²) and z = SQRT(k² - (k²x)²) G'(x) = (p-x)(y'+z') - (y+z) - (1-k²)(q - y - y'x) where y' = -x/y and z' = -(k²)²x/z */ while(1) { y = sqrt(1. - x*x); z = sqrt(k2 - (k2*x)*(k2*x)); g = (p-x)*(y+z) - (1-k2)*(q-y)*x; dydx = -x/y; dzdx = -k2*k2*x/z; dgdx = (p-x)*(dydx + dzdx) - (y+z) - (1-k2)*(q - y - dydx*x); xd = -g/dgdx; if (fabs(xd) < tol) return x + xd; x += xd; } } regards, mathtalk-ga``` Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:37 PST ```Before writing the C code to construct an initial guess for Newton's method, let me present the formulas for some of the underlying concepts. We kicked around the idea before of an "approximate" focus as light paths refracted by the upper semi-circle intersect the y-axis. Using the Snell's Law formulation, it is possible to express this y-intercept (0,-f) either in terms of x or y: (p - x)(y + SQRT(k² - (k²x)²)) - (1 - k²)(q - y)x = 0 Since this equation holds for every point (p,q) on the refracted path from (x,y), we can let p approach zero and q approach -f, and after factoring out x: (y + SQRT(k² - (k²x)²)) + (1 - k²)(f + y) = 0 A bit of algebra yields: k²y + SQRT(k² - (k²x)²) f = ------------------------ 1 - k² It is convenient to introduce constant C = 1/k² - 1, allowing us to write: f = (y + SQRT(y² + C))/C Furthermore this relation can be inverted, giving y in terms of f: y = (C*f - 1/f)/2 We should bear in mind that y,f > 0 in the context of our problem (having chosen (0,-f) as the notation for the y-intercept because it falls below the x-axis). Also the sign of x = ± SQRT(1 - y²) is chosen to agree with the sign of p. Given that we can easily translate between locating a point (x,y) on the upper semi-circle and the corresponding "focus" (0,-f) on the lower y-axis, a more flexible approach to finding an initial guess may be sought. First note that if we have points (x_a,y_a) and (x_b,y_b) that "bracket" the solution (x,y) corresponding to (p,q) on the same side of the y-axis as (p,q), then their "foci" (0,-f_a) & (0,-f_b) will bracket the "true" intercept (0,-f). That is: if p > 0, 0 < x_a < x < x_b <==> 0 < f_b < f < f_a Earlier we discussed one choice of "bracketing" points: (x_a,y_a) = (p, SQRT(1 - p²)) (x_b,y_b) = (p/SQRT(p²+q²),q/SQRT(p²+q²)) These are respectively the results of drawing a vertical line and a radial line through (p,q) and finding the intersection with the upper semi-circle. Alternatively we can find where the line from some point (0,-s) on the lower y-axis through (p,q) intersects the upper semi-circle. If s = f, we get the exact solution (x,y) in this way. Again some formulas are needed. Since the slope of this line is m = (q+s)/p, we have to solve: y = mx - s and x² + y² = 1 being careful to distinguish which quadratic root is on the upper semi-circle. The answer turns out to be: x = (ms ± SQRT(m² - s² + 1))/(m² + 1) with the sign ± chosen to agree with p. Instead of the vertical and radial lines, which would correspond to choosing s to be +oo or zero, we might use: s_0 = k/(1-k) and s_1 = k/SQRT(1 - k²) which represent the limiting extremes of the y-intercepts. I.e., refracted paths which start close to the y-axis, x = 0, will intersect the y-axis slightly above (0,-s_0), while paths that start nearly at the endpoints of the upper semi-circle are refracted more sharply and intersect slightly below (0,-s_1). For the sake of illustration, here is how one might construct an improved initial guess using these ideas. The Snell's Law formulation can be restated: G(x) = g(s) = p(s+y) - (s+q)x = 0 where y = (C*s - 1/s) and x = sign(p) SQRT(1 - y²). Interpolating between the values (s_0,g(s_0)) and (s_1,g(s_1)) gives an improved approximation s for "exact" intercept f, and locating the intersection of the line (0,-s) through (p,q) with the upper semi-circle gives an improved initial guess for (x,y). My goal to calculate an improved initial guess without expending more effort than roughly that for one Newton iteration, so that we jump into the mode of its quadratic convergence rate as soon as practical. regards, mathtalk-ga``` Clarification of Answer by mathtalk-ga on 12 Mar 2005 07:57 PST ```Oops, I left out a division by 2 in the penultimate paragraph of my last note. A line there should have read: where y = (C*s - 1/s)/2 and x = sign(p) SQRT(1 - y²)... instead of: where y = (C*s - 1/s) and x = sign(p) SQRT(1 - y²). -- mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 03 Jan 2005 09:14 PST
 ```Progress Report 1. The intuitive notion that (x,y) on the circle of radius r centered at the origin will depend continuously on the parameters (p,q), a point in the interior of the disk of that radius, leads to a mapping of the open disk into the circle. Such a mapping cannot be both continuous and onto because of topological considerations. 2. Two simplifications that can be readily introduced are to eliminate parameter r (by scaling x,y,p,q accordingly) and to combine m,n into a ratio: P = p/r, Q = q/r, X = x/r, Y = y/r Then X = cos(t), Y = sin(t) and: P*cot(t) - Q = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 ) regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 03 Jan 2005 13:11 PST
 ```Ooops, I got the tangent confused with the cotangent in that last equation, and it should read: P*tan(t) - Q = (n/m) SQRT( (cos(t) - P)^2 + (sin(t) - Q)^2 ) since y/x = sin(t)/cos(t) = tan(t). regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: johnswanstone-ga on 03 Jan 2005 17:00 PST
 ```Mathtalk, The analysis I've posted here deals with minimizing the time it takes a ray of light to travel from (x,r) to (p,q) according to Fermat's Principle and it leads to 6th order polynomials no matter which variable you solve for; x or y. Another analysis would be a geometrical one as follows: The equation of a circle in terms of an angle, a: x = r*sin(a) y = r*sin(b) Using the rule of similar angles of a parallel lines we know that a is also the angle that a ray of light traveling parallel to the y-axis will strike the surface of the semicircle. Using Snell's law: n*sin(a) = m*sin(b), where b will be the angle the light will refract at the boundary of the semicircle. Using the two point equation of a line through and : p-x = s * (q-y) we need to find the slope s which in this case will be express as run/rise. Using geometrical reasoning it is known that s = tan(a - b). So dividing the line equation by (q-y) gives: p-x --- = s q-y Substituting for 'x', 'y', and 's' gives: p-r*sin(a) ---------- = tan(a - b) q-r*cos(a) solving Snell's law for b: b = arcsin(sin(a)/n) a final substitution gives: p-r*sin(a) ---------- = tan(a - arcsin(sin(a)/n) q-r*cos(a) Which if solved for 'a' would also give me the solution I'm looking for. If it is impossible to solve this equation or the polynomial analytically then I'd accept a solution which describes and implements the "solve by factoring" algorithm used in symbolic algebra packages such as Mathmatica, Maple, Maxima, or MacSym. When I execute this function the computer comes back with all the roots of the polynomial including the complex roots. What I'd do then is use this to find the root that is the solution for my initial point and then use the computed as the initial guess for the next as per your earlier suggestion. I would also accept an analysis of this equation or the polynomial similar to the one in this paper on page 19 (http://www.math.uab.edu/cl/cl1/details.ps) which guarantees that a numerical method will always converge to the right solution given a known starting value. Let me know if there's anything more I can do to help you. Thanks, John.```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: johnswanstone-ga on 03 Jan 2005 17:19 PST
 ```Mathtalk, Addressing your question about what happens when = <0,0> Under this condition, the time it takes light to travel that path is reduced to: t = n/c * (r-y) + m/c * r My original analysis minimized this function by taking the derivitave with respect to y and setting it to zero, then solving for y. In this case that approach doesn't work since taking the derivitive leads to dt/dy = -n/c which is a constant. (Remember c here represents the speed of light in air). I don't know how to minimize this function ... maybe you do. John.```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 03 Jan 2005 18:16 PST
 ```To minimize: t = n/c * (r-y) + m/c * r with respect to y on the interval [-r,+r], take y = r. Here the minimum occurs at a boundary point, rather than at an interior "critical" point such as you might find by setting the derivative to zero. I have a minor correction to your alternative derivation; regarding Snell's Law: n*sin(a) = m*sin(b) we obtain: b = arcsin( (n/m) * sin(a) ) This suggests a natural limitation on angle a (equivalent to my angle t), that: | sin(a) | <= m/n in order that angle b is defined by this relation. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 04 Jan 2005 05:09 PST
 ```Hi, John: It appears that your problem has a left/right symmetry, so that: (x,y) the solution for (p,q) implies: (-x,y) the solution for (-p,q) but is not up/down symmetric in the corresponding sense. However the formulation: m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) implies that p*(y/x) - q is nonnegative, so y/x >= q/p when p > 0. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: johnswanstone-ga on 04 Jan 2005 09:46 PST
 ```Mathtalk, Going back to the original problem: t = n/c * (r-y) + m/c * sqrt((x-p)^2+(y-q)^2) and x^2 + y^2 = r^2 you can either substitute for y or x. Substituting for x gives an equation in y: t = n/c * (r-y) + m/c * sqrt((sqrt(-y^2+r^2)-p)^2+(-q+y)^2) Substituing for y gives and equation in x: t = n/c * (r-sqrt(-x^2+r^2)) + m/c * sqrt((x-p)^2+(sqrt(r^2-x^2)-q)^2) My problem is to minimize either one (or both) of these equations. i.e. Take the derivitive with respect to the remaining variable, set it equal to zero, and solve for the roots. Doing it either way leads to a 6th order polynomial. Solving the equation for x leads to a polynomial without the x^1 term. Solving the equation for y leads to a polynomial with all the terms. This equation: m*(p*y/x-q) = n*sqrt(r^2-2*p*x+p^2-2*q*y+q^2) is simply the derivitive of the equation in x set to 0 with y substituted back in. This was my first attempt and it turned out to be quite messy. I mention this because it seems like your focusing in on this equation and while solving it would answer my question, I wanted to point out the other equations as well since they might be easier to work with. John.```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 04 Jan 2005 10:21 PST
 ```Thanks, John. I'm posting a mixture of Comments here, both for the sake of giving a "progress" indicator and for the sake of your feedback. My understanding of the physics of the application is that, given a point (p,q) inside a "crystal ball" you want to know where to shine a light to illuminate that point. Outside the ball and inside the ball the light will travel in a straight line, but at the boundary (x,y) the light beam is refracted due to the changing speed of light. I can show a solution exists for p,q > 0, and by left/right symmetry we can drop the restriction that p > 0. However I'm expecting that there are values q < 0 which have no solution. The physical intuition is that as a light beam approach the vertical edge x = r, there may be points (p,q) "below" the path inside the ball that cannot be attained. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 05 Jan 2005 07:33 PST
 ```I haven't been able to show directly the equivalence of the two formulations, but they share a characteristic which I think can be used to show the equivalence indirectly. This is that (p,q) may be replaced in either equation by any point inside the circle on a line between (x,y) and (p,q). The Snell's Law formulation: (p-x)/(q-y) = tan(a - b) isolates all the dependence on (p,q) on one side of the equation, as the angles a,b depend only on (x,y) and the ratio of refraction indexes. So adding a multiple of (p-x) to p and the same multiple of (q-y) to q will leave the equation unchanged. The situation is only a little more complicated with the "original" formulation: p*(y/x) - q = (n/m)*SQRT( (p-x)^2 + (q-y)^2 ) but the corresponding adjustments to (p,q) would proportionally change both sides of the equation. An indirect verification of equivalence can therefore be derived by showing that the slopes of the lines as (p,q) approaches (x,y) are the same for both equations, e.g. dp/dq = tan(a - b). * * * * * * * * * * * * * * * Closer to your original request for an exact solution, I observe that for the special case q = 0, a cubic equation can be formulated for x with coefficients in p, r, and k = (n/m): p*(y/x) = k*SQRT( p^2 - 2px + r^2 ) After multiplying both sides by x and squaring: (p^2)*(r^2 - x^2) = (k^2)*(x^2)*(p^2 - 2px + r^2) Therefore an "exact solution" for x can be given in this case using Cardan's formula for roots of a cubic equation. See for example: http://answers.google.com/answers/threadview?id=433886 http://answers.google.com/answers/threadview?id=441538 * * * * * * * * * * * * * * * There would be a very simple exact solution if all of the lines (p,q) to (x,y) described in the first section of this Comment were to intersect at a common focus, say (0,f). For then one has only to intersect a line through (p,q) and (0,f) with the upper semi-circle x^2 + y^2 = r^2 to find (x,y). regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: johnswanstone-ga on 05 Jan 2005 11:17 PST
 ```Mathtalk, There would be a very simple exact solution if all of the lines (p,q) to (x,y) described in the first section of this Comment were to intersect at a common focus, say (0,f). For then one has only to intersect a line through (p,q) and (0,f) with the upper semi-circle x^2 + y^2 = r^2 to find (x,y). I agree that if all the lines (p,q) to (x,y) were to intersect at a common point then the solution would be easy. Unfortunatly this is not the case for general values of n and m. Your observation that "(p,q) may be replaced in either equation by any point inside the circle on a line between (x,y) and (p,q)." could be expoited to reduce the dimensionality of the problem if it could be reformulated using this knowledge (something like a coordinate transform or an equation using this property as the basis functions). This reduction in dimensionality may then enable an algebriac solution. What do you think? John.```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 05 Jan 2005 11:56 PST
 ```Hi, John: Yes, in a special case one can reduce the dimensionality to one (e.g q = 0) and obtain a reduced degree polynomial problem with a "closed form" solution. However I still expect that with good initial approximations, the numerical method will converge just as fast as any "exact" solution. I'm thinking of "projecting" points onto the x-axis (q=0) using an approximate focal point (p=0) mostly as a technique for generating the initial approximations. I believe that for a "spherical" lens, paraxial rays do have a fairly tight "focus"; it's the rays which fall near |x| = r which are responsible for the majority of "spherical aberration". I think there is a critical angle: |y/x| = n/m at which refraction changes to reflection that the math "mirrors" the physical limitation in your model. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 07 Jan 2005 07:27 PST
 ```Hi, John: Here's my approach, still to be coded and tested. I'm assuming n/m < 1 is the case of real interest. Given an array of (p,q) points: . . . . . . . . . . . . . . . . . . we solve (assuming these lie on one side of the y-axis, since solutions are symmetric about that line) for the extreme corners and midpoint, finding an approximate common focal point where the three lines thru respective (x,y) and (p,q) nearly intersect. Then use this approximate focal point (p',q') to produce initial approximations (x',y') for each additional solution desired. Two Newton iterations should suffice to give all necessary accuracy; three should converge to machine precision (digits of precision will double with each iteration), assuming a fairly tight angular distribution reasonably close to the central axis. See my Request for Clarification above about the realistic size of the angular range for the block of (p,q) points. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 09 Jan 2005 16:21 PST
 ```Hi, John: A couple of quick notes: I found a way to show that the Snell's Law and Fermat's principle formulations are equivalent for q < y. I coded and tested Newton iterations for both versions. They have about the same operation count, and for n/m < 1, both seem to converge satisfactorily within 2 or 3 iterations if given starting values of only modest accuracy. Based on the limited amount of testing done so far, the method based on Fermat's principle seems to converge a little faster. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: mathtalk-ga on 20 Jan 2005 14:20 PST
 ```The polynomial for x or y turns out to be sextic (of sixth degree), but it seems reasonable that by choosing a different unknown the problem may be restated in a form of lesser degree. One suggestion of this has already been seen when, by restricting q = 0, one obtains a formulation in which x satisfies a cubic polynomial (whose coefficients depend algebraically on p) and in which p satisfies a quadratic polynomial (whose coefficients depend algebraically on x). I'm especially interested in a formulation in terms of the y-intercept (0,-f) of the line passing through (p,q), rather than in terms of point (x,y) where the line intercepts the (unit) circle. It seems to me that the left/right symmetry of the problem may produce a reduction in the degree of the problem. For example, suppose that the coordinate -f of the y-intercept satisfies an even polynomial of degree 6 (whose coefficients depend algebraically on p,q). The square of f must then satisfy a corresponding cubic equation, so an explicit formula for f (in terms of root extractions and field operations) could in principle be written down. If k = n/m, the dependence of f (which I've written with a negative sign above because the light path incident on the upper semi-circle will have y-intercept in the lower half-plane) on x,y can be stated: __________ k²y + k? 1 - k²x² f = -------------------- (1 - k²) This gives values of f that range from: ____ f = k/?1-k² when (x,y) = (1,0) to: f = k/(1-k) when (x,y) = (0,1) Even if this line of investigation doesn't produce the sought after closed-form solution, it may still be useful in providing an "approximate focal point". That is, if we take an approximate y-intercept (0,-f) and project through (p,q) to the point on the unit circle (x,y), we may get a very satisfactory starting value for a Newton iteration. regards, mathtalk-ga```
 Subject: Re: Find roots of the polynomial: a*x^6+b*x^5+c*x^4+d*x^3+e*x^2+f From: johnswanstone-ga on 27 Jan 2005 22:00 PST
 ```Mathtalk, I agree with your "gut" feeling that if I were just smart enough to know how to reformulate the problem, I could get an algebriac solution. I'm thinking that your "y-intercept" idea is a good one. A coordinate system based on a point on the y-axis and an angle from vertical would seem to exactly encapsulate the free variables in this problem assuming of course that n/m < 1 (i.e. all things bend to the y-axis). John.```