Hi, dooogle-ga:
Introduction
============
The continuous dependence of the roots (zeros) of a monic polynomial
on its coefficients is a well-known result, often used without
reference to a proof. Once this theorem is in hand, the continuity of
the spectral radius of a matrix (in terms of its entries) is easily
shown.
An "elementary" (but perhaps not "simple") proof is given here:
[The roots of a polynomial depend continuously on its coefficients
by Raul Naulin and Carlos Pabst; Revista Colombiana de Matemáticas v.28(1) 1994]
http://www.scm.org.co/revistas.php?modulo=MasInformacion&ver=635
(see links at bottom of page for PDF or DVI formats)
Ostrowski (Acta Math. v.72, 1940) proved a more quantitative result, namely that:
If P(z),Q(z) are (complex) monic polynomials of equal degree n, with
roots x_1,...,x_n and y_1,...,y_n respectively, then (for some
suitable ordering of the y_j's) for all j:
| x_j - y_j | <= 4n * T * D^(1/n)
where T >= 1 is an upper bound, for each k, on the k-th roots of the
absolute values of the coefficients of z^(n-k) in P and Q, and where
D is the 2-norm of the difference P - Q, i.e. the square root of the
sum of the squares of the absolute values of the differences between
corresponding coefficients.
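As a quick numerical sanity check of Ostrowski's estimate (my own toy example, not from the sources above), we can compare two nearby quadratics and verify the root differences stay under the bound:

```python
import numpy as np

# P(z) = z^2 - 3z + 2 (roots 1, 2) and a nearby Q(z) = z^2 - 3.01z + 2.02
p = [1.0, -3.0, 2.0]
q = [1.0, -3.01, 2.02]
n = 2

# Sort both root sets the same way so corresponding roots are paired
x = sorted(np.roots(p), key=lambda z: (z.real, z.imag))
y = sorted(np.roots(q), key=lambda z: (z.real, z.imag))

# T >= 1 bounds the k-th roots of |coefficient of z^(n-k)| for P and Q
T = max(1.0, *(abs(c) ** (1.0 / k)
               for coeffs in (p, q)
               for k, c in enumerate(coeffs[1:], start=1)))

# D = 2-norm of the coefficient difference
D = np.linalg.norm(np.array(p) - np.array(q))

bound = 4 * n * T * D ** (1.0 / n)
worst = max(abs(a - b) for a, b in zip(x, y))
assert worst <= bound
```

Here the actual worst root displacement is about 0.01, comfortably inside the (rather generous) estimate.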
Away from multiple roots this estimate can be sharpened, as is shown in this paper:
[How the Roots of a Polynomial Vary with its Coefficients: A Local
Quantitative Result
by Bernard Beauzamy; Canad. Math. Bull. v. 42(1) 1999]
http://www.journals.cms.math.ca/cgi-bin/vault/public/view/beauzamy7281/body/PDF/beauzamy7281.pdf?file=beauzamy7281
* * * * * * * * * * * * * * * * * * *
Detailed Sketch of Proof
========================
We give an outline of the proof due to Naulin and Pabst, linked above.
A complex (monic) polynomial of degree n has (by the Fund. Thm. of
Algebra) exactly n roots in the complex field, counted according to
multiplicity:
P(z) = z^n - c_1*z^(n-1) + ... + (-1)^n c_n
= (z - r_1)*(z - r_2)*...*(z - r_n)
We would like to define a 1-1 mapping between the n-tuples of
coefficients and the n-tuples of roots, but a mechanical difficulty
arises in that the order of the roots is not a priori specified, at
least not in the obvious sense in which the order of the coefficients
is.
To manage this difficulty, Naulin and Pabst impose the lexicographic
ordering on the complex numbers:
a + bi < c + di iff a < c OR (a = c and b < d)
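In code this ordering is nothing more than sorting by the pair (real part, imaginary part); a minimal illustration:

```python
# Lexicographic order on C: compare real parts first, then imaginary parts
def lex_sorted(roots):
    return sorted(roots, key=lambda z: (z.real, z.imag))

roots = [1 + 2j, 1 - 1j, 0 + 5j]
assert lex_sorted(roots) == [0 + 5j, 1 - 1j, 1 + 2j]
```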
They then restrict the n-tuples of complex roots so that:
(r_1,r_2,...,r_n) in C_n implies r_1 <= r_2 <= ... <= r_n
Every such n-tuple is the ordered roots of some (complex) monic polynomial.
So we have a one-to-one correspondence between C_n, the possible
ordered n-tuples of roots, and C^n, the complex n-tuples, considered
as "non-leading" coefficients of monic polynomials of degree n.
In fact the mapping from roots to coefficients is pretty explicit,
taking into account the alternating signs on the c_i's we introduced
above:
c_1 = r_1 + ... + r_n
c_2 = r_1*r_2 + r_1*r_3 + ... + r_{n-1}*r_n
...
c_n = r_1 * r_2 * ... * r_n
The expressions on the right-hand sides above are the elementary
symmetric polynomials, and collectively this correspondence is known
as the Viète map (after François Viète, 1540-1603, a pioneer in the
theory of algebraic equations).
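The Viète map can be computed by simply expanding the product (z - r_1)*...*(z - r_n) one factor at a time; a short sketch (my own code, not from the paper) recovering c_1,...,c_n with the alternating signs of the expansion absorbed:

```python
# Expand (z - r_1)*...*(z - r_n); returns [c_1, ..., c_n] where
# c_k is the k-th elementary symmetric polynomial of the roots
def viete(roots):
    poly = [1 + 0j]                      # coefficients, highest degree first
    for r in roots:
        # multiply the current polynomial by (z - r)
        poly = [a - r * b for a, b in zip(poly + [0j], [0j] + poly)]
    # poly[k] = (-1)^k * e_k, so strip the alternating signs
    return [(-1) ** k * poly[k] for k in range(1, len(poly))]

# (z-1)(z-2)(z-3) = z^3 - 6z^2 + 11z - 6, so c_1=6, c_2=11, c_3=6
assert viete([1, 2, 3]) == [6, 11, 6]
```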
Once they define a metric for C_n:
    dist((r_1,...,r_n),(s_1,...,s_n)) = SQRT( SUM_{i=1}^{n} |r_i - s_i|^2 )
it can be seen that C_n is a complete metric space, a property
inherited as a closed subset of C^n, where the same metric is induced
by the Euclidean norm. For convenience we will refer to this norm as:
    ||a|| = ||(a_1,...,a_n)||
taking care that our context allows this for "vector" a without ambiguity.
The continuity of the Viète map T:C_n -> C^n given by the above
formulas is then clear. The big task is then to show that the inverse
of this bijective mapping, which they call S, is also continuous.
Toward this end they produce a Lemma:
dist(S((a_1,...,a_n)),(0,...,0)) <= 2n * max{1,||a||}
whose proof, though easy, we shall not repeat here.
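The Lemma is easy to check numerically (my own spot check, not part of the paper; the coefficient signs do not affect either norm, so the plain convention z^n + a_1*z^(n-1) + ... + a_n is used below):

```python
import math
import numpy as np

a = [3.0, -2.0, 5.0]            # coefficients a_1, ..., a_n (arbitrary choice)
n = len(a)

roots = np.roots([1.0] + a)     # S(a): the roots of the monic polynomial
dist_to_zero = math.sqrt(sum(abs(r) ** 2 for r in roots))

norm_a = math.sqrt(sum(abs(c) ** 2 for c in a))
assert dist_to_zero <= 2 * n * max(1.0, norm_a)
```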
They now have all the ingredients needed to show S:C^n -> C_n is
continuous. For if not, then take (a_1,...,a_n) to be a point of
discontinuity, so that for some delta > 0 and some sequence of
"vector" n-tuples {c(k): k=1 to oo}, c(k) = (c_{k,1},...,c_{k,n}):
    LIMIT_{k->oo} c(k) = a
and dist(S(c(k)),S(a)) >= delta for all k.
The Lemma above establishes that the image sequence {S(c(k)): k=1 to
oo} is bounded, so that if necessary we can choose a subsequence of
the c(k)'s such that the images S(c(k)) converge in the complete
metric space C_n, say:
    LIMIT_{k->oo} S(c(k)) = r
But the continuity of the Viète map T then implies:
    T(r) = T( LIMIT_{k->oo} S(c(k)) ) = LIMIT_{k->oo} T(S(c(k))) = LIMIT_{k->oo} c(k) = a
and hence that r = S(a).
By a standard argument from the convergence of the subsequence of
images S(c(k)) to r = S(a), this shows that for k sufficiently large
(along that subsequence):
    dist(S(c(k)),S(a)) < delta
contradicting the original choice of the sequence {c(k)} as never
getting that close, and establishing the theorem (continuity of S).
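A quick numerical illustration of the theorem (my own example, away from multiple roots): a small perturbation of the coefficients moves the lexicographically ordered roots only slightly.

```python
import math
import numpy as np

def sorted_roots(tail):
    # roots of z^n + tail[0]*z^(n-1) + ... + tail[n-1], lexicographic order
    r = np.roots([1.0] + list(tail))
    return sorted(r, key=lambda z: (z.real, z.imag))

def dist(r, s):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(r, s)))

a = [-3.0, 2.0]                 # z^2 - 3z + 2, roots 1 and 2
b = [-3.001, 2.001]             # a small perturbation
assert dist(sorted_roots(a), sorted_roots(b)) < 0.01
```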
* * * * * * * * * * * * * * * * * * *
Continuity of R(r) = max{|r_i|} on C_n
======================================
It may be helpful, both for its own relevance and as an illustration
of how the topology defined on C_n "works", to explicitly prove that
R(r) = max{|r_i|} is a continuous function from C_n to R as we've
defined this metric topology on C_n.
Let r = (r_1,...,r_n) and s = (s_1,...,s_n) be properly ordered
n-tuples in C_n, and say:
dist(r,s) = ||r - s|| > 0
Now for any component entries r_i and s_i, we have by the triangle inequality:
|r_i| <= |s_i| + |r_i - s_i|
<= |s_i| + ||r - s||
<= R(s) + dist(r,s)
Thus taking maximum with respect to i, we get:
R(r) <= R(s) + dist(r,s)
and R(s) <= R(r) + dist(r,s) by a symmetric argument.
So |R(r) - R(s)| <= dist(r,s), and it follows that the function R is continuous on C_n.
Note that despite knowing that the entries of r are lexicographically
ordered in C, we lack any information about which component entry r_i
yields the maximum absolute value R(r). Fortunately as the "norm"
metric takes all components into account, this gap in our knowledge is
no obstacle.
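The inequality |R(r) - R(s)| <= dist(r,s) derived above is easy to spot-check in code (the tuple values are my own arbitrary picks):

```python
import math

def R(t):
    # maximum modulus over the tuple's entries
    return max(abs(z) for z in t)

def dist(r, s):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(r, s)))

r = (1 - 1j, 1 + 2j, 3 + 0j)
s = (1 + 0j, 2 + 2j, 2 + 1j)
assert abs(R(r) - R(s)) <= dist(r, s)
```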
* * * * * * * * * * * * * * * * * * *
Continuity of the Spectral Radius of a Matrix
=============================================
Finally we only need to observe that, as the coefficients of the
characteristic polynomial are continuous functions of a matrix M's
entries, applying first S to the non-leading coefficients and then R
produces the spectral radius of M as a continuous function of those
same entries.
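In practice one computes the spectral radius directly from the eigenvalues; a small sketch (matrix values are my own) showing that a tiny perturbation of the entries changes the spectral radius only slightly:

```python
import numpy as np

def spectral_radius(M):
    # maximum modulus of the eigenvalues of M
    return max(abs(np.linalg.eigvals(M)))

M = np.array([[0.0, 1.0],
              [2.0, 3.0]])
E = 1e-8 * np.ones((2, 2))     # a small perturbation of the entries

assert abs(spectral_radius(M + E) - spectral_radius(M)) < 1e-6
```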
* * * * * * * * * * * * * * * * * * *
Spectral Radius of a Special "Rank 1" Update to M
=================================================
I'm not sure whether any of the properties of M or even M_t (such as
nonnegativity) from the other Question linked above were intended to
carry over to the second part of this Question. Note that those
properties were not required to answer the first part. I will try to
make clear which of those properties I think are useful now.
The second part concerns the spectral radius of a matrix M-V where V has the form:
(1,...,1)' * v
for an arbitrary row vector v. The product of a column and a row
vector is a matrix of rank 1, and the numerical linear algebra term
for M-V is therefore a "rank 1 update" of M.
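Concretely (with a hypothetical v of my own choosing):

```python
import numpy as np

v = np.array([[2.0, -1.0, 0.5]])        # an arbitrary row vector
V = np.ones((3, 1)) @ v                 # (1,...,1)' * v: every row of V equals v

assert V.shape == (3, 3)
assert np.linalg.matrix_rank(V) == 1    # a rank 1 update
```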
As v can be any scalar multiple of an arbitrary vector of length 1,
it is quite likely that, unless the spectral radius of M is already
zero (M nilpotent), there exists a scalar multiple of a randomly
chosen "search direction" that will reduce the spectral radius
somewhat.
So let's focus first on what search directions we might like to avoid.
For a "positive" matrix M_t such as constructed in the other Question,
at least for t > 0, we are guaranteed to have a unique positive
eigenvalue q equal to the spectral radius and of multiplicity 1. Let
u' be a unit length "positive" column vector which is a "right"
eigenvector for this eigenvalue. Then:
(M-V)u' = Mu' - (1,...,1)'*vu'
= qu' - (vu')(1,...,1)'
Now if v were chosen orthogonal to u, then vu' = 0, and we can see
from this calculation that u' remains an eigenvector of M-V with
eigenvalue q; so no such rank 1 update of M would reduce the spectral
radius (although taking the scalar multiple in v sufficiently large
would doubtless increase the spectral radius beyond some point).
If choosing a search direction v orthogonal to u is not helpful, we
should perhaps next consider taking v = tu where t is a scalar
multiple (though this is not the only possibility):
(M-V)u' = Mu' - (1,...,1)'*vu'
= qu' - (t,...,t)'
This is not a completely clear-cut result, because it is no longer
evident or even likely that u' is a right eigenvector of the updated
matrix (M-V). However since u has all entries positive and so does
(t,...,t) for t > 0, it seems almost inescapable that for at least
small values of t, the effect of this update is to reduce the spectral
radius.
Actually, since the maximum modulus eigenvalue for these matrices has
multiplicity 1, we are in a situation in which (again, for small
perturbations of M) the spectral radius will be differentiable with
respect to the parameter t. Thus the only "bad" outcome would be that
this derivative is zero, which seems to me almost untenable.
If we are faced with a general matrix M, there need not be any search
direction which can reduce the spectral radius. We have already
pointed out the case M nilpotent (spectral radius zero) as one such
possibility, but other cases where the maximum modulus eigenvalue has
multiplicity greater than 1 will share this frustration.
My suggestion, for additional research, is to focus on matrices that
are positive and to do some numerical experiments with v = tu or (more
generally) with v not orthogonal to u, to confirm if nothing else my
intuition about the direction of the effect of t on the spectral
radius sketched above.
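A minimal version of the suggested experiment (the matrix M below is my own toy example; u is its Perron eigenvector): for small t > 0, the spectral radius of M - t*(1,...,1)'*u does drop below that of M.

```python
import numpy as np

def spectral_radius(M):
    return max(abs(np.linalg.eigvals(M)))

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])              # a positive matrix, spectral radius 3

# Perron eigenvector: unit eigenvector for the largest eigenvalue, made positive
w, vecs = np.linalg.eig(M)
u = np.real(vecs[:, np.argmax(w.real)])
u = u / np.linalg.norm(u)
if u[0] < 0:
    u = -u

ones = np.ones((M.shape[0], 1))
t = 0.1
rho_t = spectral_radius(M - t * ones @ u.reshape(1, -1))

assert rho_t < spectral_radius(M)       # the update reduces the spectral radius
```

Sweeping t over a range of small values would give a direct look at the derivative of the spectral radius discussed above.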
regards, mathtalk-ga