Q: Spectral Radius (Answered, 0 Comments)
Question  
Subject: Spectral Radius
Category: Science > Math
Asked by: dooogle-ga
List Price: $40.00
Posted: 31 Jul 2005 06:35 PDT
Expires: 30 Aug 2005 06:35 PDT
Question ID: 550005
Dear MathTalk,
Firstly, I refer to a question posted at:
http://answers.google.com/answers/threadview?id=531867

You suggested that you had a simple proof of the continuity of the
spectral radius with respect to t, in particular at 0 (I'm not
interested in differentiability). I assume you show that the
eigenvalues (roots of det(M_t - \lambda I) = 0) are continuous. Would
you mind posting this?

Also, I have a question along similar lines. Let M be a square matrix
of size n. Let V be another square matrix of the same size n, where
all the rows of V are given by the same row vector v of size n. How
would one go about choosing vector v so that one might minimise the
spectral radius of the matrix given by (M - V)?

Many thanks,
Dooogle

Clarification of Question by dooogle-ga on 31 Jul 2005 06:41 PDT
Dear MathTalk,

If minimisation proves tricky, then, at the very least, I'd like a
strategy for ensuring that the spectral radius of (M-V) is less than
the spectral radius of M.

Thanks again,

Dooogle.
Answer  
Subject: Re: Spectral Radius
Answered By: mathtalk-ga on 31 Jul 2005 20:20 PDT
 
Hi, dooogle-ga:


Introduction
============

The continuous dependence of the roots (zeros) of a monic polynomial
on its coefficients is a well-known result, often used without
reference to a proof.  Once this theorem is in hand, the continuity of
the spectral radius of a matrix (in terms of its entries) is easily
shown.

An "elementary" (but perhaps not "simple") proof is given here:

[The roots of a polynomial depend continuously on its coefficients
by Raul Naulin and Carlos Pabst; Revista Colombiana de Matemáticas v.28(1) 1994]
http://www.scm.org.co/revistas.php?modulo=MasInformacion&ver=635

(see links at bottom of page for PDF or DVI formats)


Ostrowski (Acta Math. v.72, 1940) proved a more quantitative result, namely that:

If P(z),Q(z) are (complex) monic polynomials of equal degree n, with
roots x_1,...,x_n and y_1,...,y_n respectively, then (for some
suitable ordering of the y_j's) for all j:

   | x_j - y_j | <= 4n * T * D^(1/n)

where T >= 1 is a bound on |c_k|^(1/k) for each coefficient c_k of
z^(n-k) in P and Q, and where D is the 2-norm of the difference of P &
Q, i.e. the square root of the sum of squares of absolute values of
the differences between corresponding coefficients.
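
As a quick numerical sanity check of this bound (a sketch in Python,
assuming numpy is available; pairing the roots by a simple sort
suffices here because the roots are real and well separated):

    import numpy as np

    # Two nearby monic quadratics: P(z) = z^2 - 3z + 2, Q(z) = z^2 - 3z + 2.1
    P = np.array([1.0, -3.0, 2.0])
    Q = np.array([1.0, -3.0, 2.1])
    n = len(P) - 1

    x = np.sort(np.roots(P))            # roots of P: 1, 2
    y = np.sort(np.roots(Q))            # roots of Q: ~1.113, ~1.887

    D = np.linalg.norm(P - Q)           # 2-norm of coefficient difference
    # T >= 1 bounds |c_k|^(1/k) for each coefficient c_k of z^(n-k)
    T = max(1.0, *[abs(c)**(1.0/k) for poly in (P, Q)
                   for k, c in enumerate(poly[1:], start=1)])

    bound = 4 * n * T * D**(1.0/n)      # Ostrowski's estimate
    print(max(abs(x - y)), "<=", bound) # ~0.113 <= ~7.59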

Away from multiple roots this estimate can be sharpened, as is shown in this paper:


[How the Roots of a Polynomial Vary with its Coefficients: A Local
Quantitative Result
by Bernard Beauzamy; Canad. Math. Bull. v. 42(1) 1999]
http://www.journals.cms.math.ca/cgi-bin/vault/public/view/beauzamy7281/body/PDF/beauzamy7281.pdf?file=beauzamy7281


*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Detailed Sketch of Proof
========================

We give an outline of the proof due to Naulin and Pabst, linked above.

A complex (monic) polynomial of degree n has (by the Fund. Thm. of
Algebra) exactly n roots in the complex field, counted according to
multiplicity:

  P(z) = z^n - c_1*z^(n-1) + ... + (-1)^n c_n

       = (z - r_1)*(z - r_2)*...*(z - r_n)

We would like to define a 1-1 mapping between the n-tuples of
coefficients and the n-tuples of roots, but a mechanical difficulty
arises in that the order of the roots is not a priori specified, at
least not in the obvious sense in which the order of the coefficients
is.

To manage this difficulty, Naulin and Pabst impose the lexicographic
ordering on the complex numbers:

    a + bi < c + di  iff  a < c OR (a = c and b < d)

They then restrict the n-tuples of complex roots so that:

    (r_1,r_2,...,r_n) in C_n  implies  r_1 <= r_2 <= ... <= r_n

Every such n-tuple is the ordered roots of some (complex) monic polynomial.
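
In code, this canonical ordering is just a lexicographic sort on
(real, imaginary) pairs; a minimal Python sketch:

    # Sort complex roots by real part, breaking ties by imaginary part --
    # the ordering imposed on n-tuples in C_n above.
    def lex_sorted(roots):
        return tuple(sorted(roots, key=lambda z: (z.real, z.imag)))

    print(lex_sorted([2 - 1j, 1 + 3j, 1 - 2j]))
    # ((1-2j), (1+3j), (2-1j))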

So we have a one-to-one correspondence between C_n, the possible
ordered n-tuples of roots, and C^n, the complex n-tuples, considered
as "non-leading" coefficients of monic polynomials of degree n.

In fact the mapping from roots to coefficients is pretty explicit,
taking into account the alternating signs on the c_i's we introduced
above:

    c_1 = r_1 + ... + r_n

    c_2 = r_1*r_2 + r_1*r_3 + ... + r_{n-1}*r_n

    ...

    c_n = r_1 * r_2 * ... * r_n

The expressions on the right-hand sides above are the elementary
symmetric polynomials, and collectively this correspondence is known
as the Viète map (after François Viète, 1540-1603, a pioneer in the
theory of algebraic equations).
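
A small numerical illustration of the map and its inverse (Python
with numpy; note that np.poly returns the coefficients of
(z - r_1)*...*(z - r_n) directly, so its k'th entry is (-1)^k c_k in
the sign convention above):

    import numpy as np

    roots = np.array([1.0, 2.0, 3.0])
    coeffs = np.poly(roots)        # [1, -6, 11, -6]: z^3 - 6z^2 + 11z - 6
    # c_1 = 1+2+3 = 6, c_2 = 1*2 + 1*3 + 2*3 = 11, c_3 = 1*2*3 = 6

    recovered = np.sort(np.roots(coeffs))   # back through the inverse map
    print(coeffs, recovered)                # [ 1. -6. 11. -6.] [1. 2. 3.]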

Once they define a metric for C_n:

                                               n
    dist((r_1,...,r_n),(s_1,...,s_n)) = SQRT( SUM |r_i - s_i|^2 )
                                              i=1

it can be seen that C_n is a complete metric space, a property it
inherits as a closed subspace of C^n, where the same metric is the one
induced by the Euclidean norm.  For convenience we will refer to this norm as:

    ||a|| = ||(a_1,...,a_n)||

taking care that our context allows this for "vector" a without ambiguity.

The continuity of the Viète map T:C_n -> C^n given by the above
formulas is then clear.  The remaining task is to show that the inverse
of this bijective mapping, which they call S, is also continuous.

Toward this end they produce a Lemma:

    dist(S((a_1,...,a_n)),(0,...,0)) <= 2n * max{1,||a||}

whose proof, though easy, we shall not repeat here.

They now have all the ingredients needed to show S:C^n -> C_n is
continuous.  For if not, then take (a_1,...,a_n) to be a point of
discontinuity, so that for some delta > 0 and some sequence of
"vector" n-tuples {c(k): k=1 to oo}, c(k) = (c_{k,1},...,c_{k,n}):

     +oo
    LIMIT c(k)  =  a
     k=1

and dist(S(c(k)),S(a)) >= delta  for all k.

The Lemma above establishes that the image sequence {S(c(k)): k=1 to
oo} is bounded, so that if necessary we can choose a subsequence of
the c(k)'s such that the images S(c(k)) converge in the complete
metric space C_n, say:

     +oo
    LIMIT S(c(k))  =  r
     k=1

But the continuity of the Viète map T then implies:

               +oo               +oo                +oo
    T(r) = T( LIMIT S(c(k)) ) = LIMIT T(S(c(k))) = LIMIT c(k) = a
               k=1               k=1                k=1

and hence that r = S(a).

By a standard argument from the convergence of subsequence c(k), this
shows that for k sufficiently large:

    dist(S(c(k)),S(a)) < delta

contradicting the original choice of sequence {c(k)} as never getting
that close, and establishing the theorem (the continuity of S).

*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Continuity of R(r) = max{|r_i|} on C_n
======================================

It may be helpful, both for its own relevance and as an illustration
of how the topology defined on C_n "works", to explicitly prove that
R(r) = max{|r_i|} is a continuous function from C_n to R as we've
defined this metric topology on C_n.

Let r = (r_1,...,r_n) and s = (s_1,...,s_n) be properly ordered
n-tuples in C_n, and say:

    dist(r,s) = ||r - s|| > 0

Now for any component entries r_i and s_i, we have by the triangle inequality:

    |r_i| <= |s_i| + |r_i - s_i|

          <= |s_i| + ||r - s||

          <=  R(s) + dist(r,s)

Thus taking maximum with respect to i, we get:

     R(r) <=  R(s) + dist(r,s)

and  R(s) <=  R(r) + dist(r,s)  by a symmetric argument.

So  |R(r) - R(s)| <= dist(r,s), and it follows that the function R is continuous on C_n.

Note that despite knowing that the entries of r are lexicographically
ordered in C, we lack any information about which component entry r_i
yields the maximum absolute value R(r).  Fortunately as the "norm"
metric takes all components into account, this gap in our knowledge is
no obstacle.
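
This Lipschitz bound is easy to spot-check numerically (a quick
Python sketch, assuming numpy):

    import numpy as np

    def R(r):                      # max modulus over a tuple of roots
        return np.max(np.abs(r))

    rng = np.random.default_rng(0)
    r = rng.normal(size=5) + 1j * rng.normal(size=5)
    s = rng.normal(size=5) + 1j * rng.normal(size=5)
    # |R(r) - R(s)| should never exceed the Euclidean distance ||r - s||
    print(abs(R(r) - R(s)), "<=", np.linalg.norm(r - s))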


*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Continuity of the Spectral Radius of a Matrix
=============================================

Finally we only need to observe that as the coefficients of the
characteristic polynomial are a continuous function of a matrix M's
entries, applying first T to the non-leading coefficients and then R
will produce the spectral radius of M as a continuous function of
those same entries.
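
Concretely, small perturbations of a matrix's entries produce small
changes in its spectral radius; a Python sketch (assuming numpy):

    import numpy as np

    def spectral_radius(M):
        return np.max(np.abs(np.linalg.eigvals(M)))

    rng = np.random.default_rng(1)
    M = rng.random((4, 4))
    for eps in (1e-1, 1e-3, 1e-6):
        E = eps * rng.random((4, 4))    # entrywise perturbation of size eps
        print(eps, abs(spectral_radius(M + E) - spectral_radius(M)))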


*  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *  *

Spectral Radius of a Special "Rank 1" Update to M
=================================================

I'm not clear if any of the properties of M or even M_t (such as
nonnegativity) from the other Question linked above were intended to
carry over to the second part of this Question.  Note that those
properties were not required to Answer the first part.  I will try to
make clear which of those properties I think are useful now.

The second part concerns the spectral radius of a matrix M-V where V has the form:

    (1,...,1)' * v

for an arbitrary row vector v.  The product of a column and a row
vector is a matrix of rank 1, and the numerical linear algebra term
for M-V is therefore a "rank 1 update" of M.
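
For concreteness, here is how such an update might be constructed
(Python with numpy; the row vector v here is just a random
illustration):

    import numpy as np

    n = 4
    rng = np.random.default_rng(2)
    M = rng.random((n, n))
    v = rng.random((1, n))              # an arbitrary row vector
    V = np.ones((n, 1)) @ v             # every row of V equals v
    print(np.linalg.matrix_rank(V))     # 1: M - V is a rank 1 update of M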

As v can be any scalar multiple of an arbitrary vector of length 1,
the probability is quite high that, unless the spectral radius of M is
already zero (M nilpotent), there exists a scalar multiple of a
randomly chosen "search direction" that will reduce the spectral
radius somewhat.

So let's focus first on what search directions we might like to avoid.

For a "positive" matrix M_t such as constructed in the other Question,
at least for t > 0, we are guaranteed to have a unique positive
eigenvalue q equal to the spectral radius and of multiplicity 1.  Let
u' be a unit length "positive" column vector which is a "right"
eigenvector for this eigenvalue.  Then:

    (M-V)u' = Mu' - (1,...,1)'*vu'

            = qu' - (vu')(1,...,1)'

Now if v were chosen orthogonal to u, then vu' = 0 and (M-V)u' = qu',
so q remains an eigenvalue of M-V.  Thus no such rank 1 update of M
would reduce the spectral radius (although taking the scalar multiple
in v sufficiently large would doubtless increase the spectral radius
beyond some point).

If choosing a search direction v orthogonal to u is not helpful, we
should perhaps next consider taking v = tu where t is a scalar
(though this is not the only possibility):

    (M-V)u' = Mu' - (1,...,1)'*vu'

            = qu' - (t,...,t)'

This is not a completely clear-cut result, because it is no longer
evident or even likely that u' is a right eigenvector of the updated
matrix (M-V).  However since u has all entries positive and so does
(t,...,t) for t > 0, it seems almost inescapable that for at least
small values of t, the effect of this update is to reduce the spectral
radius.

Actually, since the maximum modulus eigenvalue for these matrices has
multiplicity 1, we are in a situation in which (again, for small
perturbations of M) the spectral radius will be differentiable with
respect to the parameter t.  Thus the only "bad" outcome would be that
this derivative is zero, which seems to me almost untenable.

If we are faced with a general matrix M, there need not be any search
direction which can reduce the spectral radius.  We have already
pointed out the case M nilpotent (spectral radius zero) as one such
possibility, but other cases where the maximum modulus eigenvalue has
multiplicity greater than 1 will share this frustration.

My suggestion, for additional research, is to focus on matrices that
are positive and to do some numerical experiments with v = tu or (more
generally) with v not orthogonal to u, to confirm if nothing else my
intuition about the direction of the effect of t on the spectral
radius sketched above.
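
A sketch of such an experiment (Python with numpy; u is taken as the
Perron right eigenvector of a random positive matrix, normalized to
unit length and positive entries, and v = t*u for a range of small t):

    import numpy as np

    def spectral_radius(A):
        return np.max(np.abs(np.linalg.eigvals(A)))

    rng = np.random.default_rng(3)
    n = 5
    M = rng.random((n, n)) + 0.1        # a strictly positive matrix

    # Perron right eigenvector for the maximum-modulus eigenvalue q;
    # for a positive matrix it can be taken with all entries positive.
    w, U = np.linalg.eig(M)
    u = np.real(U[:, np.argmax(np.abs(w))])
    u = np.abs(u) / np.linalg.norm(u)

    ones = np.ones((n, 1))
    for t in (0.0, 0.05, 0.1, 0.2):
        V = ones @ (t * u.reshape(1, n))   # rows of V all equal v = t*u
        print(t, spectral_radius(M - V))   # expect a decrease for small t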


regards, mathtalk-ga

Request for Answer Clarification by dooogle-ga on 01 Aug 2005 03:41 PDT
Dear MathTalk,
Many thanks. I've had a first read-through of your answer, and I'm
very thankful indeed. You've been extremely helpful.

Just in case you've any extra insights, I thought I'd try to sketch the
problem area I'm considering.

My concern is finding the unique solution (supposing this exists) of x
= Mx, where M is any square matrix (not necessarily irreducible, and
not necessarily non-negative). Now I'd like to do this in an iterative
fashion: x_{t+1} = Mx_t. Now, unique solution of x = Mx does not
entail that x_{t+1} = Mx_t converges. In a Markov setting, consider
two states A and B. If A -> B and B -> A, then a unique stationary
distribution exists, but we do not necessarily have convergence to
this stationary distribution.
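
A quick numerical sketch of this non-convergence (Python, assuming
numpy):

    import numpy as np

    # Two-state periodic chain: A -> B and B -> A
    M = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    x = np.array([1.0, 0.0])
    for _ in range(4):
        x = M @ x
        print(x)    # oscillates between (0,1) and (1,0); never converges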

The spectral radius of M can give many insights into the solubility
of the equation x = Mx in an iterative fashion, as it can into the
rate of convergence if the equation is soluble. What I'd like to do is
map the equation x = Mx to the equation x = (M-V)x + v with a view to
experimenting with solubility and convergence rates.

Many thanks again.
Dooogle

Request for Answer Clarification by dooogle-ga on 01 Aug 2005 04:20 PDT
Addendum:

In what appears to be the most fruitful line of inquiry, I'm trying to
find v such that the spectral radius of |M - V| is minimised (or at
least reduced) - where |M - V| is the matrix which has as entries the
absolute values of the corresponding entries in (M - V). So, as you
suggested, it seems I ought really to consider non-negative
matrices. ;-)

Request for Answer Clarification by dooogle-ga on 01 Aug 2005 04:22 PDT
Please note: the t I use in the answer clarification is wholly
independent of the t mentioned in the original question.

Request for Answer Clarification by dooogle-ga on 02 Aug 2005 02:00 PDT
Dear MathTalk,

Wondering if you could please elaborate a little on the following?

"This is not a completely clear cut result, because no longer is it
evident or even likely that u' is a right eigenvector of the updated
matrix (M-V).  However since u has all entries positive and so does
(t,...,t) for t > 0, it seems almost inescapable that for at least
small values of t, the effect of this update is to reduce the spectral
radius.

Actually, since the maximum modulus eigenvalue for these matrices has
multiplicity 1, we are in a situation in which (again, for small
perturbations of M) the spectral radius will be differentiable with
respect to the parameter t.  Thus the only "bad" outcome would be that
this derivative is zero, which seems to me almost untenable."

In the first paragraph, I'm not clear on the importance of both u and
(t,...,t) being positive. And, I'm not clear on why this leads to your
inescapable spectral radius reduction remark.

In the second paragraph, I'm not clear on why, with multiplicity 1, we
are in a situation where (for small perturbations of M) the spectral
radius will be differentiable with respect to parameter t.

Thank you,
Dooogle

Clarification of Answer by mathtalk-ga on 02 Aug 2005 07:01 PDT
Hi, Dooogle-ga:

I'll be glad to fill out the details behind those cursory remarks.  I
wasn't too clear when I wrote them how closely tied your interests
were to the Markov chain case.  Now that I know there is still some
connection there, I'll give more complete information; I'll try to
give you something tonight.

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 03 Aug 2005 05:57 PDT
Recall that the Viète map, which effectively gives a polynomial's
coefficients in terms of its roots, is defined by elementary symmetric
polynomials (of the roots).

Indeed, to recall our earlier notation, but to extend the map to all of C^n:

  c_1 =  SUM r_i  FOR 1 <= i <= n

  c_2 =  SUM r_i * r_j  FOR 1 <= i < j <= n

  ...

  c_n =  PRODUCT r_i  FOR 1 <= i <= n

The k'th elementary symmetric polynomial of the roots r_i is the sum
of all products of distinct k-subsets of the roots.

Naturally this extension of T to all of C^n is a "smooth"
(differentiable) function because of this polynomial definition.  The
issue at hand is the "smoothness" of its inverse S.

One way to discuss this is in terms of the inverse function theorem,
which says that the inverse of T will be differentiable at a point
where the Jacobian of T is nonzero, and indeed the "gradient" of the
(functional) inverse of T is the (matrix) inverse of the "gradient" of
T at that point:

[Inverse Function Theorem - Explanation]
http://www.ualberta.ca/dept/math/gauss/fcm/calculus/multvrbl/basic/ImplctFnctns/invrs_fnctn_explntn.htm

The Jacobian of the mapping T (think change of variables in calculus)
is the determinant of the matrix of first derivatives of components of
T (coefficients) with respect to the underlying variables (roots), aka
the "gradient" of T.  The function T has n components and n independent
variables, so this is an n x n matrix (and it makes sense to speak of
its determinant).  The matrix is invertible if and only if its
determinant is nonzero.

For this particular case of T, the Viète map, the Jacobian turns out
to have this value, given roots r_1,...,r_n:

   PRODUCT (r_i - r_j)  FOR 1 <= i < j <= n

which can be shown by a pretty straightforward induction, applying
Gauss elimination to the gradient matrix (details on request).

Clearly this is nonzero if and only if all the roots are distinct.
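
One can verify this symbolically for small n (a sketch using sympy):

    import sympy as sp

    r1, r2, r3 = sp.symbols('r1 r2 r3')
    # The (extended) Viete map for n = 3: elementary symmetric polynomials
    c = sp.Matrix([r1 + r2 + r3,
                   r1*r2 + r1*r3 + r2*r3,
                   r1*r2*r3])
    J = c.jacobian([r1, r2, r3])
    print(sp.factor(J.det()))
    # the product of (r_i - r_j) over i < j, up to ordering of factors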

In other words at a point where _all_ the roots are distinct, _all_
the "trajectories" of the roots as a function of the polynomials
coefficients are smooth (differentiable), as a consequence of the
inverse function theorem.

One can prove more: if a particular root is distinct from all the
others, then its trajectory (as a function of the coefficients) is
smooth, regardless of whether some other root has multiplicity
greater than 1.  It is this sharper result to which I alluded before.

An indication of why this is so can be found in the paper by Beauzamy
(linked above).  If the "inverse" function r_1(c_1,...,c_n), giving
the "first" root as a function of the polynomial's (non-leading)
coefficients, were smooth, then we would expect to have an estimate
for the distance between two values x,y on the "root trajectory" in
terms of the distance D between their respective coefficient arguments
like:

  | x - y | <= C * D

where C is a constant whose absolute value is determined by the
derivative of the function r_1(c_1,...,c_n) with respect to those
arguments.  And this is precisely the kind of improved estimate that
is given in that paper (for the case of one root being "isolated").

I think this discussion may be a bit too abstract to be immediately
helpful, and so the other thing I'd like to do is to work through a
small example.  It may be that even a perturbation of a 2x2 matrix
will provide useful insights.  Also, recall the examples given (under
Request for Clarification) on this thread:

[Google Answers: Spectral Radius of a Matrix]
http://answers.google.com/answers/threadview?id=531867

of some eigenvalue trajectories near points where we have a double eigenvalue.


regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 06 Aug 2005 11:05 PDT
Just a quick note to point out some aspects of this problem that make
it more amenable to numerical experiments than to a purely theoretical
analysis.

We have already discussed the nonlinear dependence (Viète map) of the
coefficients of a characteristic polynomial on its roots (matrix
eigenvalues), and of course we are more interested in the inverse of
this mapping (dependence of roots, esp. the eigenvalue of maximum
modulus, on the coefficients).  The complexity is increased by the
nonlinear and combinatorial way in which the coefficients depend in
turn on the matrix entries (via the determinant).

This is not to say that the numerical experimentation cannot be guided
by theoretical understanding, the foundation of which I tried to lay
above.  If we look at the "pencil" of matrices:

   M - cV  =  M - c*(1,...,1)'(v_1,...,v_n)

with variable scalar c, then the spectral radius is (the absolute
value of) the trajectory of the maximum modulus eigenvalue r(c) as a
continuous function of c.

We have seen that in a "nice" case such as a nonnegative matrix there
will be a unique such eigenvalue, and that away from multiplicity > 1
the trajectory is not just continuous but differentiable with respect
to c.

Finding the maximum modulus eigenvalue, for example by a power method,
is computationally tractable, and we can envision doing a numerical
experiment to confirm that the effect of using (v_1,...,v_n) as the
"left" eigenvector corresponding to that eigenvalue will produce a
reduction in spectral radius.
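
A minimal power-method sketch (Python with numpy; iterating with the
transpose of M yields the "left" eigenvector):

    import numpy as np

    def power_method(A, iters=1000, tol=1e-12):
        # Iterate x <- Ax/||Ax||; converges to the maximum-modulus
        # eigenpair when that eigenvalue is unique (e.g. A positive).
        x = np.ones(A.shape[0])
        lam = 0.0
        for _ in range(iters):
            y = A @ x
            lam, lam_old = np.linalg.norm(y), lam
            x = y / lam
            if abs(lam - lam_old) < tol:
                break
        return lam, x

    rng = np.random.default_rng(4)
    M = rng.random((5, 5)) + 0.1
    q, u = power_method(M.T)    # right eigenvector of M' = left of M
    print(q, np.max(np.abs(np.linalg.eigvals(M))))   # should agree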

Of course while we are perturbing (hopefully decreasing) the maximum
eigenvalue, we will at the same time be perturbing (and possibly
increasing) the other eigenvalues, so we do not expect that an
unlimited decrease in spectral radius is attainable (except in the
special case that M is of rank one and a multiple of V, where of
course we could drop the spectral radius all the way to zero).

So it seems to me that experiments will help, not only to show we are
thinking correctly about the general relationships, but to develop
some hypotheses about how much work is needed to obtain a given
reduction, and of course suggest further experiments with more
sophisticated strategies.

If you have in mind a particular matrix of modest dimensions, e.g. 10
by 10, that would be good to play with, I'd be happy to do some
calculations.


regards, mathtalk-ga
Comments  
There are no comments at this time.
