Google Answers Logo
View Question
Q: How to solve a Markov Chain by using MATLAB ( Answered 5 out of 5 stars,   0 Comments )
Subject: How to solve a Markov Chain by using MATLAB
Category: Science > Math
Asked by: shiarung-ga
List Price: $20.00
Posted: 16 Aug 2003 03:18 PDT
Expires: 15 Sep 2003 03:18 PDT
Question ID: 245302
How can I solve a Markov Chain (as follows) by using MATLAB ? 
Please provide codes (ignore the part of input data to P) and 
make sure the codes work.

Slove following equations derived from a Markov Chain: 

 xP = x 
 x1 + x2 + x3 ... + xn = 1

x = [x1 x2 ... xn] <=( x1, x2, ..., xn are variables)
P = p11 p12 ... p1n 
    p21 p22 ... p2n 
    p31 p32 ... p3n 
     .   .   .   .  
    pn1 pn2 ... pnn 
pij (1<=i<=n, 1<=j<=n) is a constant at i-row j-column
Subject: Re: How to solve a Markov Chain by using MATLAB
Answered By: mathtalk-ga on 16 Aug 2003 10:01 PDT
Rated:5 out of 5 stars
Hi, shiarung-ga:

The solution can be obtained (after the necessary data entry is done)
with a single Matlab command.  For anyone interested in following
along, the Mathworks documentation for Matlab commands/functions can
be found here:

[MATLAB Documentation]

In mathematical terms we want to solve a system of the form:

(P - I) x = 0

subject to the constraint:

x1 + x2 + ... + xn = 1

This last simply adds an additional row to our "matrix equation", so
an appropriate Matlab expression can be constructed using the "\"
operator to request the solution of a matrix.

Let's assume that P is a square matrix and already has the
coefficients (transition probabilities) for the given Markov chain. 
Then we can run:

> [P - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1); 1]

which will solve the system of equations above and provide the
components of x as a "column" result.

A brief explanation of this command is as follows:

eye(size(P)) is an identity matrix of the same "shape" as P.  After
subtracting off those diagonal entries from P, we have (within the
limits of numerical rounding) a singular matrix P - I.

The "\" operator solves a linear system.  For example:

> A \ b

would ask for the solution x of a matrix equation Ax = b.  The
solution is obtained without "inverting" the matrix A, so it can
applied (as we did above) to cases in which the matrix is not

Here the matrix equation we are solving can be described in block form

              [ x_1 ]     [ 0 ]
[  (P - I)  ] [ x_2 ]  =  [ 0 ]
[ 1 1 ... 1 ] [ ... ]     [...]
              [ x_n ]     [ 0 ]
                          [ 1 ]

and the "matrix construction" expression given above reproduces this
faithfully with the use of utility functions zeros(j,k) and ones(j,k)
for submatrices of j rows and k columns that contain all zero or all
one entries, respectively.

If rounding errors are a problem, then a slightly fancier matrix
formulation can be tried.  Add a "slack variable" y to the "solution"
vector x, and consider:

[  (P - I) |  U' ] [ x ] = [ 0 ]
[     U    |  0  ] [ y ]   [ 1 ]

where U = [1,1,...,1] is a row of all ones, the same length as rows of
P, and U' is written for transpose of U.  We would construct the
"bordered" matrix shown here as follows:

> Q = [P - eye(size(P)), ones(rows(P),1); ones(1,columns(P)), 0]

Solve the new system like this:

> Q \ [zeros(rows(P),1); 1]

Ideally the extra value y turns out to be exactly zero, but in
practice for larger systems the value y will only be "nearly zero" and
gives a sense of the influence of rounding errors on the solution x.

Readers who would like to duplicate these steps but don't have the
funds with which to purchase Matlab may be interested in Octave, a
similar computing environment which is freely available under the GNU
general public license:

[GNU Octave]

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 16 Aug 2003 10:11 PDT
I should point out that, as you did in formulating the original Markov
chain problem, it is customary to put the unknowns "x" into the matrix

xP = x

as a row, so that consequently the matrix P as well as the solution x
described in your question are actually the transposes of the
corresponding expressions in my answer.

This is not difficult to "fix" because Matlab provides the transpose
of a matrix by applying the apostrophe operator ' (just as I did for
notational convenience in my answer).

I've noticed that whether the transition probability matrix P is
placed on the right (as you did) or (in transposed form) on the left
(as I did) varies from author to author.  I've never been able to
figure out if this is an engineering versus economist issue, or
perhaps American versus European.  Nonetheless, while the solution of
linear systems Ax = b consistently "expects" the coefficient matrix on
the left, dealing with Markov matrices requires one to be attentive to
this detail.

regards, mathtalk-ga

Clarification of Answer by mathtalk-ga on 16 Aug 2003 18:03 PDT
Since I broached the confusing subject of distinguishing left from
right (with regard to the matrix of transition probabilities), perhaps
I should divulge my own criteria for keeping the distinction clear.

Let's start with the xP = x equation, given in the original question,
as this "right-handed" version actually makes for the most natural
definition of P in terms of "transition probabilities.

Here x represents (apropos of modelling with a Markov chain) a
distribution of "state" among the components of x, which represent
probabilities that sum to 1.  In other words, if there were n possible
states with some specified order, then x should list the respective
probability for being in each particular state.

Note that Markov chains deal with "random" behavior, so that even if a
system is known to be initialized in one particular state, the state
is typically not known to be in any single state after any step of the
Markov process.  The Markov chain models this "nondeterminism" with
the state vector x.

The equation xP = x describes an equilibrium state, in which the
transitions into and out of each component exactly balance.   There
could be more than one equilibrium state, but it can be proven there
must be at least one.  Recall that for the purpose of the Markov chain
interpretation, the components of x have to be probabilities that sum
to 1, so in particular each x_k is between 0 and 1.

With this interpretation of the row vector x in mind, we can define
the entries of P like this:

P_{i,j} [read: P subscript i comma j] is the probability of
transitioning from state i into state j.

Note that if the system is in state i, it will transition into _some_
state.  That is the sum of all transition probabilities _from_ state i
must add up to 1:

1 = SUM  P_{i,j} FOR all j 

In other words the matrix P, as defined above, has the property that
in any row i, the sum of all the entries of that row is 1.  More
concisely, each row sum of P is 1.

Now P is not symmetric in general, because there is no a priori reason
that the chance of going from state i to state j should equal the
probability of going from state j to state i.  Consequently we cannot
say much about the column sums of P.  They must be nonnegative since
all entries of P are nonnegative, but the column sums need not be
equal to 1 individually.  However the sum of all row sums is equal to
the sum of all column sums, so "on average" the column sums are 1.

The property that P has of all row sums being 1 is precisely what
makes the sequence of vectors:

v = uP

w = vP

. . .

preserve a property of vectors u,v,w that their components will always
sum to 1.  To show this with a bit of matrix algebra, notice that if
we let:

U = [1,1,...,1]

be a row vector of appropriate length having all entries 1, then the
sum of the entries in vector u is simply the matrix product:

u U'

or (if preferred) the dot product between u and U.  So if we knew u U'
= 1 to begin with, then:

v U' = (uP)U' = u(P U')

Now use the fact that each row sum of P is 1.  That is:

P U' = U'

since each row of P times the single column U' gives back an entry 1. 

v U' = u(P U') = u U' = 1

and so on for w and any further "steps" in the Markov process by

Let's take notice of the fact that P U' = U' means that P has an
eigenvector U' corresponding to the eigenvalue 1.  One might even
press the distinction by saying U' is a "right" eigenvector of P,
since here U' appears multiplying P on the right.  What falls out of
the general machinery of eigenvalues, through the "characteristic
polynomial", is that P and its transpose P' have equal eigenvalues
(though not necessarily any common eigenvectors, of course).  Thus, as
P has a "right" eigenvector U' corresponding to eigenvalue 1, it must
also have a "left" eigenvector x corresponding to eigenvalue 1:

xP = x  <==> P' x' = x'

Note that the discussion has now come full circle to the original
"equilibrium" solution x.

The main point to retain from this discussion is that where P as given
in your original question would have row sums equal to 1, the
transpose P' (which is the coefficient matrix used in my Matlab code)
will instead have column sums that are equal to 1.  This distinction
will ordinarily suffice to keep straight the transition matrix

regards, mathtalk-ga

Request for Answer Clarification by shiarung-ga on 19 Aug 2003 03:24 PDT
Hi mathtalk-ga,

In my question, the sum of each row in transition probability matrix is one.
So what I have to do is to use following command to get the anwsers

>[P' - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1);1]

where, P is a N-by-N matrix. Am I right ?

The command "columns" should be replaced by the command "cols". Do I miss
anything ?

Clarification of Answer by mathtalk-ga on 19 Aug 2003 09:06 PDT
Hi, shiarung-ga:

Yes, you are correct.  Since P is a square matrix, the only change
required is to use P' in place P where first mentioned:

> [P' - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1);1]

As far as I can see from the current Matlab documentation on the Web,
neither "columns", "cols", or even "rows" is a supported function. 
Perhaps out of a desire to support multidimensional arrays, the
original emphasis on two-dimensional matrices is being deliberately

So for the sake of future compatibility, we should do this instead:

> [P' - eye(size(P)); ones(1,length(P))] \ [zeros(length(P),1);1]

The length function is equivalent to finding the maximum of all the
values returned by the size function, for a multidimensional array. 
When P is N by N, of course both "row" and "column" counts are N.  Use
the search feature from the Matlab documentation link supplied above;
due to Mathworks' use of frames it's difficult to provide a direct
link to their pages for "length" and "size".

The "fully bordered" approach, which I wrote as two lines merely for
the sake of clarity, then becomes:

> Q = [P' - eye(size(P)), ones(length(P),1); ones(1,length(P)), 0] 
> Q \ [zeros(length(P),1); 1] 
regards, mathtalk-ga
shiarung-ga rated this answer:5 out of 5 stars

There are no comments at this time.

Important Disclaimer: Answers and comments provided on Google Answers are general information, and are not intended to substitute for informed professional medical, psychiatric, psychological, tax, legal, investment, accounting, or other professional advice. Google does not endorse, and expressly disclaims liability for any product, manufacturer, distributor, service or service provider mentioned or any opinion expressed in answers or comments. Please read carefully the Google Answers Terms of Service.

If you feel that you have found inappropriate content, please let us know by emailing us at with the question ID listed above. Thank you.
Search Google Answers for
Google Answers  

Google Home - Answers FAQ - Terms of Service - Privacy Policy