View Question
Q: How to solve a Markov Chain by using MATLAB ( Answered ,   0 Comments )
 Question
 Subject: How to solve a Markov Chain by using MATLAB Category: Science > Math Asked by: shiarung-ga List Price: \$20.00 Posted: 16 Aug 2003 03:18 PDT Expires: 15 Sep 2003 03:18 PDT Question ID: 245302
 ```How can I solve a Markov Chain (as follows) by using MATLAB ? Please provide codes (ignore the part of input data to P) and make sure the codes work. Slove following equations derived from a Markov Chain: xP = x x1 + x2 + x3 ... + xn = 1 where, x = [x1 x2 ... xn] <=( x1, x2, ..., xn are variables) P = p11 p12 ... p1n p21 p22 ... p2n p31 p32 ... p3n . . . . pn1 pn2 ... pnn pij (1<=i<=n, 1<=j<=n) is a constant at i-row j-column```
 ```Hi, shiarung-ga: The solution can be obtained (after the necessary data entry is done) with a single Matlab command. For anyone interested in following along, the Mathworks documentation for Matlab commands/functions can be found here: [MATLAB Documentation] http://www.mathworks.com/access/helpdesk/help/techdoc/matlab.shtml In mathematical terms we want to solve a system of the form: (P - I) x = 0 subject to the constraint: x1 + x2 + ... + xn = 1 This last simply adds an additional row to our "matrix equation", so an appropriate Matlab expression can be constructed using the "\" operator to request the solution of a matrix. Let's assume that P is a square matrix and already has the coefficients (transition probabilities) for the given Markov chain. Then we can run: > [P - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1); 1] which will solve the system of equations above and provide the components of x as a "column" result. A brief explanation of this command is as follows: eye(size(P)) is an identity matrix of the same "shape" as P. After subtracting off those diagonal entries from P, we have (within the limits of numerical rounding) a singular matrix P - I. The "\" operator solves a linear system. For example: > A \ b would ask for the solution x of a matrix equation Ax = b. The solution is obtained without "inverting" the matrix A, so it can applied (as we did above) to cases in which the matrix is not invertible. Here the matrix equation we are solving can be described in block form as: [ x_1 ] [ 0 ] [ (P - I) ] [ x_2 ] = [ 0 ] [ 1 1 ... 1 ] [ ... ] [...] [ x_n ] [ 0 ] [ 1 ] and the "matrix construction" expression given above reproduces this faithfully with the use of utility functions zeros(j,k) and ones(j,k) for submatrices of j rows and k columns that contain all zero or all one entries, respectively. If rounding errors are a problem, then a slightly fancier matrix formulation can be tried. Add a "slack variable" y to the "solution" vector x, and consider: [ (P - I) | U' ] [ x ] = [ 0 ] [ U | 0 ] [ y ] [ 1 ] where U = [1,1,...,1] is a row of all ones, the same length as rows of P, and U' is written for transpose of U. We would construct the "bordered" matrix shown here as follows: > Q = [P - eye(size(P)), ones(rows(P),1); ones(1,columns(P)), 0] Solve the new system like this: > Q \ [zeros(rows(P),1); 1] Ideally the extra value y turns out to be exactly zero, but in practice for larger systems the value y will only be "nearly zero" and gives a sense of the influence of rounding errors on the solution x. Readers who would like to duplicate these steps but don't have the funds with which to purchase Matlab may be interested in Octave, a similar computing environment which is freely available under the GNU general public license: [GNU Octave] http://www.octave.org/ regards, mathtalk-ga``` Clarification of Answer by mathtalk-ga on 16 Aug 2003 10:11 PDT ```I should point out that, as you did in formulating the original Markov chain problem, it is customary to put the unknowns "x" into the matrix equation: xP = x as a row, so that consequently the matrix P as well as the solution x described in your question are actually the transposes of the corresponding expressions in my answer. This is not difficult to "fix" because Matlab provides the transpose of a matrix by applying the apostrophe operator ' (just as I did for notational convenience in my answer). I've noticed that whether the transition probability matrix P is placed on the right (as you did) or (in transposed form) on the left (as I did) varies from author to author. I've never been able to figure out if this is an engineering versus economist issue, or perhaps American versus European. Nonetheless, while the solution of linear systems Ax = b consistently "expects" the coefficient matrix on the left, dealing with Markov matrices requires one to be attentive to this detail. regards, mathtalk-ga``` Clarification of Answer by mathtalk-ga on 16 Aug 2003 18:03 PDT ```Since I broached the confusing subject of distinguishing left from right (with regard to the matrix of transition probabilities), perhaps I should divulge my own criteria for keeping the distinction clear. Let's start with the xP = x equation, given in the original question, as this "right-handed" version actually makes for the most natural definition of P in terms of "transition probabilities. Here x represents (apropos of modelling with a Markov chain) a distribution of "state" among the components of x, which represent probabilities that sum to 1. In other words, if there were n possible states with some specified order, then x should list the respective probability for being in each particular state. Note that Markov chains deal with "random" behavior, so that even if a system is known to be initialized in one particular state, the state is typically not known to be in any single state after any step of the Markov process. The Markov chain models this "nondeterminism" with the state vector x. The equation xP = x describes an equilibrium state, in which the transitions into and out of each component exactly balance. There could be more than one equilibrium state, but it can be proven there must be at least one. Recall that for the purpose of the Markov chain interpretation, the components of x have to be probabilities that sum to 1, so in particular each x_k is between 0 and 1. With this interpretation of the row vector x in mind, we can define the entries of P like this: P_{i,j} [read: P subscript i comma j] is the probability of transitioning from state i into state j. Note that if the system is in state i, it will transition into _some_ state. That is the sum of all transition probabilities _from_ state i must add up to 1: 1 = SUM P_{i,j} FOR all j In other words the matrix P, as defined above, has the property that in any row i, the sum of all the entries of that row is 1. More concisely, each row sum of P is 1. Now P is not symmetric in general, because there is no a priori reason that the chance of going from state i to state j should equal the probability of going from state j to state i. Consequently we cannot say much about the column sums of P. They must be nonnegative since all entries of P are nonnegative, but the column sums need not be equal to 1 individually. However the sum of all row sums is equal to the sum of all column sums, so "on average" the column sums are 1. The property that P has of all row sums being 1 is precisely what makes the sequence of vectors: v = uP w = vP . . . preserve a property of vectors u,v,w that their components will always sum to 1. To show this with a bit of matrix algebra, notice that if we let: U = [1,1,...,1] be a row vector of appropriate length having all entries 1, then the sum of the entries in vector u is simply the matrix product: u U' or (if preferred) the dot product between u and U. So if we knew u U' = 1 to begin with, then: v U' = (uP)U' = u(P U') Now use the fact that each row sum of P is 1. That is: P U' = U' since each row of P times the single column U' gives back an entry 1. Thus: v U' = u(P U') = u U' = 1 and so on for w and any further "steps" in the Markov process by induction. Let's take notice of the fact that P U' = U' means that P has an eigenvector U' corresponding to the eigenvalue 1. One might even press the distinction by saying U' is a "right" eigenvector of P, since here U' appears multiplying P on the right. What falls out of the general machinery of eigenvalues, through the "characteristic polynomial", is that P and its transpose P' have equal eigenvalues (though not necessarily any common eigenvectors, of course). Thus, as P has a "right" eigenvector U' corresponding to eigenvalue 1, it must also have a "left" eigenvector x corresponding to eigenvalue 1: xP = x <==> P' x' = x' Note that the discussion has now come full circle to the original "equilibrium" solution x. The main point to retain from this discussion is that where P as given in your original question would have row sums equal to 1, the transpose P' (which is the coefficient matrix used in my Matlab code) will instead have column sums that are equal to 1. This distinction will ordinarily suffice to keep straight the transition matrix "handedness". regards, mathtalk-ga``` Request for Answer Clarification by shiarung-ga on 19 Aug 2003 03:24 PDT ```Hi mathtalk-ga, (1) In my question, the sum of each row in transition probability matrix is one. So what I have to do is to use following command to get the anwsers >[P' - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1);1] where, P is a N-by-N matrix. Am I right ? (2) The command "columns" should be replaced by the command "cols". Do I miss anything ?``` Clarification of Answer by mathtalk-ga on 19 Aug 2003 09:06 PDT ```Hi, shiarung-ga: Yes, you are correct. Since P is a square matrix, the only change required is to use P' in place P where first mentioned: > [P' - eye(size(P)); ones(1,columns(P))] \ [zeros(rows(P),1);1] As far as I can see from the current Matlab documentation on the Web, neither "columns", "cols", or even "rows" is a supported function. Perhaps out of a desire to support multidimensional arrays, the original emphasis on two-dimensional matrices is being deliberately removed. So for the sake of future compatibility, we should do this instead: > [P' - eye(size(P)); ones(1,length(P))] \ [zeros(length(P),1);1] The length function is equivalent to finding the maximum of all the values returned by the size function, for a multidimensional array. When P is N by N, of course both "row" and "column" counts are N. Use the search feature from the Matlab documentation link supplied above; due to Mathworks' use of frames it's difficult to provide a direct link to their pages for "length" and "size". The "fully bordered" approach, which I wrote as two lines merely for the sake of clarity, then becomes: > Q = [P' - eye(size(P)), ones(length(P),1); ones(1,length(P)), 0] > Q \ [zeros(length(P),1); 1] regards, mathtalk-ga```