Absorbing Markov Chains

Lucas Mentch

 


Often, mathematical models can be used as a main tool for making informed decisions. Markov chains, specifically absorbing Markov chains, are often useful in constructing a mathematical model of a situation involving experiments with multiple outcomes, where the outcome of a given trial depends only on the outcome of the previous trial. Consider the following example involving typical human behavior.

 

Suppose we are at an amusement park. After paying top dollar to get into the park, we are ready to get on some rides. Suppose that in this park, there are only three rides: a roller coaster, a water ride, and a boring ride. The line for the roller coaster is always very long, the line for the water ride is always moderately long, and the line for the boring ride is, predictably, always short. Since we don't want to wait in line all day, we are unlikely to ride the roller coaster several times in a row, but we still want to have fun, so we can't ride the boring ride all day. The wait for the ride we are currently on will affect which ride we want to get on next. That is, given that we are on a certain ride, the next ride we get on depends only on the ride we are currently riding and not on any previously ridden rides.

 

When the outcome of a given experiment can affect the outcome of the next experiment, the process can be referred to as a Markov chain, or Markov process [1]. In the amusement park example above, the ride we are currently on can affect the ride that we will get on next, and thus we can regard this process as a Markov chain. It is important to realize that in a Markov process, the outcome of the next experiment is not affected by any outcome other than the outcome of the current experiment.

 

In a Markov process, we consider a set of states and the ways in which we can move from one state to another. To begin the process of moving from one state to another - the Markov process - we begin in some initial state and continue the process by moving from one state to another. Each move from one state to another is called a step. Given some set of states $S = \{s_1, s_2, \ldots, s_n\}$, if we are currently in some state $s_i$, the probability that our next step will be to another state $s_j$ is denoted $p_{ij}$. These probabilities are called transition probabilities. Since, in a Markov process, the outcome of the next experiment is not affected by any outcome other than the outcome of the current experiment, the probability of moving from one state to another depends only on the current state, not on any previous state of the process. Observe that the process can remain in the current state with probability $p_{ii}$. To start the Markov process, in addition to specifying a particular starting or initial state, we also need to specify an initial probability distribution. We can collect the transition probabilities $p_{ij}$ in a square matrix called the transition matrix.

 

Let's return to our amusement park example. Recall that there are three rides in the park: one roller coaster, one water ride, and one boring ride. We can treat each ride as a state, and thus our set of states (rides) is $S = \{R, W, B\}$, where R denotes the roller coaster, W denotes the water ride, and B denotes the boring ride. If we are on a given ride, then our next ride will always be one of the three rides at the park. Suppose that if we are on the roller coaster, then our next ride will be the roller coaster with probability 0.1, the water ride with probability 0.4, or the boring ride with probability 0.5. Notice that since we are always going to get on another ride, and the roller coaster, water ride, and boring ride are the only rides in the park, these probabilities must add up to 1. Now suppose that if we are on the water ride, then our next ride will be the roller coaster with probability 0.2, the water ride with probability 0.6, or the boring ride with probability 0.2. Finally, if we are on the boring ride, then our next ride will be the roller coaster with probability 0.35, the water ride with probability 0.35, or the boring ride again with probability 0.3. Our transition matrix is then

 

$$
P \;=\;
\bordermatrix{ & R & W & B \cr
R & 0.10 & 0.40 & 0.50 \cr
W & 0.20 & 0.60 & 0.20 \cr
B & 0.35 & 0.35 & 0.30 \cr}
$$

 

This transition matrix makes it easy to see the probability of moving from one state (ride) to another. Letting the row of the matrix represent the ride we are currently on and the column of the matrix represent the ride we could be on next, we see, for example, that the probability of moving from the water ride to the boring ride is 0.2. That is, $p_{WB} = 0.2$.
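For readers who like to follow along computationally, here is a minimal sketch of this bookkeeping in Python with NumPy. The state ordering (R, W, B) and the probabilities are taken from the example above; the variable names are simply illustrative choices.

```python
import numpy as np

rides = ["R", "W", "B"]            # roller coaster, water ride, boring ride
P = np.array([
    [0.10, 0.40, 0.50],            # currently on the roller coaster
    [0.20, 0.60, 0.20],            # currently on the water ride
    [0.35, 0.35, 0.30],            # currently on the boring ride
])

# Each row is a probability distribution over the next ride, so each row sums to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# Probability of moving from the water ride (row W) to the boring ride (column B):
print(P[rides.index("W"), rides.index("B")])   # 0.2
```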

 

With this transition matrix at hand, we can now ask questions like "Given that we are on the roller coaster, what is the probability that two rides from now we will be on the boring ride?" We denote this probability of being on the boring ride two rides from now, given that we are currently on the roller coaster, by $p^{(2)}_{RB}$. In order to determine $p^{(2)}_{RB}$, we need to look at all possible ways to move from the roller coaster to the boring ride in exactly two steps.

 

There are three possible ways for this to happen: (1) we ride the roller coaster twice and then ride the boring ride, (2) we go from the roller coaster to the water ride to the boring ride, or (3) we go from the roller coaster to the boring ride twice. For case (1), we have $p_{RR}\,p_{RB} = (0.1)(0.5) = 0.05$; for case (2), we have $p_{RW}\,p_{WB} = (0.4)(0.2) = 0.08$; and for case (3), we have $p_{RB}\,p_{BB} = (0.5)(0.3) = 0.15$. Thus, our total probability, the sum of these three probabilities, is

$$p^{(2)}_{RB} = 0.05 + 0.08 + 0.15 = 0.28.$$

In general, if a given Markov process has $n$ states, then the probability of moving from state $s_i$ to state $s_j$ in two steps is given by [1]

$$p^{(2)}_{ij} = \sum_{k=1}^{n} p_{ik}\,p_{kj},$$

which is exactly the $ij$-th entry of the matrix $P^2$.

Thus, for a general Markov process, we have an efficient way to calculate the probability of moving from one state to another in two steps. The obvious next question is how to calculate the probability of moving from one state to another in exactly $r$ steps, where $r$ is any positive integer. Probability arguments and induction lead us to the following result [1]:
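As a quick cross-check, a small sketch (assuming the matrix P defined in the earlier snippet, with rows and columns ordered R, W, B) computes $p^{(2)}_{RB}$ both by summing over the intermediate ride and by reading off the (R, B) entry of $P^2$; both give 0.28.

```python
import numpy as np

P = np.array([[0.10, 0.40, 0.50],
              [0.20, 0.60, 0.20],
              [0.35, 0.35, 0.30]])   # rows/columns ordered R, W, B

R, W, B = 0, 1, 2

# Explicit sum over every possible intermediate ride k: R -> k -> B.
explicit = sum(P[R, k] * P[k, B] for k in range(3))

# The same quantity, read off as the (R, B) entry of P^2.
via_matrix = (P @ P)[R, B]

print(explicit, via_matrix)          # both 0.28 (up to floating-point rounding)
```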

 

(Theorem 1 (Grinstead and Snell, Theorem 11.1)) Let $P$ be the transition matrix of a Markov chain. Then the $ij$-th entry $p^{(r)}_{ij}$ of the matrix $P^r$ gives the probability of moving from state $s_i$ to state $s_j$ in $r$ steps [1].

 

Suppose in our amusement park example that we are on the water ride and we want to know the probability that 5 rides from now we will be on the roller coaster. Using the convention from the transition matrix $P$ above, we want $p^{(5)}_{WR}$, the (2,1) entry of the transition matrix $P$ raised to the fifth power. Thus, we have

$$
P^5 =
\begin{pmatrix}
0.10 & 0.40 & 0.50 \\
0.20 & 0.60 & 0.20 \\
0.35 & 0.35 & 0.30
\end{pmatrix}^{5}
\approx
\begin{pmatrix}
0.222 & 0.481 & 0.297 \\
0.222 & 0.482 & 0.296 \\
0.223 & 0.481 & 0.296
\end{pmatrix}.
$$

Given this transition matrix $P$ raised to the fifth power, $P^5$, we can look at the (2,1) entry of $P^5$ and say that the probability of being on the roller coaster 5 rides after being on the water ride is 0.222, or 22.2%. Notice that the probability of being on the water ride after 5 rides is 0.482 and the probability of being on the boring ride after 5 rides is 0.296. Thus, given that we are currently on the water ride, we are least likely to be on the roller coaster 5 rides from now.
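The same idea extends to any number of steps; the sketch below (again assuming the matrix P from the earlier snippets) uses NumPy's matrix_power to form $P^5$ and read off its second row.

```python
import numpy as np

P = np.array([[0.10, 0.40, 0.50],
              [0.20, 0.60, 0.20],
              [0.35, 0.35, 0.30]])   # rows/columns ordered R, W, B

P5 = np.linalg.matrix_power(P, 5)
print(P5.round(3))

# Second row: probabilities five rides after the water ride.
# Roughly 0.222 (roller coaster), 0.482 (water ride), 0.296 (boring ride).
print(P5[1].round(3))
```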

 

Absorbing Markov Chains

 

Now consider the following modification to the problem. Suppose that all of the changing stations at the park are closed and we are not allowed to get on any other ride while wet. Thus, once we ride the water ride, we will have to continue riding the water ride until the end of the day. In other words, once we get on the water ride, our process of determining which ride to get on next comes to an end, and we are forced to stay on the water ride. In the language of Markov processes and our transition matrix $P$, we can say that $p_{WW} = 1$ while $p_{WR} = p_{WB} = 0$. In general, when we have a state $s_i$ for which $p_{ii} = 1$, we call $s_i$ an absorbing state, since we cannot leave that state after reaching it. States that are not absorbing are called transient states. We then call a Markov process with at least one absorbing state an absorbing Markov chain if it is possible to reach an absorbing state from every state in some finite number of steps. Note that since it is possible to reach an absorbing state from any other state, the probability of eventually reaching an absorbing state is 1. That is, given any absorbing Markov chain, we will eventually reach an absorbing state [1]. The question we can now ask is how many steps we can expect to take before reaching an absorbing state. That is, we would like to know, on average, how long it will take for the process to be absorbed [1]. In order to answer this question, we first need to restructure our transition matrix $P$.

 

We begin by reordering the states in our transition matrix so that the transient states come first. If we have $r$ absorbing states and $t$ transient states, then our transition matrix $P$ will have the following form:

 

$$
P \;=\;
\bordermatrix{ & \text{Tr.} & \text{Abs.} \cr
\text{Tr.}  & Q & R \cr
\text{Abs.} & \mathbf{0} & I \cr}
$$

 

where $I$ is the $r \times r$ identity matrix, $\mathbf{0}$ is the $r \times t$ zero matrix, $R$ is a non-zero $t \times r$ matrix, and $Q$ is a $t \times t$ matrix [1]. This matrix is now said to be in canonical form [1]. We now need to make use of the following theorem:

 

(Theorem 2 (Grinstead and Snell, Theorem 11.4)) Given an absorbing Markov chain with transition matrix in the canonical form described above, the matrix $I - Q$ has an inverse $N$, with $N = I + Q + Q^2 + \cdots$. The $ij$-entry of $N$ is the expected number of times the chain is in state $s_j$, given that the process started in state $s_i$.

 

Since the probability of reaching an absorbing state tends to one, we will always eventually reach an absorbing state, and the above theorem guarantees that we can compute $N$, the inverse of the matrix $I - Q$. A final theorem will allow us to calculate, for each potential starting state, the expected number of steps taken before reaching an absorbing state. We will always assume that the Markov process starts in a transient state.

 

(Theorem 3 (Grinstead and Snell, Theorem 11.5)) Let $t_i$ be the expected number of steps before the chain is absorbed, given that the chain starts in the (transient) state $s_i$, and let $t$ be the column vector whose $i$-th entry is $t_i$. Then

$$t = Nc,$$

where $c$ is a column vector all of whose entries are 1.

 

Returning once again to our amusement park example, recall that once we get on the water ride, we must stay on the water ride, so W is an absorbing state and R and B are transient states. Since we know we will have to stay on the water ride once we get on it, let's change our riding plan: if we are on the roller coaster, we will ride the roller coaster again with probability 0.3, the boring ride next with probability 0.6, and the water ride next with probability 0.1. If we are on the boring ride, we will ride the roller coaster next with probability 0.45, the boring ride again with probability 0.5, and the water ride next with probability 0.05. Our transition matrix in canonical form is then

 

$$
P \;=\;
\bordermatrix{ & R & B & W \cr
R & 0.30 & 0.60 & 0.10 \cr
B & 0.45 & 0.50 & 0.05 \cr
W & 0 & 0 & 1 \cr}
$$

 

and we can see that

$$
Q =
\begin{pmatrix}
0.30 & 0.60 \\
0.45 & 0.50
\end{pmatrix},
\qquad
I - Q =
\begin{pmatrix}
0.70 & -0.60 \\
-0.45 & 0.50
\end{pmatrix},
$$

and thus we have

$$
N = (I - Q)^{-1} =
\begin{pmatrix}
6.25 & 7.5 \\
5.625 & 8.75
\end{pmatrix},
$$

and we know from Theorem 3 that $t = Nc$, where $c$ is a column vector all of whose entries are 1. Thus, we have

$$
t = Nc =
\begin{pmatrix}
6.25 & 7.5 \\
5.625 & 8.75
\end{pmatrix}
\begin{pmatrix}
1 \\
1
\end{pmatrix}
=
\begin{pmatrix}
13.75 \\
14.375
\end{pmatrix}.
$$

Thus, if we first ride the roller coaster, we will get to ride an average of 13.75 rides before riding the water ride, and if we ride the boring ride first, we will get to ride an average of 14.375 rides before riding the water ride. Starting the day on the boring ride therefore yields the highest expected number of rides before reaching the water ride.
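The entire calculation can be reproduced in a few lines of NumPy, and a short simulation provides an independent check on the expected ride counts. This is only a sketch under the ordering assumed above (transient states R and B first, absorbing state W last); the simulated averages will fluctuate around 13.75 and 14.375.

```python
import numpy as np

# Canonical-form transition matrix: transient states R, B first, absorbing state W last.
P = np.array([[0.30, 0.60, 0.10],
              [0.45, 0.50, 0.05],
              [0.00, 0.00, 1.00]])
t = 2                                    # number of transient states

Q = P[:t, :t]                            # transient-to-transient block
N = np.linalg.inv(np.eye(t) - Q)         # N = (I - Q)^(-1), as in Theorem 2
expected_rides = N @ np.ones(t)          # t = Nc, as in Theorem 3
print(N)                                 # [[6.25, 7.5], [5.625, 8.75]]
print(expected_rides)                    # [13.75, 14.375]

# Monte Carlo cross-check: simulate rides until the chain hits the water ride (state 2).
rng = np.random.default_rng(0)

def average_rides_until_water(start, n_trials=20_000):
    total = 0
    for _ in range(n_trials):
        state, count = start, 0
        while state != 2:                # state 2 (W) is absorbing
            state = rng.choice(3, p=P[state])
            count += 1
        total += count
    return total / n_trials

print(average_rides_until_water(0))      # roughly 13.75 (start on the roller coaster)
print(average_rides_until_water(1))      # roughly 14.375 (start on the boring ride)
```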

 

This amusement park was a simple and admittedly unrealistic example. Yet it is easy to see that by increasing the size of our transition matrix, this same process could be used to handle amusement parks with any number of rides. Absorbing Markov chains are often used to model several important and well-known situations. For example, the random or drunkard's walk problem, in which a person leaving a bar stumbles in a random direction until eventually reaching home, the absorbing state, is often described using an absorbing Markov chain. Gamblers playing casino games in which the player is at a disadvantage can be modeled using absorbing Markov chains, where the absorbing state occurs when the gambler is out of money. Markov chains with no absorbing states can be used to predict genetic traits in offspring.

 

There are many more types of Markov chains beyond absorbing Markov chains. A deeper understanding of Markov chains, including the proofs of the theorems stated above, requires a moderate knowledge of probability. In fact, Markov processes are generally viewed from a probability standpoint, with only minimal background in linear algebra assumed. Proofs of the above theorems, as well as a more thorough introduction to absorbing Markov chains, can be found in Introduction to Probability by Charles Grinstead and J. Laurie Snell, the primary source for this page. For those with a significant background in probability, more information on absorbing Markov chains and other types of Markov chains can be found in Probability, Markov Chains, Queues, and Simulation by William J. Stewart and Non-negative Matrices and Markov Chains by E. Seneta.

 

 

Sources:

1. Grinstead, Charles M., and J. Laurie Snell. (1997). Introduction to Probability. AMS Bookstore.