Wednesday, June 30, 2010

APPLICATIONS: PRISONER'S DILEMMA, FIRST INSTALLMENT

Part Four

Applications


 

    The time has come to put all of this formal stuff to use. In the second major part of this tutorial, I shall examine a number of attempts to apply the materials of Game Theory and Rational Choice Theory to substantive issues in political theory, economics, military strategy, and the law. My message will in the main be negative. I shall argue, again and again, that authors attempting to gain rigor or clarity or insight by the use of these methods actually misuse them, failing to understand them correctly or failing to understand the scope and nature of the simplifications and abstractions that are required before the materials of Game Theory and Rational Choice Theory can be properly applied.


 

    I have asked you to read two essays and a chapter of a book, all by me, and all available by clicking on the links provided in the blog post of June 2, 2010. In order to move things along and keep this tutorial to a manageable size, I am going to rely on you to do that reading, so that I can refer to it without summarizing it or repeating what I have said in those texts.


 

    My order of discussion will be as follows:


 

    1. A discussion of the Prisoner's Dilemma

    2. A discussion of the Free Rider Problem

    3. An extended and very detailed analysis of the central thesis of John Rawls' A Theory of Justice.

    4. A brief discussion of certain arguments in Robert Nozick's Anarchy, State, and Utopia.

    5. A discussion of some of the applications of Game Theory and Rational Choice Theory in Game Theory and the Law by Baird, Gertner, and Picker.

    6. A discussion of the role played by Game Theory in the debates about military strategy and deterrence policy in the United States in the first twenty years following World War II. In connection with this portion of the discussion, I will make available the text of a book I wrote in 1962 but was never able to get published.


 

    Assuming anyone is still with me after all of that, I will entertain suggestions of how we might usefully keep this tutorial going. Alternatively, I can go back to playing Spider Solitaire on my computer. :)


 

The Prisoner's Dilemma


 

    The Prisoner's Dilemma is a little story told about a 2 x 2 matrix. For those who are unfamiliar with the story [assuming someone fitting that description is reading these words], here is the statement of the "dilemma" on Wikipedia:


 

"Two suspects are arrested by the police. The police have insufficient evidence for a conviction, and, having separated the prisoners, visit each of them to offer the same deal. If one testifies for the prosecution against the other (defects) and the other remains silent (cooperates), the defector goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months in jail for a minor charge. If each betrays the other, each receives a five-year sentence. Each prisoner must choose to betray the other or to remain silent. Each one is assured that the other would not know about the betrayal before the end of the investigation. How should the prisoners act?"


 

    The following matrix is taken to represent the situation.

    

 

                      B1 cooperate          B2 defect

    A1 cooperate      6 months, 6 months    10 years, Go free

    A2 defect         Go free, 10 years     5 years, 5 years


 

    The problem supposedly posed by this little story is that when each player acts rationally, selecting a strategy solely by considerations of what we have called dominance [A2 dominates A1 as a strategy; B2 dominates B1 as a strategy], the result is an outcome that both players consider sub-optimal. The outcome of the strategy pair [A1,B1], namely six months for each, is preferred by both players to the outcome of the strategy pair [A2,B2], which results in each player serving five years, but the players fail to coordinate on this strategy pair even though both players are aware of the contents of the matrix and can see that they would be mutually better off if only they would cooperate.


 

    For reasons that are beyond me, this fact about the matrix, and the little story associated with it, is considered by many people to reveal some deep structural flaw in the theory of rational decision making, akin to the so-called "paradox of democracy" in Collective Choice Theory. Military strategists, legal theorists, political philosophers, and economists profess to find Prisoner's Dilemma type situations throughout the universe, and some, like Jon Elster [as we shall see when we come to the Free Rider Problem], believe that the Dilemma calls into question the very possibility of collective action.


 

    There is a good deal to be said about the Prisoner's Dilemma, from a formal point of view, so let us get to it. [Inasmuch as there are two prisoners, it ought to be called The Prisoners' Dilemma, but never mind.] The first problem is that everyone who discusses the subject confuses an outcome matrix with a payoff matrix. In the game being discussed here, there are two players, each of whom has two pure strategies. There are no chance elements or "moves by nature" [such as tosses of a coin, spins of a wheel, or rolls of a pair of dice]. Let us use the notation O11 to denote the outcome that results when player A plays her strategy 1 and player B plays his strategy 1. O12 will mean the outcome when A plays her strategy 1 and B plays his strategy 2, and so forth. There are thus four possible outcomes: O11, O12, O21, O22.


 

    In this case, O11 is "A serves six months and B serves six months." O12 is "A serves 10 years and B goes free," and so forth. Thus, the Outcome Matrix for the game looks like this:


 

            B1                                             B2

    A1      A serves six months and B serves six months    A serves ten years and B goes free

    A2      A goes free and B serves ten years             A serves five years and B serves five years


 

    Notice that instead of putting a comma between A's sentence and B's sentence, I put the word "and." That is a fact of the most profound importance, believe it or not. The totality of both sentences, and anything else that results from the playing of those two strategies, is the outcome. Once the outcome matrix is defined by the rules of the game, each player defines an ordinal preference ranking of the four outcomes. The players are assumed to be rational -- which in the context of Game Theory means two things: First, each has a complete, transitive preference order over the four outcomes; and Second, each makes choices on the basis of that ordering, always choosing the alternative ranked higher in the preference ordering over an alternative ranked lower.


 

    Nothing in Rational Choice Theory dictates in which order the two players in our little game will rank the alternatives. A might hate B's guts so much that she is willing to do some time herself if it will put B in jail. Alternatively, she might love him so much that she will do anything to see him go free. A and B might be sister and brother, or they might be co-religionists, or they might be sworn comrades in a struggle against tyranny. [They might even be fellow protesters arrested in an anti-apartheid demonstration at Harvard's Fogg Art Museum -- see my other blog for a story about how that turned out.]
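
    To make the point concrete, here is a minimal sketch in Python. It holds the outcome matrix fixed and derives three different preference rankings for A from it. The three "badness" rules are purely illustrative assumptions of mine [nothing in the theory picks one of them out], and the sentences are written in years, with six months as 0.5.

```python
# Outcome matrix from the story: sentences in years as (A's, B's).
OUTCOMES = {
    "O11": (0.5, 0.5),
    "O12": (10.0, 0.0),
    "O21": (0.0, 10.0),
    "O22": (5.0, 5.0),
}

def ranking(outcomes, badness):
    """Order outcome labels from most to least preferred, where `badness`
    maps an outcome to a sortable value and lower means more preferred."""
    return sorted(outcomes, key=lambda label: badness(outcomes[label]))

# Assumption: A cares only about her own sentence (the usual reading).
selfish_a = ranking(OUTCOMES, lambda o: o[0])

# Hypothetical: A cares only about B's sentence and wants it short.
devoted_a = ranking(OUTCOMES, lambda o: o[1])

# Hypothetical: A wants B's sentence long above all, then her own short.
spiteful_a = ranking(OUTCOMES, lambda o: (-o[1], o[0]))

for name, order in [("selfish", selfish_a),
                    ("devoted", devoted_a),
                    ("spiteful", spiteful_a)]:
    print(name, " > ".join(order))
```

    The same four outcomes yield three different orderings, and so three different games. The spiteful A, for instance, ranks O22 above O11: she will do five years herself rather than six months if that puts B away for five years.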


 

    "But you are missing the whole point," someone might protest. "Game Theory allows us to analyze situations independently of all these considerations. That is its power." To which I reply, "No, you are missing the real point, which is that in order to apply the formal models of Game Theory, you must set aside virtually everything that might actually influence the outcome of a real world situation. How much insight into any legal, political, military, or economic situation can you hope to gain when you have set to one side everything that determines the outcome of such situations in real life?"


 

    In practice, of course, everyone assumes that A ranks the outcomes as follows: O21 > O11 > O22 > O12. B is assumed to rank the outcomes O12 > O11 > O22 > O21. With those assumptions, since only ordinal preference is assumed in this game, the payoff matrix of the game can then be constructed, and here it is:


 

            B1                B2

    A1      second, second    fourth, first

    A2      first, fourth     third, third
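
    The dominance argument uses nothing but these ordinal ranks, which can be checked mechanically. What follows is a minimal sketch in Python, not anyone's official method: the function names are my own, ranks are written as numbers with 1 for "first," and a strategy counts as dominant only if it does strictly better against each of the other player's strategies.

```python
# Ordinal payoff matrix: rank 1 = most preferred, 4 = least preferred.
# Keys are (A's strategy, B's strategy); values are (A's rank, B's rank).
RANKS = {
    (1, 1): (2, 2),
    (1, 2): (4, 1),
    (2, 1): (1, 4),
    (2, 2): (3, 3),
}

def rank_for(ranks, player, own, opp):
    """Rank that `player` (0 = A, 1 = B) gives the cell where that player
    uses strategy `own` and the other player uses strategy `opp`."""
    key = (own, opp) if player == 0 else (opp, own)
    return ranks[key][player]

def dominant_strategy(ranks, player):
    """Return the strictly dominant strategy (1 or 2), or None."""
    for mine, other in ((1, 2), (2, 1)):
        if all(rank_for(ranks, player, mine, opp)
               < rank_for(ranks, player, other, opp) for opp in (1, 2)):
            return mine
    return None

def pareto_improvements(ranks, cell):
    """Cells that both players rank strictly above `cell`."""
    a, b = ranks[cell]
    return [c for c, (ra, rb) in ranks.items() if ra < a and rb < b]

a_dom = dominant_strategy(RANKS, 0)
b_dom = dominant_strategy(RANKS, 1)
print("dominant strategies:", a_dom, b_dom)
print("preferred by both:", pareto_improvements(RANKS, (a_dom, b_dom)))
```

    Run on the matrix above, this reports strategy 2 as dominant for each player and O11 as the one outcome both players rank above O22. No jail terms appear anywhere in the calculation.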

    

    [Notice, by the way, that this is not a game with strictly opposed preference orders, because both A and B prefer O11 to O22. With strictly opposed preference orders, you cannot get a Pareto sub-optimal outcome from a pair of dominant strategies -- for extra credit, prove that. :) ]
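
    For anyone who would rather check the claim by brute force before attempting a proof, here is a rough sketch in Python. It assumes strict rankings with no indifference, and it reads "strictly opposed" as meaning that B's ranking is A's ranking turned upside down. It tests something slightly stronger than the claim itself: that in such a game no outcome at all is ranked below another by both players.

```python
from itertools import permutations

CELLS = [(1, 1), (1, 2), (2, 1), (2, 2)]

def pareto_dominated(ranks, cell):
    """True if some other cell is ranked strictly higher by both players."""
    a, b = ranks[cell]
    return any(ra < a and rb < b for ra, rb in ranks.values())

violations = 0
for perm in permutations([1, 2, 3, 4]):
    # Strictly opposed: if A ranks a cell r-th, B ranks it (5 - r)-th.
    ranks = {cell: (r, 5 - r) for cell, r in zip(CELLS, perm)}
    violations += sum(pareto_dominated(ranks, cell) for cell in CELLS)

print("Pareto-dominated cells across all 24 games:", violations)
```

    A count of zero is of course no substitute for the proof, but it does show where to look: whatever is better for one player is, by construction, worse for the other.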


 

    That payoff matrix contains the totality of the information relevant to a game theoretic analysis. Nothing else. But what about those jail terms? Those are part of the outcome matrix, not the payoff matrix. The payoff matrix gives the utility of each outcome to each player, and with an ordinal ranking, the only utility information we have is that a player ranks one of the outcomes first, second, third, or fourth [or is indifferent between two or more of them, of course, but let us try to keep this simple.] But ten years versus going scot free, and all that? That is just part of the little story that is told to perk up the spirits of readers who are made nervous by mathematics. We all know that when you are introducing kindergarteners to geometry, it may help to color the triangles red and blue and put little happy faces on the circles and turn the squares into SpongeBob SquarePants. But eventually, the kids must learn that none of that has anything to do with the proofs of the theorems. The Pythagorean Theorem is just as valid for white triangles as for red ones.


 

    To see how beguiled we can be by irrelevant stories, consider the following outcome matrix, derived from a variant of the story we have been dealing with:


 

            B1                                             B2

    A1      A serves one day and B serves one day          A serves 40 years and a day and B goes free

    A2      A goes free and B serves 40 years and a day    A serves 40 years and B serves 40 years


 

    In this variant, if both criminals keep their mouths shut, they go free after only one night in jail. If they both rat, they spend 40 years in jail. If one rats and the other doesn't, the squealer goes free today and the other serves 40 years and a day. Both criminals know this, of course, because the premise of the game is that this is Decision Under Uncertainty, meaning that they know the content of the outcome matrix and of the payoff matrix but not the choice made by the other player. The structure of the payoff matrix associated with this outcome matrix is supposed to be identical with that associated with the original story, namely: For A, O21 > O11 > O22 > O12, and for B, O12 > O11 > O22 > O21, because the premise of the little example is that each player rates the outcomes solely on the basis of the length of his or her sentence, regardless of how long or short that is. It is therefore still the case that O11 is preferred by both players to O22, and it is still the case that IF each player's preference order is determined solely by a consideration of that player's sentencing possibilities [with each player preferring less time in jail to more], and IF each player chooses a strategy solely by attending to considerations of dominance, then the two of them will end up with a Pareto sub-optimal result.

    But how likely is all of that to occur in the real world? I suggest the answer is, not likely at all. For the upshot of the game to remain the same, we must assume two things, neither of which is even remotely plausible in any but the most bizarre circumstances: First, that each player is perfectly prepared to condemn his or her partner in crime to a sentence of 40 years and a day just to have a chance at reducing a one day sentence to zero; and second, that the two of them, faced with this extraordinary outcome matrix, cannot coordinate on the Pareto Preferred Outcome without the benefit of communication.
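
    That the two stories collapse into one and the same payoff matrix, on the stated premise, can be seen in a few lines of Python. This is only a sketch under assumptions of my own: sentences are converted to days, a year is counted as 365 days and six months as 182, and each player is taken to rank outcomes by his or her own sentence alone, shorter being better.

```python
# Sentences in days as (A's, B's), keyed by (A's strategy, B's strategy).
ORIGINAL = {
    (1, 1): (182, 182),        # six months each (assumed 182 days)
    (1, 2): (3650, 0),         # A serves ten years, B goes free
    (2, 1): (0, 3650),
    (2, 2): (1825, 1825),      # five years each
}
VARIANT = {
    (1, 1): (1, 1),            # one day each
    (1, 2): (14601, 0),        # 40 years and a day, B goes free
    (2, 1): (0, 14601),
    (2, 2): (14600, 14600),    # 40 years each
}

def ordinal_payoffs(outcomes):
    """Map each cell to (A's rank, B's rank), with 1 = most preferred,
    assuming each player ranks cells by own sentence only."""
    payoffs = {}
    for cell in outcomes:
        ranks = []
        for player in (0, 1):
            ordered = sorted(outcomes, key=lambda c: outcomes[c][player])
            ranks.append(ordered.index(cell) + 1)
        payoffs[cell] = tuple(ranks)
    return payoffs

print(ordinal_payoffs(ORIGINAL))
print(ordinal_payoffs(VARIANT))
print(ordinal_payoffs(ORIGINAL) == ordinal_payoffs(VARIANT))
```

    The final line prints True. The formal analysis cannot tell the two stories apart, which is exactly why the plausibility of the premise has to be argued on other grounds.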

3 comments:

  1. This was presented in class on my first day and first class in a university lecture hall. The teacher was the head of the department and was my grad school advisor three years later.

  2. Extra credit attempt: When you have strictly opposed preference orders and all of those axioms discussed earlier, and both participants go with their dominant strategy (if you follow von Neumann and Morgenstern in having that be maximizing the minimum), the outcome will always be a saddle point. Such a point is going to be Pareto-optimal because if either player chose anything else, they would have a worse outcome.

    I'm guessing that once you no longer have strictly opposed preference orders, there no longer must be that kind of structure, so you can have these sub-optimal results.

  3. Good show. That is basically right. If you have strictly opposed preference orders, then if some other outcome is better for A, it must be worse for B, since they have opposed preference orders. Hence there cannot be some other outcome that is better for both than the one on which they have ended. Very nice.
