Now we are ready to start. I will progress from game to game [i.e., from payoff matrix to payoff matrix], making things a little more complicated each time, until we get to the payoff for all of this [so to speak]: a two person mixed strategy zero sum game. I will then explain [but probably not drive you nuts by actually proving] The Fundamental Theorem of Game Theory, von Neuman's own [if I may make a play on a popular line of environmentally friendly spaghetti sauces and such], which says that Every two person zero-sum mixed strategy game has a solution in the strong sense. But I get ahead of myself.
Here is our first little game. Two businesses are competing to produce and sell children's toys, and each one must decide whether to make hula hoops or yoyos [but not both, for technical reasons]. Each player knows everything there is to know about the costs, the market, and such, except what the other player is going to choose to do. Although neither player knows what the other will do, each knows what effect the other's decision will have on the bottom line of each company. The following matrix shows the profit each will make in one of the four possible situations: A makes hula hoops and B makes hula hoops, A makes hula hoops and B makes yoyos, A makes yoyos and B makes hula hoops, or A makes yoyos and B makes yoyos. Now, I know that some of you are philosophers, but I really do not want you wasting your time wondering how they know these things, or what extraneous factors, cultural or otherwise, might affect their decisions. We will get nowhere if you do. Just go with the flow here. This is the payoff matrix for the game. We assume that both players prefer making more money to making less, and do not care about anything else.
B1 make hula hoops | B2 Make yoyos | |
A1 Make hula hoops | 10000, 1000 | 8000, 6000 |
A2 Make yoyos | 9000, 6000 | 6000, 5000 |
A is in a position to make a rational decision, for consider: If A decides to make hula hoops, then no matter which choice B makes, A can be sure of doing better by sticking with hula hoops than by changing to yoyos. As the matrix shows, if B chooses B1, then A1 is better than A2 [because 10000 is bigger than 9000]. If B chooses B2, then A1 is better than A2 [because 8000 is bigger than 6000]. The strategy A1 is thus said to dominate the strategy A2.
Let us from now on use the notation Pij to mean the payoff to A for the strategy pair [Ai, Bj], and the notation Qij to mean the payoff to B for the strategy pair [Ai, Bj]. We can se by looking at the matrix that P11 > P21 and P12 > P22.
Notice several things about this game:
[1] A does not need to know B's payoffs for the four possible combinations. In this very simple case, A's payoff schedule alone is enough to allow the calculation that A1 dominates A2.
[2] As I have several time emphasized, we do not need to know the actual dollar or utility payoffs to carry out this line or reasoning. All we need to know is that P11 > P 21 and P12 > P22.
Now look at the situation from B's point of view. A quick look at the payoff matrix reveals that B cannot use the line of reasoning that solved the choice problem for A. B's preference pattern is (Q12 = Q21) > Q22 > Q11. Strategy B1 does not dominate strategy B2, because Q12 > Q11, and B2 does not dominate B1, because Q12 > Q22. Now B takes an important step forward. Since B knows that A is rational, and since B knows A's payoffs, B reasons that A will choose strategy A1, since it is A's dominant strategy. Knowing that, B now looks at the payoff matrix and sees that with A choosing strategy A1, B's best strategy is B2, because 6000 > 1000.
Thus, assuming that the figures in the payoff matrix are correct, that A is perfectly rational, and that A cares only about maximizing her utility as represented by the figures in the matrix, B can now solve the choice problem by choosing strategy B2, even though neither of the B strategies dominates the other. With this tiny step forward, we begin to develop a theory that takes account of and adjusts for the rational calculations of the other player. In this way, we move beyond the opacity of the marketplace, which is the central point of Game Theory.
The games in which one or the other player has a strictly dominant strategy are relatively rare [and uninteresting]. When neither player has a strictly dominant strategy, we must extend our notion of what constitutes a "rational" choice. Consider the following game with a 3 x 3 payoff matrix. Once again, the preferences are ordinal, and I use numbers simply because it is easy to tell in what order A or B ranks two payoffs.
B1 | B2 | B3 | |
A1 | 3, 9 | 5, 6 | 1, 4 |
A2 | 0, 11 | 6, 2 | 2, 11 |
A3 | 4, 5 | 5, 19 | 3, 8 |
A little careful examination reveals that neither A nor B has a dominant strategy. For example, strategy A2 is not dominant [i.e., does best no matter what B does], because if B chooses B1, A would do better with either A1 or A3. Strategy B3 is not dominant for B because if A were to choose A1, B would be better off with B2 or B1. And so forth.
What are A and B to do in this situation? This is, notice, exactly what we meant by "choice under uncertainty.' A and b know everything about the payoff matrix, therefore everything about possible outcomes and consequences of pairs of strategic choices, but neither of them can be certain what the other will do. In this situation, there are obviously a great many different ways A and B might proceed [leaving to one side for the moment engaging in industrial espionage to find out what the other is thinking.] If they had cardinal utility functions, and hence could say not merely in which order they prefer the possible outcomes but also by how much they do, they might decide to avoid any strategies that have any possibility at all of outcomes they consider really terrible. That might narrow their choices to the point where dominance considerations could be invoked. By the same token, they might decide to give preference to strategies that have at least some possibility of outcomes they prize very, very highly. And of course one might do one and the other the other. With cardinal preference orders, they might even cast about for some way to engage in expected utility calculations, although without any knowledge of the probabilities, that would be very difficult.
At this point, von Neuman and Morgenstern propose an extremely conservative rule for choosing from among the available strategies. They suggest that the players, in effect, try to minimize the damage they may suffer as a consequence of the choices of their opponents. Let us, they say, define something we will call the "security level" of a strategy. The security level is the worst outcome that the strategy can produce. For example, look at strategy A1. Depending on what B chooses, A will get 3, 5, or 1. So her security level for A1, the worst she can do if that is her choice, is 1. By the same reasoning, we can see that the security level of A2 is 0, and of A3 is 3. Correspondingly, B's security levels for his three strategies are 5 for B1, 2 for B2, and 4 for B3. If the players adopt von Neuman and Morgenstern's rule for rational choice, then A will choose A3, which has the maximum security level of her three strategies, and B will adopt B1 for the same reason. The outcome of the game will be the intersection of strategies A3 and B1, which, as we can see from the payoff matrix gives A 4 and B 5.
This pair of strategy choices guarantees at least 4 to A and at least 5 to B. But notice the following fact: There are two other pairs of strategy choices that are better for both A and B. (A1 B2) gives the 5 and 6, instead of 4 and 5, and (A3 B2) gives them 5 and 19 instead of 4 and 5. Now, both A and B can see this, of course, but without what is called pre-play communication or any way of making commitments to one another, they have no way of reaching either of those mutually better [i.e. Pareto Preferred] outcomes. The problem is that if B chooses B2, A may switch to A2 to get 6, leaving B with 2, which is worse than he can guarantee to himself by following the von Neuman Morgenstern rule.
There are lots of games like this -- or, to put it another way, lots of situations which, when analyzed as games, produce payoff matrices with these characteristics. We shall return to them later. [The Prisoner's Dilemma is one example].
Instead, let us take the next step forward in the evolution of the theory. Very often, in a game, A and B have what are called "strictly opposed" preferences over the outcomes. What that means is simply that if you take all the possible outcomes of the game and rank them from best to worst from A's point of view, B will rank them in exactly the opposite order. Somewhat more formally, where Pij and Qij are the payoffs to A and B when A's strategy Ai a is played against B's strategy Bj, and Pqr and Qqr are the payoffs to A and B when A's strategy q is played against B's strategy Br, then:
(i) Pij > Pqr if and only if Qqr > Qij and
(ii) Pij = Pqr if and only if Qqr = Qij
Is the move from game tree to payoff matrix to equation (and inequality) form motivated by anything more than ease of notation?
ReplyDeleteI'm a little behind on my reading of this blog (which I'm addicted to, and I never thought I'd enjoy anything with the words "Formal Methods" in it before) but isn't B3 a dominant strategy as it appears now? Unless I'm reading it wrong, B3 wins for B every time : 1<4, 2<11, 3<8.
ReplyDelete