Sprague-Grundy functions in game theory

I have again spent some time studying game theory a bit further with Thomas S. Ferguson’s book Game Theory. The latest topic was Sprague-Grundy functions that are used to describe, how to choose a winning strategy in combinatorial games. Sprague-Grundy functions are often used to analyze games, when it is not clear which positions in the game are P-positions and which are N-positions. If the Sprague-Grundy functions exist for all the positions in the game, a winning strategy is easily found. For simple games, the Sprague-Grundy offers little added information, since it just provides us the P- and N-positions in the game, but for more complex games it gives information that a simple analysis of P- and N-positions does not provide. I will take a look at this in a later post.

Sprague-Grundy function

The Sprague-Grundy (S-G) function g(n) of node n is defined recursively as follows:

  1. For all terminal nodes g(n) = 0.
  2. For all other nodes g(n) is the smallest non-negative integer that does not belong to the Sprague-Grundy values of the nodes direct followers.

It is easy to show, that the definition of a Sprague-Grundy function is equal to the definition of P- and N-positions in a game.

We remember that P- and N-positions are such positions, that the Previous player to move into or the Next player to move from that position can always win the game. The three definitions of P- and N-positions in combinatorial games are:

  1. The terminal position is a P-position.
  2. An N-position is such that there is at least one possible move into a P-position.
  3. A P-position is such that from it is possible to move only into an N-position.

We will now show that Sprague-Grundy functions fulfill these three conditions, where g(n) = 0 is a P-position and g(n) > 0 is an N-position.

Definitions 1 and A show that for the terminal position the S-G value equals the P-position.

For every other position we have either g(n) = 0 or g(n) > 0, respectively. If g(n) = 0, it means that each predecessor of that node must have an S-G value larger than 0. Therefore, from every position with an S-G value of larger than 0 we can move into a position with an S-G value of 0 (definition 2). Also, if we are not at the terminal node and g(n) = 0, we can only move to positions with an S-G value of larger than 0, since otherwise the position we are in could not have an S-G value of 0 (definition 3).

In conclusion, we see that from positions with an S-G value of larger than 0 we can always move to positions with an S-G value of 0 (equal to definition 2), and from positions with an S-G value of 0 we can only move to positions with an S-G value of larger than 0 (equal to definition 3).  Thus, we have shown that the Sprague-Grundy functions give us the P- and N-positions in a game. The difficulty is that they must be defined recursively, and that they do not exist for “circular paths” in games, where you might end up in a previous position after a certain number of moves.

Backward induction in procurement and mechanism design

About a month ago, I delivered my colleagues a training on backward induction, a method used in game theory to find Nash Equilibria in games that have pre-defined end points. I used materials and examples from the Open Yale course on game theory, since they were illustrative and easy to understand.

After the training, we had some discussion on when one can use backward induction, especially in real life, where we often have open-ended games. During the discussion, I was not able to provide any good example on when to use backward induction, but afterwards a great example came to my mind. Not surprisingly, this example was mechanism design.

Mechanism design is a sub-field of game theory focused on the study of creating and implementing mechanisms and incentives that align the interests of multiple agents. Earlier this year we had an external training on auction design, an application of mechanism design, where the goal is to design such an auction, that the buyer gets the best product for the lowest price. Obviously, auction design can be a powerful tool for any procurement professionals. Using auction design as an example of mechanism design, I will list some delimitations and pre-conditions for backward induction.

Backward induction in business transactions and relationships

Game theory and backward induction can be used in procurement to anticipate potential proposals and prepare counter-proposals, or prepare proposals to direct and distract the other party. Since a negotiation does not necessarily have a clear end from the start, using backward induction extensively can be very difficult, since the potential branches in the decision tree are not limited.

Another aspect in using backward induction is that all parties must be “playing the same game”. For example, if you are concentrated on the immediate outcome of the negotiation, while your opponent is looking at the long-term relationship between your companies, you might be going through very different decision trees with differing optimal outcomes for the immediate outcome. In addition, if the other party does not use backward induction but rather shoots from the hip, you cannot directly expect them to do choose optimally.

In procurement, our job is to create markets and create competition to leverage the market mechanism in buying the best quality we need for a good price. Therefore, mechanism design and game theory can be used to create such a game, a market, where the value to us can be maximized. The idea is to create an auction, where, using backward induction, the interests of the bidders and the buyer are aligned in such a way, that the buyer’s utility is maximized in the auction. Here it is important to make sure, that all the bidders are “playing the same game” and are deducing their actions backwards from the pre-defined end.

There is extensive research on auction and mechanism design. When we have set up the optimal auction design, we face another game, a meta-game, where we have to convince potential suppliers to participate in the bidding or auction, convince them that we are good customers and worth their business. Here we are again playing a long-term game without a pre-defined end, where the parties must convince each other of the benefits of doing business with each other and deliver on that promise. This relationship might also be the corresponding might also be aligned using mechanism design, but due to the scope it is a much more difficult task than a single auction.

Nim and simple mathematical proofs 3/3

In this post I present to you yet another game, Northcott’s Game, and another example of how new problems can be solved using experience and knowledge gained from previous ones. Again, the post is based of Thomas S. Ferguson’s book Game Theory.

Northcotts’s Game is a game where black and white pieces are placed on a checkerboard, one piece of each color on each row (see Picture 1 below). The players take turns in moving their piece on a single row and only on a single row per turn. A piece cannot jump over another piece. The last player to move his piece wins. In its essence, this game is about blocking the other player’s pieces in such a way, that your last move blocks their final piece, ensuring you the win. But how to do that? It turns out that knowing how to play Nim is a huge advantage.

 

Picture 1: Northcott's Game, start position.
Picture 1: Northcott’s Game, start position.

 

 

 

 

 

 

 

Winning in Northcott’s Game

Before going deeper into Northcott’s Game we’ll recall the definition of a combinatorial game. A combinatorial game is such that (see Ferguson’s Game Theory, Part 1, chapter 1, page 4):

  1. It is played by two players.
  2. There is a usually finite set of possible positions in the game.
  3. The rules of the game specify which moves are legal. If both players have the same options of moving from each position, the game is called impartial. Otherwise the game is called partisan.
  4. The players alternate moving.
  5. The game ends when a position is reached from which no moves are possible for the player whose turn it is to move. Under the Normal Play Rule, the last player to move wins. Under the Misère Rule the last player to move loses.
  6. The game ends in a finite number of moves no matter how it is played. (Ending Condition)

It has to be noted that Northcott’s Game is not a variant of Nim. More specifically, it is a partisan game, since both players have distinct moves. Furthermore, Northcott’s game does not satisfy the Ending Condition, since it is possible to keep playing forever. Thus, Northcott’s Game is not a combinatorial game due to not obeying the Ending Condition, but is still related to them. Therefore, knowing how to play combinatorial games, Nim in particular, is helpful in winning in Northcott’s Game.

As often in game theory, we start from the end, using backward induction, and ask ourselves, how the end of the game would look like. To make this easier, Picture 2 presents an abbreviated version of the game in Picture 1, showing only the top 3 rows.

Picture 2: Abbreviated Northcott's Game, start position.
Picture 2: Abbreviated Northcott’s Game, start position.

 

 

 

 

Picture 3 shows a “near penultimate” final position of the game in Picture 2, where the white player has moved his piece in the lowest row next to the black piece in the same row. It is now black player’s turn, and he can only move right, leaving the white player with the final move. In this final move, the white player “blocks” the black player’s last piece and wins the game.

Picture 3: Abbreviated Northcott's Game, "near penultimate" position.
Picture 3: Abbreviated Northcott’s Game, “near penultimate” position.

 

 

 

 

It is clear that the position presented in Picture 3 is a potential P-position. Our next question is, how to get into this penultimate P-position, and this is where Nim steps in.

Play first Nim, then win in Northcott’s Game

The game before the penultimate P-position in Northcott’s Game can be interpreted as Nim with the following parallels:

  • Each row is a pile of sticks.
  • The number of empty squares between the white and the black piece is the number of sticks in a pile.

Now the target is to make the Nim-sum of the squares between the two pieces on a single row zero to reach a P-position, eventually the terminal position. The terminal position in this game is a position where the pieces are next to each other in each row, since then the number of squares between the pieces will have been reduced to zero on each row. The player to have moved last before this position will win, since the next player can only move towards his end of the row, and then the next player, the one who reached the terminal position, can always cover this distance, eventually making the last move and winning the game. This is generalizable to any number of rows.

Here we also see why Northcott’s Game could go on forever. While in Nim a player must always remove a stick from one of the piles, thus eventually leaving no piles on the table, in Northcott’s Game the players could keep moving their pieces back and forth without even trying to block one another. This kind of play would be pointless, but is possible, and therefore Northcott’s Game does not obey the Ending Condition for combinatorial games.

Like Nimble, Northcott’s Game is a good a example of how to use existing models and knowledge to solve new problems, and how existing solutions can be used to represent a solution to a novel problem as a combination of existing solutions, executed as steps one after another. I hope you will have as much as fun with these games and solving them as I am having.

Nim and simple mathematical proofs 2/3

Nimble is a game where there is a row of squares numbered 0, 1, 2, 3, … Each square can contain a coin, and a square may contain multiple coins. The game is played by two players, taking turns in moving a single coin to a lower-numbered square. A player must move and exactly one coin during his turn. A coin may only be mover to a lower-numbered square. The game is won by the player who moves the last coin into square 0. At first this game might look like a distant cousin of Nim, but actually this game is exactly Nim, just in disguise. Let’s see the “proof” for that.

I will present below the rules of Nim and the rules of Nimble to show that the games are identical. Each rule contains the analogous version of the rule for both games.

Rule 1:
Nim: Two players take turns in removing sticks from piles. In the beginning, there are at least three piles.
Nimble: Two players take turns in moving one coin from a square to a lower-numbered square.

Rule 2:
Nim: During his turn a player must remove at least one stick from one pile.
Nimble: During his turn a player must move one coin to at least the next lower-numbered square.

Rule 3:
Nim: A player may remove sticks from one pile only during one turn.
Nimble: A player may move only one coing during one turn.

Rule 4:
Nim: A player to remove the last stick wins
Nimble: A player to move the last coin to square 0 wins.

Rule 5:
Nim: The piles do not interact with each other.
Nimble: The squares and coins do not interact with each other.

From rules 1–5 it is easy to see that Nim and Nimble are the same game with slightly different representations. Especially from rules 2 and 3 we can observe that removing of k sticks from exactly one pile of n sticks and moving exactly one coin from square n to square k are equivalent.

Although this “proof” is hardly very deep-going, or even a real proof for that matter, it shows well how one concept can be presented in different ways and how seemingly different situations can be just different representations of the same idea. This observation is a good reminder for everyday life. When facing novel problems (e.g. Nimble), we should ask ourselves if we could draw parallels to known and familiar problems (e.g. Nim). On the other hand, when looking at existing ideas and concepts (e.g. Nim) we might want to try formulating them a bit differently (e.g. Nimble) to see if we can gain more insight into them.

Nim and simple mathematical proofs 1/3

A few weeks ago I discussed a game called Nim. In studying Nim, with the help of Thomas S. Ferguson’s book Game Theory, I have again found great joy in mathematical thinking, proving theorems and using logical thinking to come to conclusions.

Many people, as far as I know, have some kind of fear of mathematics and mathematical notation. Some people are even proud of their mathematical illiteracy, which is odd, considering that nobody boasts how their Spanish or Chinese skills are non-existent. Working with games and proofs, like those required by Nim, might help people see the value and true beauty of mathematical thinking. People might understand that extensive mathematical notation is not always necessary to prove mathematical theorems, and that mathematical notation is actually just another language that helps us formulate and formalize logical thinking, removing all ambiguity. In this post and two additional ones I will present three proofs related to Nim to show you, how mathematical thinking can be used quite cleverly to show interesting facts.

Bouton’s theorem

In his 1902 article Nim, a game with a complete mathematical theory in Ann. Math. 3, 35-39

L. Bouton proves the following theorem (Theorem 1, Bouton’s theorem): “In a game of Nim a position is a P-position if and only if the Nim-sum of its components is zero.”

A Nim-sum is calculated as follows:

  • The number of sticks in each pile is expressed in base two.
  • These figures are added without carry, so that the sum of each column is 1, if the column contains an odd number of ones. Otherwise, the sum of that column is 0.
  • The Nim-sum is zero when the sum of each individual column is 0.

As an example of calculating the Nim-sum we take three piles with 4, 5, and 8 sticks respectively. The Nim-sum of these piles in base two is:

1000
1010
1000
1001

The result, in base 10, is 9. Since this Nim-sum is non-zero, according to Bouton’s theorem this would be an N-position. It is worth noting that a Nim-sum of two identical positive integers is also zero due to the definition of Nim-sum. Nim-sum is also associative and commutative and 0 is an identity for addition. (see Ferguson’s Game Theory, Part 1, Chapter 1, page 10)

Proving Theorem 1

Proving Bouton’s theorem is partially a constructive proof, relying on the definitions of P- and N-positions. The claim of the theorem is compared against these definitions to show that they are equal. We recall that P- and N-positions are defined as follows:

  • The terminal position is a P-position.
  • An N-position is such that there is at least one possible move into a P-position.
  • A P-position is such that from it is possible to move only into an N-position.

Now we just have to show that the terminal position has a Nim-sum of zero, a P-position always has a Nim-sum of zero and that a Nim-sum of zero is never an N-position. If we can prove this, we will have proven Bouton’s theorem.

Let’s start by observing that the terminal position has a Nim-sum of zero, so condition 1 is fulfilled.

For condition number 2 we can construct such a move, where we can move from a position with a non-zero Nim-sum (a purported N-position) into a position with a Nim-sum of zero (a purported P-position). In our example we do this by reducing the number of ones in the leftmost column with an odd number of ones (the 4th column) so that each column will eventually contain an even number of ones. In our case, there is only one potential move, since there is only one number one in the leftmost column. Thus, the only move into a P-position is to remove 7 sticks from the pile of 8. By this example we see, that we can move from an N-position into a P-position, so condition 2 is also fulfilled. For a more formal and generalized proof for this part, see Bouton’s original article.

The third condition can be shown to be true by the following reasoning. If we are in a position with a Nim-sum of zero, removing any number of sticks from any pile will leave us witha non-zero Nim-sum (a purported N-position). This is because if we remove any number of sticks from a pile, the number of sticks in all the other piles remains the same. If the Nim-sum of the new position were zero, it would imply that the number of sticks remaining in the pile, from which sticks were removed, has stayed the same, but this is a contradiction. Thus, condition 3 also holds.

Collecting together the proofs for conditions 1, 2 and 3 we have shown the following:

  • The terminal position has a Nim-sum of zero.
  • From a position with a non-zero Nim-sum we can move into a position with a Nim-sum of zero.
  • From a position with a Nim-sum of zero it is possible to move only into a potision with a non-zero Nim-sum.

We see that these statements are analogical to conditions 1, 2 and 3 and that a position is a P-position if and only if the Nim-sum of that position is zero.

I hope this post gave you an idea of how simple logical thinking with quite elementary mathematical notation can provide a powerful tool for analyzing problems. In my next post I will discuss a Nim variant called Nimble and show how Nim can come in disguise.

Combinatorial games and Nim

Based on the most recent posts it seems that my blog in transforming into a learning diary on game theory. Because I started this blog as a tool for self-reflection, this change is by no means bad. It might even prove that I have found an area I am passionate about and wish to pursue even further.

In the beginning of July I finished the Open Yale course on game theory and was looking for new material on game theory to study further. I considered buying the course book by Mr. Dutta, but found Mr. Thomas S. Ferguson’s book on game theory as a free Internet version and decided to continue from there.

Mr. Ferguson’s books starts with an introduction on combinatorial games , and on impartial games in specific, something that was not included in deep detail in the Yale Course, so this book seems to have a different approach to the subject. I have started studying combinatorial games in the book and the first exercises have been quite nice, requiring some elementary skills in proving mathematical formulae. Below I introduce one example from the exercises and my work on it, a couple of weeks old now. But first I give a brief introduction on Nim

Nim

Nim is a combinatorial game where two players take turns in removing objects from a given set. In the easiest variation of the game, there is one pile of sticks and each player must remove at least one stick, while the amount of removed sticks may not exceed a predefined maximum. The possible amounts that may be removed from the pile form the subtraction set S. E.g. if one, two or three sticks are permissible moves, then S = (1, 2, 3). Under the normal play rule, the person to remove the last stick wins.

In a game of Nim the winner is decided by the initial amount of sticks and the order, in which the players start the game, presuming that the players play optimally. In Nim there are so called P- and N-positions in the game. These correspond to the number of sticks remaining in the pile after the last move. A P-position is such, that the player who moved previous to the current position, is winning. An N-position is such that the player to move next is winning.

Before going into the general case, a brief example should illustrate the P- and N- positions. As shown in figure 1, we have a game of Nim starting from a pile with 9 sticks and S = (1, 2, 3). Since the player to take the last stick wins, the terminal position (zero sticks left in the pile) is a P-position, since the player who will have moved previously will win. Due to S = (1, 2, 3), positions where there are 1, 2 or 3 sticks left in the pile are N-positions, since the player to move from those positions, or the next player to move, can reach zero and win. Likewise, 4 is a P-position, since from this position a player can only move to positions 1, 2 or 3, thus not being able to win, but leaving the next player in a position to win, should he play optimally. Working backwards to the start position of this example, we see that who ever starts can always win by removing one stick and moving to 8, which is a P-position.

20160717_Nim
Figure 1: An example of Nim.

 

 

 

 

 

 

 

 

Formally, P- and N-positions are defined recursively as follows:

  1. The terminal position is a P-position.
  2. A N-position is such that there is at least one possible move into a P-position.
  3. A P-position is such that from it is possible to move only into an N-position.

The definition of P- and N-positions contains a noteworthy subtlety. From an N-position a move into at least one P-position is guaranteed, but there are also potential moves into further N-positions. Thus being in an N-position does not guarantee winning the game with any play, but a win is guaranteed, if the player in the N-position plays optimally, that is, moves to the available P-position.

A slightly generalized version of Nim

I will prove the general formula for finding the P-positions in a simple one-pile Nim under the normal play rule.

Let the initial number of sticks in the pile be N and let S = (1, 2, …, n), where S has no gaps in it, i.e. for all members of S s(i) it is true that s(1) = 1 and s(i+1) = s(i) + 1 for i = 2, 3, …, n-1.

Since the terminal position is the final and winning position, it is a P-position. Into this position we can move from any position ≤ n. I claim that the P-positions in this variant of Nim are given by the formula p(k) = (max(S) + 1)*k, where k = 0, 1, 2, … The proof follows by induction.

For k = 0 we get p(0) = 0, which is the terminal position, so the formula holds for k = 0. Now I will show that if the formula holds for some k = m, then it also holds for k = (m + 1).

Let us assume that the formula holds for k=n. Then we have a P-position p(m) = (max(S) + 1)*m = (n + 1)*m.

For (m + 1) we get a potential P-position p(m + 1) = (max(S) + 1)*(m + 1) = (n + 1)*(m + 1) = (n + 1) + (n + 1)*m.

Position p(m) can only be reached by removal of sticks from positions that are below p(m + 1) since p(m) + max(S) = n + (n + 1)*m < p(m + 1) = (n + 1) + (n + 1)*m. On the other hand, the mentioned positions can all be reached by removal of sticks from position p(m + 1), since p(m + 1) – max(S) = 1 + (n + 1)*m, and this is the last position before p(m), which is a P-position. Thus by the definition of P- and N-positions, when p(m) is a P-position, so is also p(m + 1), and the positions between these two are N-positions.

Since the proposed formula holds for case k=0 and it also holds for p(m + 1) when it holds for p(m), the formula holds for the general case p(k) q.e.d.

In conclusion, for single-pile game of Nim under the normal play rule, starting with N sticks and with S = (1, 2, …, n), the P-positions are

p(k) = (max(S) + 1)*k, where k = 0, 1, 2, …

Obviously we also required that p(k) =< N for all k.

The first player to move will win the game with optimal play if and only if   N ≠ p(k) for all k. Otherwise the second player to move will win, if he plays optimally. This can be proven quite easily. First we define optimal play as such, that a player will move from an N-position into a P-position, whenever possible. By definition then, optimal play is such that the player will win, if he gets to move first in a game starting from an N-position. Thus, the first player will win with optimal play, if he gets to start from a N-position. Conversely, if the first player wins after starting from N-position he must have played optimally, presuming that player two also plays optimally. Otherwise, at some point in the game, player one would have moved from an N-position into another N-position, leaving player two the possibility to move into a P-position and consequently win the game. Therefore, as stated before, when both players play optimally, the player who starts will win if and only if N ≠ p(k) for all k.

Procurement – on savings and incentives

This summer our department at work published a global guideline on procurement. A colleague of mine did great work on managing a project where existing rules, guidelines, processes and tools were brought together, harmonized and supplemented. The result is a guideline that helps all company employees in procurement activities, whether it is about doing a demand and a supplier market analysis, preparing a tender, negotiating a contract or improving collaboration with existing suppliers.

In the following weeks after the publication, I gave trainings to colleagues from other departments on the most relevant policies and rules, available tools and the benefits of using the provided tools and processes. I was also contacted by some colleagues who were in need of procurement support, e.g. asking for help in initiating a supplier search. Being able to refer to the new guideline and the tools was a huge help for me. It reduced my workload immediately, and the people contacting me were happy to have such a clear guide on what to do and how to do it.

After these experiences, I was obviously convinced that soon we would have more of our colleagues buying more professionally and spending less. After all, procurement is something we do everyday in our personal lives: we buy food, bus tickets, plan buying a house, compare prices and so on. Still, companies have dedicated procurement departments. If procurement is so easy and intuitive, why doesn’t it work in larger companies the way it works in our private lives? Why do sensible people spend money less sensibly, when they are spending company money?

I think there are multiple factors here at play, at least two:

  • Lack of experience and information
  • Lack of incentives

Of these two, I find the latter to be the more convincing one, since it also contributes to the first one, but I’ll first briefly discuss the first point. When we do our everyday shopping, we have experience on what the goods usually cost, who are the market players and what is the quality we should expect. We are also usually quite able to define what we want. For example, I roughly know what a kilogram of apples costs in the nearby grocery stores so I know the market. I also know, what to expect from first class apples, so I am aware of the standard product quality. I can also define my preferences regarding apples, e.g. whether I like sweet or the more sour ones.

When you work for a company, you might often need external goods or services. If you do not purchase the same goods or services often, you lack market knowledge, including prices and the usual quality of the goods. For example, if you need a tax consultant, your knowledge on the competitive situation between consultants, their regular service spectrum and potential services might be unknown to you. Thus, even if you can define in detail what you want and what you are prepared to pay for it, you do not know if that is a good deal on the market. And should you be unable to define your needs and not know what kind of opportunities the market provides, you might be lured by a slick salesman in to buying some nearly related good or service that does not fulfill your needs or contains features you do not need at all. Continuing with the previous fruit analogy, you might not be able to define whether you want pears, prunes or apples and might end up leaving the farmer’s market with a cucumber in your bag.

A logical question is then, why we don’t dig out the necessary information and gather the experience. This is, of course, a matter of cost. The learning curve in the beginning is quite steep, and if you are, for example, a research engineer, the opportunity cost for learning the intricacies of the market for different sensors, for example, might be too high. Thus, we have dedicated procurement functions. But this leads to another question, namely, why do we see maverick buying, that is, people not using selected suppliers and the services of the centralized procurement function to gain market knowledge with smaller opportunity costs? Or why might it be that people do not use provided self-service tools that make tendering and offer comparison much easier, reducing the opportunity cost for the company significantly?

I get a penny, I lose a penny

Even if the most evident opportunity cost of getting competing offers from suppliers and doing solid offer comparisons is reduced, the added benefit for the single employee is not necessarily very high. In the case of the research engineer, getting to know the supplier market for sensors has the opportunity cost of doing research and development work that might lead to new products and income sources for the company. For the engineer there is an opportunity cost of potential bonus if the new products can be patented.

With a centralized procurement function the opportunity cost of the research engineer’s time may be reduced from the company’s perspective, if the engineer can quickly do an offer comparison and thus save a few thousand or more when buying the sensor. The generated savings would compensate the lost time in development work. However, for the engineer there might be nearly no benefit from doing this offer comparison. Assuming he is working on a new product and expecting to get additional remuneration from a potential patent, by doing offer comparisons he is not working towards his expected bonus for patent. Even if he reaches savings in excess of a thousand dollars, his payoff is practically zero, since those savings are either flowing in to the company’s profits or divided between all employees. In a very large company, dividing a thousand dollars between all employees, results in pennies per person.

On the other hand, our development engineer also does not suffer very much from using excess funds for buying the sensor. If he uses a couple of thousand dollars too much, this is again divided between all the company employees, incurring an effective loss of less than a dollar per person in a very large company. This leads us to a free-rider problem, where no one has the incentive to be the one to generate the savings and has therefore little reason not to spend excessively.

The problem here is that employees are not incentivized to use company money as if it were their own, since they usually do not see the effect of personally generated savings: it does not increase their salary or their bonus. Of course, excess saving, and lack of investments, today can endanger a company’s existence tomorrow, so we also a have a temporal problem. This kind of situation is also familiar in game theory, where certain games pose the question, under what circumstances people forego today’s added benefit for the anticipated future payoffs. Unsurprisingly, the present value of the future payoffs have to be large enough in comparison to the temptation created by the payoffs one would receive today. This is a question of time-preferences and the nominal values of the future payoffs.

In order to have employees use company money more effectively and efficiently, we need mechanisms that incentivize such behavior. Procurement professionals are sometimes paid based on achieved savings, but this might also be too naive a solution. For example, if the savings are measure against initial supplier offers, we can be quite sure that the procurement manager always comes with astronomically high initial offers, just to seal the deal with significant price reductions that conveniently guarantee him a good pay.

A mechanism to generate savings

The previously presented problem of aligning the incentives of the company and its employees is a principal-agent problem that can be solved with mechanism design. The principal, or the company, designs the mechanism, in this case the salary or the bonus structure of the employees, in such a way that each employee has the incentive to generate savings. Here it is assumed that the achieved savings exceed the potential opportunity costs for the company and for the single employee. After a quick Google-search I did not find any literature on this specific topic, but the principal-agent problem and mechanism design are broadly studied areas in game theory.

Concluding the previous discussion, a good mechanism that incentivizes all company employees to use the company money responsibly, should have at least the following features:

  • When a person generates savings, he profits from them more than other employees do per head;
  • Compensation for generated savings is at least as high as the employees opportunity cost, e.g. expected patent remuneration fees;
  • Compensation for generated savings is at least as high as the employees The net savings for the company are at least as high as the opportunity cost for the company, e.g. delay in new product development and market launch;
  • Generating long-term savings is more profitable for the individual than quick wins, thus arbitrary reduction of profitable investments is discouraged;
  • Savings are not measured against the initial supplier offer.

Signaling – Show that you can, show that you mean it

My previous post was about complete and incomplete information and about revealing your advantages to your opponents to gain an even bigger advantage. I finished with a short discussion on signaling: How to credibly differentiate yourself from others to gain a higher payoff?

In his course on game theory, Ben Polak represents a good example on signaling by using a simplified model of the job markets. Here I represent it, with possibly different figures, but the idea holds:

  • There are two types of workers only: good and bad
  • 10% of all workers are good, 90% are bad
  • a good worker produces 50 dollars worth of goods per day a bad worker produces only 20 dollars worth of goods
  • employers cannot tell the difference between a good and a bad worker before hiring them
  • the two types of workers are otherwise identical, just their output is different
  • this game lasts only one day, to keep the calculations simple

On average, an employee produces 23 dollars worth of goods, so the average salary level is also 23 dollars. Therefore, the bad workers earn slightly more than they produce, and the good workers a lot less than they produce: a good employee would earn 50 dollars if he could signal credibly to the employer that he is a good worker. Of course, a bad worker would also want to earn 50 dollars instead of 20 or 23. Thus we need something to differentiate between the two, we need a signal.

A signal that differentiates the good workers from bad workers, or any types from one another in general, has to be such that good workers will always give the signal and bad workers will never give the signal. For a signal to be credible, it’s costs obviously have to be such, that they reduce enough the net salary of a bad worker, but do not reduce too much the net salary of a good worker: this way the bad workers will not be willing to give the signal while the good workers will always give the signal.

As an example of a signal, Mr. Polak mentions the possibility of dancing on the table in a job interview and singing a song about how good an employee you would be. Such a signal is obviously costly, being humiliating at the very least, but it does not help differentiate the two types of workers. After all, it is equally humiliating for both types, so even if a good worker would have the incentive to give the signal, the bad worker would have the same incentive, in order to be identified as a good worker and thus receiving the salary of 50 dollars. Clearly not all costly signals can separate the worker types from one another.

Education as a signal

It turns out that education is a form of signaling and conversely, among other things education has a role as a signal giver at the job market. Let’s introduce a two-year MBA that either type of worker can take. The costs of tuition are the same for both and so are those for housing, transport and food. We might argue that the two types have different opportunity costs in taking an MBA instead of working, but both types would earn 23 dollars since we do not yet have a signal to separate them at the job market. So why does the good worker do the MBA and the bad worker doesn’t, as I am proposing? The difference in the costs is the effort, the mental work, hours of sitting in lectures doing homework and assignments. For the bad worker the required effort to finish the MBA degree is much higher than for the good worker. So much, that receiving an MBA would reduce his net salary below current levels, even below 20 dollars.

Of course, in this example the figures can be forced to be in such a relation to one other that the signaling works. E.g. if we make the total costs of an MBA, including the effort, to be 10 dollars per year for the good worker and 20 dollars per year for the bad worker, it is obvious that the good worker will do the MBA and receive a net salary of 30 dollars after being identified as a good worker and hired by an employer. Conversely, the bad worker will not take the MBA, since his net salary for doing an MBA and being identified as a good worker is 10 dollars, which is below the 20 dollars he would receive otherwise.  In addition, the employers have to believe that good workers, and good workers only, take an MBA. Otherwise the good workers might, regardless of their MBA, be identified as bad workers reducing their incentive to take an MBA.

Even if the figures in the above example are arbitrary, the main point is that the signaling mechanism has to be such that it will provide reliable signals and no type has the incentive to deviate. Thus, when creating the mechanism, the related figures actually have to be chosen in such a way that the signaling is reliable and adjusts the payoffs properly. In our MBA example a one-year MBA would not suffice, since a bad worker would do the MBA and receive a net salary of 30 dollars, instead of the 20 dollars he would receive otherwise. On the other hand, a one-year MBA could be made a lot harder and work-intensive, so that the per-year costs are increased and, again, only the good workers go for the education.

Signaling in other areas of life and work

Signaling is not only useful and used in employee-employer relationships. For example, buyer-seller interactions might also require, or at least benefit from, signaling.

For example in the case of used cars information imbalance between the buyer and the seller can lead to all goods in the market being of poor quality. In such a case, the potential sellers of higher quality products would have to be able to reliably signal this quality to the potential buyers. The buyers do not have to signal their preferences, since a buyer looking for higher quality products will buy one, if he gets more value from it, and nobody will pay for a good more than the value received from buying and possessing the good. Thus, we do not have the potential problem of customers looking for low quality products suddenly hoarding all the high quality products.

Another interesting realm where signaling can be applied is the one of procurement. At least larger companies often have a centralized procurement function that is responsible for managing the main suppliers, conducting supplier selection and awarding contracts. When looking for a supplier, the procurement function has to create competition between the candidates to find out the best one for the given quality and specifications. However, a potential supplier must invest resources in the bidding process without any certain revenue or supply contract. If the potential supplier thinks that the expected payoff from the contract is low, he will not put too much effort in to the bidding. This might lead to the customer company receiving only a few offers or the offers being of poor quality and difficult to compare against one another. To increase the number and quality of the offers, the customer company would have to signal, in a reliable way, that the potential contract has a guaranteed, maybe even high value. The way to do this is to commit, before the bidding starts, to awarding the contract to one bidder and to one bidder only, and beforehand abstaining from any cherry picking between offers. This of course requires that a large enough pool of bidders is invited and that the background research on them has been done, since procurement will have to commit to one of the invited bidders. Therefore, their capabilities have to match the requirements well enough, so that in principle any solution could be accepted.

As we see, tying your hands or incurring costs to show your commitment and capability are some ways to reliably signal who you are. Consequently signaling can help you leverage those advantages that might otherwise go unnoticed. Information is power, and shared information can be overpower.

Incomplete and complete information – Strengthen your advantage by revealing it

I have nearly finished the Open Yale course on game theory. Just the final exam is still to be done. Although my exam won’t be graded, I’ll use the opportunity to tackle those problems with the new tools and mind-set and see, what I have learned and understood.

The last topics in the course were about incomplete information and signaling. In game theory incomplete information refers to games, where at least one player is uncertain of the other players’ so called types. A type describes a player’s preferences, available strategies and payoffs. For example, in a real life negotiation we often have incomplete information, being uncertain how tough the opponent is and how he values the potential outcomes. A related concept, imperfect information, we have already encountered when a player does not know, which strategy his opponent has played, although he knows all the potential ways the game could be played and the related preferences and payoffs.

Incomplete information is a realistic condition and thus often encountered when modeling real world phenomena. Often incomplete information is also seen as an asset from the informed party’s point of view: for example, a company knowing both its own and its competitor’s cost structure is intuitively thought to have an advantage, since it knows more, and information is power, as the cliché goes. But what exactly is the benefit of knowing your competitors’ costs and how valuable is it? A further question, with an even less intuitive answer, is, whether such a better-informed company would actually profit from making its own cost structure public. In the following, I will answer these questions from the perspective of the Cournot duopoly model, but taking it a bit further from the standard treatment.

The following is based on the Open Yale course Econ 159a Game Theory and on Strategies and Games: Theory and Practice, 1999 by Prajit K. Dutta.

Incomplete information in a Cournot duopoly

In the standard Cournot duopoly two profit-maximizing companies manufacture substitute products with identical constant marginal cost and serve the same market. The outcome of the model is that in equilibrium both companies produce the same amount of goods, in total more than the monopoly quantity but less than the quantity in free-market competition. Consequently, the companies also get higher than free-market prices and profits, but not the monopoly prices or profits due to the competition between the two rivals.

In an incomplete information Cournot duopoly at least one company (I will only consider this case in the following) has its marginal cost different from those of its competitor’s and this exact cost is unknown to the competitor, while the competitors marginal cost is known by both parties. On average, the company with non-public marginal cost has the same marginal cost as its competitor and the competitor knows this. In equilibrium, the company with the unpublicized marginal cost produces more, reducing the market price, and gets higher profits when its marginal cost is lower than the average. If its marginal cost is above the average, its produced quantity, the price and its profit move to the opposite directions. Somewhat surprisingly, if the company can without costs and credibly reveal its lower marginal cost to the competitor, it will benefit even more, while a company with above average marginal cost would suffer even more.

In table 1 I have summarized the produced quantities, prices and profits to show that revealing its lower than average cost structure in a Cournot duopoly is indeed beneficial for a company. The intuition is the following. When the low-cost producer (company 2) does not reveal its marginal costs, the competitor (company 1) reacts based on the average marginal cost, producing the standard Cournot-quantity. This is the logical reaction, since on average company 2 has the same marginal cost as company 1. However, company knows its own cost structure and produces more than the standard Cournot-quantity, since this is profitable due to the lower than average marginal cost. Likewise, a high-cost company 2 would produce less than the standard Cournot-quantity, but would not suffer from its competitor’s producing more than its standard Cournot-quantity, since again company 1 is reacting as if company 2 had the average marginal cost.

If company 2 could credibly make its lower than average marginal cost public, company 1 would know that company 2 can and will produce more due to its lower marginal cost, driving the price down. In this case company 1 would react to the actual lower than average marginal cost of company 2, not on company 2’s expected average marginal cost. Consequently, company 1 will produce less to counter the price erosion and maximize its margins in this situation. Also, knowing the reaction of company 1, company will produce more, making also higher profits than in the case of incomplete information. Correspondingly, a high-cost producer will also suffer more if its cost structure becomes public, so it will try to keep this information secret.

Table 1 summarizes the Cournot-quantities, prices and profits in different cases of a Cournot duopoly, with companies 1 and 2, company 2 being the low- / high-cost producer who always knows company 1’s constant marginal cost. In the table I have used the following notation:

  • P = a – bQ > 0, where P is the unit price a and b are non-negative constants
  • Q = Q1 + Q2, where Q1 and Q2 are the quantities produced
  • c is the average marginal cost
  • ε is the difference of the low-/high-cost company’s from the average marginal cost
  • c + ε > 0
  • a low-cost company 2 has ε < 0, a high-cost company 2 has ε > 0

I have also used P’, P’’ and similar expressions for the quantities to separate the cases of standard Cournot duopoly from those of incomplete and complete information with company 2 having lower or higher than average marginal cost.

Cournot_table

From table 1 it becomes clear that profits for company 2 increase (decrease) for negative (positive) values of ε, when we move from the standard Cournot duopoly to a duopoly with different marginal costs between companies with incomplete and finally complete information. Thus, a cost advantage is strictly profitable and making it public increases the profit. The profits of company 1 decrease (increase) for negative (positive) values of ε,  when moving from the standard Cournot duopoly to a duopoly with different marginal costs between companies with incomplete information. Thus, a cost disadvantage of one company is strictly profitable for the other companies and their profits increase with this cost disadvantage. With some algebra, it can be shown that company 1’s profits in the case of complete information are lower than in the case of incomplete information and that the difference is (ε/6)*(a + c + 2ε), which is negative (positive) for negative (positive) values of ε.

Information cascading and revelation

As argued previously, a low-cost producer in a Cournot duopoly has the incentive to make its cost structure public to maximize its benefits, i.e. profits. More precisely, the producer with the lowest marginal cost has this incentive, since non-disclosure would lead the competition treating the company as an average-cost producer. Therefore, the producer with the lowest marginal cost will reveal its costs. Now, if there are more competitors in the market, the producer with the second lowest marginal cost also has the incentive to reveal its costs, although they are higher than those of the cost leader. If the producer with the second lowest marginal cost did not reveal its costs, it would now be treated as an average producer in the remaining group of companies: the lowest cost producer has now been excluded from the average, since its cost structure is known. But being treated as an average-cost producer is clearly sub-optimal, if a company’s marginal cost is below the average. Thus, the producer with the second lowest marginal cost will also reveal its costs. This logic can be followed right to the last company on the market, to the one with the highest marginal cost.

When one company, the one with the lowest marginal cost, reveals its costs, this leads to information cascading, since the companies with the next lowest costs now want to differentiate from competitors with higher marginal cost, even if they have costs above the average over all companies at the market. This makes the marginal costs of all companies public, or at least their relation to one another. The last company is evidently going to be the one with the highest costs, since it wishes to stay hidden among the masses and has thus not yet revealed its costs. It follows that the last company does not have to reveal its costs; the competition will be able to infer, that the last company has the highest marginal cost and will react accordingly, even if not knowing the exact costs.

The dog didn’t bark – the value of undisclosed information

The previous exercise shows an important, broader real-world application of revealing information. The absence of evidence or the absence of information can be a substantial piece of information. If a company does not want to reveal its cost structure, it is likely due to its high (marginal) costs, at least in the context of the Cournot duopoly model. But not all markets are like the Cournot model, so the implications are not completely generalizable. But still, even in a free market where all companies are price takers, revealing your costs might help you; not in gaining market share, since your produced quantities do not affect the prices, but in keeping competitors from entering price wars. By revealing your costs, you can potentially indicate that you have the largest margin and can thus come out on top, should a competitor start a price war. But by revealing your costs, you may be able to keep your opponents at bay, since they know in advance that their chances of winning a price war are slim.

The difficulty in revealing the costs and gaining the associated advantage is that it is not always evident, which company has the lowest costs. Therefore, companies may restrain from revealing their costs, even if they would benefit from their low costs more when if became public. But without knowing that it is a low-cost producer a company risks becoming disadvantaged, should it in fact be a high-cost producer.

In the case of very similar consumables and bulk goods (e.g. oil, agriculture products) good estimates on the relative costs among competitors may be made and thus own cost advantages can be made public to keep the competition from starting a price war. For highly specialized and small-series products with small markets, revealing a cost advantage would give the power to affect the demand, prices and quantities produced, but finding out who is the low-cost producer might be more difficult. Specialty products often enjoy higher margins, so that product prices are poor indicators of true costs.

In conclusion, revealing information to your rivals may give you an advantage, and not revealing information already conveys information about your situation. Furthermore, even if you are not in the best position in a group of competitors, being able to separate yourself from the even weaker ones may give you an advantage, and to do this you must credibly signal that you have an advantage over some of the competitors. This will then lead to all with a relative advantage to revealing this information.

Signaling is yet another topic in game theory and discusses how different types of players, e.g. good and bad workers, can credibly be identified: an employer has an incentive to get the good and, presumably, more productive good workers, and the good workers have the incentive to reveal themselves in hopes of a higher salary. But the bad workers also have an incentive to be taken for good workers, due to the potential higher salary. Therefore, if the higher workers are to earn more and employers are to pay “correct” wages to the two types of workers, we need a credible signal that the good worker can and will give, but the bad worker cannot or does not want to give. I will return to this topic in my next post.

Reducing car traffic – a game theoretic view

Last week I wrote about the Slow-up event and the question of reducing car traffic. I also mentioned that reducing traffic is a question of incentives.

In Switzerland, the citizens will be voting on June 5th over a motion to direct the currently collected taxes on fuel in full to the upkeep of the road network. As in many EU countries, the different taxes, including a mineral oil tax or an excise duty and the value-added-tax, the total price of fuel in Switzerland contains over 50% taxes.

The motion argues that by directing the collected fuel taxes to the upkeep of the road network clear benefits will be gained:

  • streets will become safer in the living areas as heavy traffic is rerouted to new routes avoiding these areas
  • traffic jams can be reduced by building more capacity. This will also reduce the costs of lost time spent in traffic jams.

Both listed benefits focus on the aspect of added capacity, since adding capacity supposedly directs traffic outside of living areas and also reduces traffic congestion.

Added capacity does not necessarily reduce traffic jams but adds throughput

At first the adding of capacity seems like an intuitive answer to rerouting traffic and reducing traffic jams, but a game theoretic analysis shows this is not necessarily the case.

With an existing capacity of the traffic network an individual person has the choice of using his own car, taking public transport or postponing (or even cancelling) his trip. As long as the time TC by own car is the same or lower as the time TB used by taking public transport, the individual will take his own car. Also, when the traffic starts jamming (or the person expects traffic jams), he will postpone or even cancel his trip, if the value T of the time lost in the traffic jam is higher than the gained benefit B of making the trip. As the traffic consists of multiple individuals making these evaluations, the amount A of traffic on the streets is limited by inequalities TC < TB and T < B. As long as these two inequalities hold, more people will be taking their own car and will be making their trips, instead of postponing or cancelling them. On balance people are indifferent between their options, each according to their individual valuations.

I am assuming here that most people have roughly the similar values of TC, TB, T and B: since most people are “ordinary” (in the central portion of the bell-curve) in many aspects, it is reasonable to assume that they value their time quite similarly. This means that while an individual person might choose co postpone his trip. because the traffic network is already full, he might do the same try another day, thus “forcing” another person with the same values of T and B to abstain from traveling that day. It is worth mentioning that some individuals with extremely high values of B might never end up postponing their trip.

Now, let’s observe what will happen, when the capacity of the road network is increased to C2 > C. It immediately follows that the travel time, whether by own car or by public transport is lowered, since there is more space for the current amount of traffic A. Also, taking own car becomes more profitable and the inequality TC < TB will hold , since the public transport is always slowed down by its fixed routes and required stops. Since the use of own car becomes more profitable and the value of making the trip exceeds the value of postponing or cancelling it, more trips will be made by own car and the total amount of trips made will increase. This increase will continue, until the inequalities TC < TC and T < B no longer hold, at which point people as a whole are again indifferent between their respective choices.

Now we have a larger amount of traffic A2 > A (i.e. vehicles on the road) than before and people are using the same amount of time for their travel, since the increase in traffic takes place at the margin (T <  B and TC < TB) where we have a lot of people with similar preferences. When, for some people, TB > TC and T < B, they take public transport, so the increased capacity contributes to the use of own car and the total amount of people traveling. The increase in road capacity did not reduce the congestion or travel times, but just increased the use of own car and the number of trips made.

Changing the payoffs changes traffic

We observed that with added capacity of the road network, people will increase the use of own car and will complete their trips as long as TC < TB and T < B. This observation directly points out, that changing behavior requires changing people’s payoffs. It is also necessary to be careful about, whose payoffs should be changed, to have the desired effect. To observe a change in a person’s equilibrium strategy-mix, the payoff’s of others must to have changed, so that the first person is again indifferent between his choices in the new equilibrium. If, as the mentioned motion suggests, traffic is to be reduced, each individual person would be traveling less on average in equilibrium. It then follows that the benefits of traveling in relation to its costs would have to be lower than initially.

A strategy-mix can be interpreted as:

  • the probability of a single person choosing a specific pure strategy
  • a person’s expectations towards the other players’ choosing a specific pure strategy
  • a portion of players playing a specific pure strategy.

In the case of traffic jams, the first and third interpretation are the most obvious ones. We would a observe a person traveling less, when all other people were indifferent between traveling or postponing their travel after their payoff for traveling would have been reduced. Likewise, we would observe people, to whom travel is less valuable, traveling less, if at all, while people, to whom travel is more valuable, would still be traveling.

Increase costs, decrease traffic

To reduce traffic jams, the cost of trips made has to be higher in comparison to the value gained from making the trips, if the amount of traffic is to be reduced. For example, by imposing road tolls such that their costs exceed the value gained from taveling, some people will choose not to travel or postpone their travel to a time, when the collected road tolls are lower or the value of the trip is higher. Here it has to be observed that collecting road tolls on some roads only redirects traffic to roads without tolls. This movement continues, until the value of the time lost in traffic jams on the roads without tolls equals the cost of the road tolls, at which point people are indifferent between using a toll road or one without tolls. To decrease traffic overall, its costs relative to the value gained have to increase on all possible routes.

From the motion’s viewpoint this means that in order to reduce traffic and therefore increase the traffic safety in living areas, heavy traffic would have to be more costly in those areas. Another option would be to ban heavy traffic in those areas, but this would lead to potential congestion and too much traffic on other routes. Just adding capacity by building side routes means that more traffic in total can pass through, but in equilibrium the amount of traffic through the living areas will be such, that it takes as long to reach the destination as it would take by taking the route avoiding the living areas.