Functions and mental health

Going through my old university mathematics book (ISBN 952-91-9157-X), I read a chapter introducing and discussing functions. At the end of the chapter (see page 163), I found an exercise that took an interesting perspective on using functions to think about relations between everyday things.

I first introduce a definition of a function, adapted from the book, and then the mentioned exercise, also from the book.

A definition of a function:
A function is a triple (set A, rule f, set B). This means that we have a rule f that pairs each member of set A with exactly one member of set B.

This “pairing” (or mapping) is written as y = f(x), y ∈ B for every x ∈ A. We might also write f: A → B, which means that rule f maps set A, f’s domain, to B.

As we can see from the definition, pairing multiple, different members of set A with the same member of set B is not precluded, but each member of A is paired with one and only one member of set B. In addition, it may be that not all members of set B have a pair in set A. I’ll now turn to the exercise.
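For finite sets, the definition can be checked mechanically. The following is a minimal sketch, not taken from the book: the helper name `is_function` and the example sets are my own assumptions, and the rule f is represented simply as a collection of (x, y) pairs.

```python
def is_function(pairs, domain, codomain):
    """Return True if `pairs` pairs every member of `domain`
    with exactly one member of `codomain`."""
    images = {}
    for x, y in pairs:
        # Every pair must stay inside the two declared sets.
        if x not in domain or y not in codomain:
            return False
        # A second, different partner for the same x breaks "exactly one".
        if x in images and images[x] != y:
            return False
        images[x] = y
    # Every member of the domain must have received a partner.
    return set(images) == set(domain)


A = {1, 2, 3}
B = {"a", "b", "c"}

shared_image = {(1, "a"), (2, "a"), (3, "b")}             # two x share "a", "c" is unused
two_images = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}     # 1 is paired twice
unmapped_member = {(1, "a"), (2, "a")}                    # 3 has no partner

print(is_function(shared_image, A, B))
print(is_function(two_images, A, B))
print(is_function(unmapped_member, A, B))
```

Only the first relation passes: sharing a partner and leaving a member of B unused are both allowed, while a double pairing or an unmapped member of A is not.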

Psychiatrist’s reception as a function

A psychiatrist’s reception can be modeled as a triplet (psychiatrist, patient, diagnosis). Under which assumptions are the following cases functions?

a) psychiatrist: patients → diagnoses
b) patient: psychiatrists → diagnoses
c) psychiatrist: diagnoses → patients
d) patient: diagnoses → psychiatrists

We’ll examine each case from a) to d) individually. In general, based on the notation introduced with the definition of a function, each of the cases a) to d) has the following form:

rule: Set A →  Set B

Our chosen triplet (psychiatrist, patient, diagnosis) has to be quite restricted, if cases a) to d) are to be functions. First, a psychiatrist is not allowed to change his diagnosis for a specific patient, since this would preclude mapping the respective members of Set A each to a single member of Set B, as we’ll soon see. In addition, Set A has to be quite restricted, if all its members are to be mapped. We therefore have one assumption that pertains to all cases a) to d):

Assumption 1 (A1):
A psychiatrist cannot change his diagnosis for a specific patient.

and another assumption whose exact form depends on the respective case:

Assumption 2x (A2x, where x = some case a)-d)):
Set A is limited so that each of its members is mapped to Set B.

Cases a) to d) as functions

Starting with case a), a psychiatrist maps every patient to a diagnosis, potentially multiple patients to the same diagnosis. Assumption A2a) limits the set of patients to those patients that the psychiatrist has diagnosed. The psychiatrist has given every patient some diagnosis, potentially the same to each one.

In case b), a patient maps every psychiatrist to a diagnosis, potentially multiple psychiatrists to the same diagnosis. Assumption A2b) limits the set of psychiatrists to those that have diagnosed the patient. Each psychiatrist may have given the patient the same diagnosis, but not necessarily.

In case c), a psychiatrist maps every diagnosis to a patient, potentially multiple diagnoses to the same patient. Assumption A2c) limits the set of diagnoses to those that the psychiatrist has given to some patient. Each diagnosis may have been given to the same patient, but not necessarily.

In case d), a patient maps every diagnosis to a psychiatrist, potentially multiple diagnoses to the same psychiatrist. Assumption A2d) limits the set of diagnoses to those that the patient has received from some psychiatrist. Each diagnosis may have been given by the same psychiatrist, but not necessarily.

As we can see from the cases above, a large (> 1 member) Set A may give us some interesting real-life consequences in each case:

  • In case a) the psychiatrist may give each patient the same diagnosis.
  • In case b) multiple psychiatrists may all give distinct diagnoses to the same patient.
  • In case c) the psychiatrist may give multiple diagnoses to the same patient.
  • In case d) a patient may receive multiple diagnoses from multiple psychiatrists or from just one.
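The four cases can also be tried out on a small set of visit records. This is only a sketch under my own assumptions: the records, the names `RECORDS`, `images` and `is_function`, and the labels "Dr. A", "P1", "D1" are all hypothetical. Restricting Set A to the elements that actually occur in the records plays the role of assumption A2.

```python
from collections import defaultdict

# Hypothetical visit records: (psychiatrist, patient, diagnosis).
RECORDS = [
    ("Dr. A", "P1", "D1"),
    ("Dr. A", "P2", "D1"),
    ("Dr. B", "P1", "D2"),
]

PSYCHIATRIST, PATIENT, DIAGNOSIS = 0, 1, 2


def images(records, rule_pos, rule, src_pos, dst_pos):
    """For one fixed rule (a psychiatrist or a patient), collect the set of
    Set B members that each occurring Set A member is paired with."""
    result = defaultdict(set)
    for record in records:
        if record[rule_pos] == rule:
            result[record[src_pos]].add(record[dst_pos])
    return dict(result)


def is_function(image_sets):
    """A mapping is a function if every Set A member has exactly one image."""
    return all(len(targets) == 1 for targets in image_sets.values())


case_a = images(RECORDS, PSYCHIATRIST, "Dr. A", PATIENT, DIAGNOSIS)
case_b = images(RECORDS, PATIENT, "P1", PSYCHIATRIST, DIAGNOSIS)
case_c = images(RECORDS, PSYCHIATRIST, "Dr. A", DIAGNOSIS, PATIENT)
case_d = images(RECORDS, PATIENT, "P1", DIAGNOSIS, PSYCHIATRIST)

for name, case in [("a", case_a), ("b", case_b), ("c", case_c), ("d", case_d)]:
    print(name, is_function(case))
```

With these particular records, cases a), b) and d) are functions, while case c) is not: Dr. A has given diagnosis D1 to two patients, so D1 has no single patient to be mapped to. Case c) therefore also needs each diagnosis in Set A to have been given to only one patient.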

Now we have defined under which assumptions cases a) to d) are functions. At the same time, we see that these assumptions are not enough to guarantee that our functions correspond to real life. For our function to correspond to real life, we would intuitively like the following conditions to be true:

  • A single patient gets a single, unchanged diagnosis from a single psychiatrist on each visit on the short term. (C1)
  • A single patient gets the same (or almost the same) diagnosis from different psychiatrists on the short term. (C2)
  • A psychiatrist’s diagnosis may change on the long term. (C3)
  • A change in a psychiatrist’s diagnosis can be very dramatic compared to the previous diagnosis. (C4)
  • Similar patients get similar diagnoses. (C5)

A more realistic function

As we saw in the analysis before we have two assumptions that define whether cases a) to d) are functions. Assumption 2 cannot be said to be either realistic or unrealistic, since we only define Set A, which is just an arbitrary choice. Assumption 2 merely limits the usage and potential scope of our function: the more limited our Set A, the fewer are the situations where the respective case is a function.

Assumption 1, however, can be evaluated to be more or less realistic. A psychiatrist changing his diagnosis over time is not something we can define, rather it is a given that is either true or false: between now and a given point of time in the future a psychiatrist either has or hasn’t changed his diagnosis of a patient.

We see that A1 can be evaluated as being realistic or not, depending on whether we are considering a shorter or a longer period in time. On the short term it sounds reasonable to assume that a psychiatrist’s diagnosis of a single patient does not change (C1), while on the long term we might assume some changes (C3), especially if the patient has some mental condition. Even for currently mentally healthy people A1 is not necessarily true in the long term, since a person might develop a mental disorder (C4). However, we can overcome the long-term limitation of A1 by creating a composite function for the long-term perspective (C3). This function consists of the respective short-term functions, whose time horizons do not overlap but form a continuum. Thus, this long-term function allows a psychiatrist to change his diagnosis of a patient over time, but for shorter periods, the diagnosis stays unchanged.
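Such a composite function can be sketched as a lookup over consecutive, non-overlapping periods. The period boundaries (in days), the diagnosis labels and the name `diagnosis_at` below are illustrative assumptions of mine, for one fixed psychiatrist and one fixed patient.

```python
from bisect import bisect_right

# Start days of consecutive, non-overlapping periods, and the diagnosis
# that stays unchanged within each period (hypothetical values).
PERIOD_STARTS = [0, 180, 540]
PERIOD_DIAGNOSES = ["healthy", "mild depression", "healthy"]


def diagnosis_at(day):
    """Long-term function: pick the short-term period containing `day`
    and return the single diagnosis valid in that period."""
    if day < PERIOD_STARTS[0]:
        raise ValueError("day precedes the first period")
    index = bisect_right(PERIOD_STARTS, day) - 1
    return PERIOD_DIAGNOSES[index]


for day in (0, 179, 180, 600):
    print(day, diagnosis_at(day))
```

Within each period the diagnosis is constant, so A1 holds there, yet the diagnosis may differ from one period to the next.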

By introducing time as a further variable we have redefined our triplet (psychiatrist, patient, diagnosis) as a new triplet (psychiatrist, patient, diagnosis(time)), where the diagnosis is a function of time. By defining psychiatrist = psychiatrist(i) where i is the index of the respective psychiatrist we also introduce C1 into our triplet, thus creating a triplet (psychiatrist(i), patient, diagnosis(time)). In order to consider C5, we also adjust our original triplet in such a way that each patient is also a function of his attributes that define his mental health, which then influence the respective diagnosis.

Taking case a) now as an example for using the new triplet we observe case a)*:

a)* Psychiatrist(i): patients(attributes) → diagnoses(time)

In case a)* a psychiatrist always gives each of his patients the same diagnosis on the short term, but is allowed to give a different diagnosis on the long term. Previously we defined that the diagnosis remains unchanged on the “short term” but did not define short term. In fact, we could define our time intervals to be infinitesimally short. In this case, the psychiatrist would be allowed to change his diagnosis as quickly as he can, but this is hardly reasonable. As a gut feeling, I would say that our time interval, during which the diagnosis has to stay the same, could be anything between months and years, depending on the patient. Thus, we run into quite subtle questions when deciding whether we have defined a function or not. Additionally, we would have “patient” as a variable when defining the time interval, thus ending up with diagnosis = diagnosis(time(patient)), making our function even more complicated. Actually, a clearer notation for our new function would be

a)*’ Time: Attributes (Psychiatrist(i): patients(attributes) → diagnoses)

Now a given point in time defines which attributes belong to a specific patient. Then this patient receives from a psychiatrist a specific, unchanged diagnosis within the defined time period, and patients with similar attributes receive the same diagnosis. This seems more like what we would expect to see in real life.

I conclude by saying that case a)* also allows a patient to get different diagnoses from different psychiatrists. For C2 to hold we must also require that the difference between the diagnoses provided by two different psychiatrists is always “small” enough. Here “small” means, roughly speaking, that each psychiatrist could agree on the main points of the other’s diagnosis, while they might have a difference of opinion on some details.

Final thoughts

I was thinking about writing just a few paragraphs on this but it turned out a bit longer, and I could have written even more. This whole exercise got me thinking more about what functions are and how we define them. Especially the properties of nested functions like diagnosis = diagnosis(time) and patient = patient(attributes) awoke my interest, since they show how careful we have to be when defining mappings and boundary conditions for them when modeling the real world. It is also clear that at some point we have to make simplifications, if we want to have a usable model and avoid getting entangled in endless nested functions.

I would say that case a)*’ as a function captures quite well the essence of a psychiatrist’s reception, while also considering potentially multiple psychiatrists, patient attributes and temporal changes in these attributes. The model is surely not perfect, but to me it’s a good start.

Nim and simple mathematical proofs 3/3

In this post I present to you yet another game, Northcott’s Game, and another example of how new problems can be solved using experience and knowledge gained from previous ones. Again, the post is based on Thomas S. Ferguson’s book Game Theory.

Northcott’s Game is a game where black and white pieces are placed on a checkerboard, one piece of each color on each row (see Picture 1 below). The players take turns in moving their piece on a single row and only on a single row per turn. A piece cannot jump over another piece. The last player to move his piece wins. In its essence, this game is about blocking the other player’s pieces in such a way that your last move blocks their final piece, ensuring you the win. But how to do that? It turns out that knowing how to play Nim is a huge advantage.


Picture 1: Northcott’s Game, start position.








Winning in Northcott’s Game

Before going deeper into Northcott’s Game we’ll recall the definition of a combinatorial game. A combinatorial game is such that (see Ferguson’s Game Theory, Part 1, chapter 1, page 4):

  1. It is played by two players.
  2. There is a usually finite set of possible positions in the game.
  3. The rules of the game specify which moves are legal. If both players have the same options of moving from each position, the game is called impartial. Otherwise the game is called partisan.
  4. The players alternate moving.
  5. The game ends when a position is reached from which no moves are possible for the player whose turn it is to move. Under the Normal Play Rule, the last player to move wins. Under the Misère Rule the last player to move loses.
  6. The game ends in a finite number of moves no matter how it is played. (Ending Condition)

It has to be noted that Northcott’s Game is not a variant of Nim. More specifically, it is a partisan game, since both players have distinct moves. Furthermore, Northcott’s game does not satisfy the Ending Condition, since it is possible to keep playing forever. Thus, Northcott’s Game is not a combinatorial game due to not obeying the Ending Condition, but is still related to them. Therefore, knowing how to play combinatorial games, Nim in particular, is helpful in winning in Northcott’s Game.

As often in game theory, we start from the end, using backward induction, and ask ourselves what the end of the game would look like. To make this easier, Picture 2 presents an abbreviated version of the game in Picture 1, showing only the top 3 rows.

Picture 2: Abbreviated Northcott’s Game, start position.





Picture 3 shows a “near penultimate” position of the game in Picture 2, where the white player has moved his piece in the lowest row next to the black piece in the same row. It is now the black player’s turn, and he can only move right, leaving the white player with the final move. In this final move, the white player “blocks” the black player’s last piece and wins the game.

Picture 3: Abbreviated Northcott’s Game, “near penultimate” position.





It is clear that the position presented in Picture 3 is a potential P-position. Our next question is, how to get into this penultimate P-position, and this is where Nim steps in.

Play first Nim, then win in Northcott’s Game

The game before the penultimate P-position in Northcott’s Game can be interpreted as Nim with the following parallels:

  • Each row is a pile of sticks.
  • The number of empty squares between the white and the black piece is the number of sticks in a pile.

Now the target is to make the Nim-sum of the gaps, i.e. of the numbers of empty squares between the two pieces on each row, zero to reach a P-position, and eventually the terminal position. The terminal position in this game is a position where the pieces are next to each other in each row, since then the number of squares between the pieces will have been reduced to zero on each row. The player who moved last before this position will win, since the next player can only move towards his own end of the row, and the player who reached the terminal position can always close the resulting gap again, eventually making the last move and winning the game. This is generalizable to any number of rows.
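The Nim part of this strategy can be sketched in a few lines. The gap values below are made up and are not read from the pictures, and the function names `nim_sum` and `winning_move` are my own. The sketch only covers moves that shrink a gap, as in Nim.

```python
from functools import reduce
from operator import xor


def nim_sum(gaps):
    """Bitwise XOR of the numbers of empty squares between the pieces."""
    return reduce(xor, gaps, 0)


def winning_move(gaps):
    """Return (row, new_gap) that makes the Nim-sum zero by closing part of
    one gap, or None if the Nim-sum is already zero (a P-position)."""
    total = nim_sum(gaps)
    if total == 0:
        return None
    for row, gap in enumerate(gaps):
        target = gap ^ total
        # Only a move that shrinks the gap corresponds to removing sticks.
        if target < gap:
            return row, target
    return None  # not reached when total != 0


gaps = [3, 4, 5]           # hypothetical gaps on three rows
move = winning_move(gaps)
print(nim_sum(gaps), move)

row, new_gap = move
gaps[row] = new_gap
print(gaps, nim_sum(gaps))
```

If the opponent then widens a gap by retreating, the same player can close it again by the same number of squares and restore the zero Nim-sum, which is the argument made above.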

Here we also see why Northcott’s Game could go on forever. While in Nim a player must always remove a stick from one of the piles, thus eventually leaving no piles on the table, in Northcott’s Game the players could keep moving their pieces back and forth without even trying to block one another. This kind of play would be pointless, but is possible, and therefore Northcott’s Game does not obey the Ending Condition for combinatorial games.

Like Nimble, Northcott’s Game is a good example of how to use existing models and knowledge to solve new problems, and how a solution to a novel problem can be represented as a combination of existing solutions, executed as steps one after another. I hope you will have as much fun with these games and solving them as I am having.

Signaling – Show that you can, show that you mean it

My previous post was about complete and incomplete information and about revealing your advantages to your opponents to gain an even bigger advantage. I finished with a short discussion on signaling: How to credibly differentiate yourself from others to gain a higher payoff?

In his course on game theory, Ben Polak presents a good example of signaling using a simplified model of the job market. Here I present it, with possibly different figures, but the idea holds:

  • There are two types of workers only: good and bad
  • 10% of all workers are good, 90% are bad
  • a good worker produces 50 dollars worth of goods per day, a bad worker only 20 dollars worth
  • employers cannot tell the difference between a good and a bad worker before hiring them
  • the two types of workers are otherwise identical, just their output is different
  • this game lasts only one day, to keep the calculations simple

On average, an employee produces 23 dollars worth of goods, so the average salary level is also 23 dollars. Therefore, the bad workers earn slightly more than they produce, and the good workers a lot less than they produce: a good employee would earn 50 dollars if he could signal credibly to the employer that he is a good worker. Of course, a bad worker would also want to earn 50 dollars instead of 20 or 23. Thus we need something to differentiate between the two, we need a signal.
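The average of 23 dollars follows directly from the listed shares and outputs. A short check, with variable names of my own choosing:

```python
# Shares in percent and daily output in dollars, as listed above.
SHARE_GOOD, SHARE_BAD = 10, 90
OUTPUT_GOOD, OUTPUT_BAD = 50, 20

# Expected output of a randomly hired worker, which is also the pooled salary.
average_output = (SHARE_GOOD * OUTPUT_GOOD + SHARE_BAD * OUTPUT_BAD) / 100

print(average_output)
print(average_output - OUTPUT_BAD)   # what a bad worker gains from pooling
print(OUTPUT_GOOD - average_output)  # what a good worker loses from pooling
```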

A signal that differentiates the good workers from bad workers, or any types from one another in general, has to be such that good workers will always give the signal and bad workers will never give the signal. For a signal to be credible, its costs have to reduce the net salary of a bad worker enough, but must not reduce the net salary of a good worker too much: this way the bad workers will not be willing to give the signal while the good workers will always give it.

As an example of a signal, Mr. Polak mentions the possibility of dancing on the table in a job interview and singing a song about how good an employee you would be. Such a signal is obviously costly, being humiliating at the very least, but it does not help differentiate the two types of workers. After all, it is equally humiliating for both types, so even if a good worker would have the incentive to give the signal, the bad worker would have the same incentive, in order to be identified as a good worker and thus receive the salary of 50 dollars. Clearly not all costly signals can separate the worker types from one another.

Education as a signal

It turns out that education is a form of signaling: among other things, education serves as a signal in the job market. Let’s introduce a two-year MBA that either type of worker can take. The costs of tuition are the same for both and so are those for housing, transport and food. We might argue that the two types have different opportunity costs in taking an MBA instead of working, but both types would earn 23 dollars since we do not yet have a signal to separate them in the job market. So why does the good worker do the MBA while the bad worker doesn’t, as I am proposing? The difference in the costs is the effort: the mental work, the hours of sitting in lectures, doing homework and assignments. For the bad worker the required effort to finish the MBA degree is much higher than for the good worker. So much higher that receiving an MBA would reduce his net salary below current levels, even below 20 dollars.

Of course, in this example the figures can be forced to be in such a relation to one another that the signaling works. E.g. if we make the total costs of an MBA, including the effort, to be 10 dollars per year for the good worker and 20 dollars per year for the bad worker, it is obvious that the good worker will do the MBA and receive a net salary of 30 dollars after being identified as a good worker and hired by an employer. Conversely, the bad worker will not take the MBA, since his net salary for doing an MBA and being identified as a good worker is 10 dollars, which is below the 20 dollars he would receive otherwise. In addition, the employers have to believe that good workers, and good workers only, take an MBA. Otherwise the good workers might, regardless of their MBA, be identified as bad workers, reducing their incentive to take an MBA.

Even if the figures in the above example are arbitrary, the main point is that the signaling mechanism has to be such that it will provide reliable signals and no type has the incentive to deviate. Thus, when creating the mechanism, the related figures actually have to be chosen in such a way that the signaling is reliable and adjusts the payoffs properly. In our MBA example a one-year MBA would not suffice, since a bad worker would do the MBA and receive a net salary of 30 dollars, instead of the 20 dollars he would receive otherwise. On the other hand, a one-year MBA could be made a lot harder and work-intensive, so that the per-year costs are increased and, again, only the good workers go for the education.
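The incentive check behind the MBA example can be written out as a small sketch. The function name `separates` is my own, and I assume, as in the text, that a worker without the degree is taken for a bad worker and paid 20 dollars, while a worker with the degree is paid 50 dollars. The cost of 40 dollars per year in the last call is a hypothetical figure for a "harder" one-year MBA.

```python
def separates(years, cost_good, cost_bad, wage_good=50, wage_bad=20):
    """Return True if the degree separates the types: the good worker
    prefers taking it and the bad worker prefers not taking it.
    Costs are per year and include the effort."""
    good_net = wage_good - years * cost_good
    bad_net = wage_good - years * cost_bad
    good_takes_degree = good_net > wage_bad
    bad_takes_degree = bad_net > wage_bad
    return good_takes_degree and not bad_takes_degree


print(separates(years=2, cost_good=10, cost_bad=20))  # two-year MBA from the text
print(separates(years=1, cost_good=10, cost_bad=20))  # one-year MBA, same yearly costs
print(separates(years=1, cost_good=10, cost_bad=40))  # harder one-year MBA (assumed cost)
```

The two-year MBA separates the types, the one-year MBA with the same yearly costs does not because the bad worker would net 30 dollars, and raising the bad worker's yearly cost restores the separation.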

Signaling in other areas of life and work

Signaling is not only useful and used in employee-employer relationships. For example, buyer-seller interactions might also require, or at least benefit from, signaling.

For example, in the case of used cars, an information imbalance between the buyer and the seller can lead to all goods in the market being of poor quality. In such a case, the potential sellers of higher quality products would have to be able to reliably signal this quality to the potential buyers. The buyers do not have to signal their preferences, since a buyer looking for higher quality products will buy one, if he gets more value from it, and nobody will pay for a good more than the value received from buying and possessing the good. Thus, we do not have the potential problem of customers looking for low quality products suddenly hoarding all the high quality products.

Another interesting realm where signaling can be applied is the one of procurement. At least larger companies often have a centralized procurement function that is responsible for managing the main suppliers, conducting supplier selection and awarding contracts. When looking for a supplier, the procurement function has to create competition between the candidates to find out the best one for the given quality and specifications. However, a potential supplier must invest resources in the bidding process without any certain revenue or supply contract. If the potential supplier thinks that the expected payoff from the contract is low, he will not put too much effort into the bidding. This might lead to the customer company receiving only a few offers or the offers being of poor quality and difficult to compare against one another. To increase the number and quality of the offers, the customer company would have to signal, in a reliable way, that the potential contract has a guaranteed, maybe even high value. The way to do this is to commit, before the bidding starts, to awarding the contract to one bidder and to one bidder only, and to abstain beforehand from any cherry picking between offers. This of course requires that a large enough pool of bidders is invited and that the background research on them has been done, since procurement will have to commit to one of the invited bidders. Therefore, their capabilities have to match the requirements well enough, so that in principle any solution could be accepted.

As we see, tying your hands or incurring costs to show your commitment and capability are some ways to reliably signal who you are. Consequently, signaling can help you leverage those advantages that might otherwise go unnoticed. Information is power, and shared information can be even more powerful.

Incomplete and complete information – Strengthen your advantage by revealing it

I have nearly finished the Open Yale course on game theory. Just the final exam is still to be done. Although my exam won’t be graded, I’ll use the opportunity to tackle those problems with the new tools and mind-set and see, what I have learned and understood.

The last topics in the course were about incomplete information and signaling. In game theory incomplete information refers to games where at least one player is uncertain of the other players’ so-called types. A type describes a player’s preferences, available strategies and payoffs. For example, in a real life negotiation we often have incomplete information, being uncertain how tough the opponent is and how he values the potential outcomes. A related concept, imperfect information, we have already encountered: there a player does not know which strategy his opponent has played, although he knows all the potential ways the game could be played and the related preferences and payoffs.

Incomplete information is a realistic condition and thus often encountered when modeling real world phenomena. Often incomplete information is also seen as an asset from the informed party’s point of view: for example, a company knowing both its own and its competitor’s cost structure is intuitively thought to have an advantage, since it knows more, and information is power, as the cliché goes. But what exactly is the benefit of knowing your competitors’ costs and how valuable is it? A further question, with an even less intuitive answer, is, whether such a better-informed company would actually profit from making its own cost structure public. In the following, I will answer these questions from the perspective of the Cournot duopoly model, but taking it a bit further from the standard treatment.

The following is based on the Open Yale course Econ 159a Game Theory and on Strategies and Games: Theory and Practice, 1999 by Prajit K. Dutta.

Incomplete information in a Cournot duopoly

In the standard Cournot duopoly two profit-maximizing companies manufacture substitute products with identical constant marginal cost and serve the same market. The outcome of the model is that in equilibrium both companies produce the same amount of goods, in total more than the monopoly quantity but less than the quantity in free-market competition. Consequently, the companies also get higher than free-market prices and profits, but not the monopoly prices or profits due to the competition between the two rivals.

In an incomplete information Cournot duopoly at least one company (I will only consider this case in the following) has a marginal cost different from its competitor’s, and this exact cost is unknown to the competitor, while the competitor’s marginal cost is known by both parties. On average, the company with non-public marginal cost has the same marginal cost as its competitor and the competitor knows this. In equilibrium, the company with the unpublicized marginal cost produces more, reducing the market price, and gets higher profits when its marginal cost is lower than the average. If its marginal cost is above the average, its produced quantity, the price and its profit move in the opposite direction. Somewhat surprisingly, if the company can credibly and without cost reveal its lower marginal cost to the competitor, it will benefit even more, while a company with above-average marginal cost would suffer even more.

In Table 1 I have summarized the produced quantities, prices and profits to show that revealing its lower than average cost structure in a Cournot duopoly is indeed beneficial for a company. The intuition is the following. When the low-cost producer (company 2) does not reveal its marginal costs, the competitor (company 1) reacts based on the average marginal cost, producing the standard Cournot-quantity. This is the logical reaction, since on average company 2 has the same marginal cost as company 1. However, company 2 knows its own cost structure and produces more than the standard Cournot-quantity, since this is profitable due to the lower than average marginal cost. Likewise, a high-cost company 2 would produce less than the standard Cournot-quantity, but would not suffer from its competitor producing more than the standard Cournot-quantity, since again company 1 is reacting as if company 2 had the average marginal cost.

If company 2 could credibly make its lower than average marginal cost public, company 1 would know that company 2 can and will produce more due to its lower marginal cost, driving the price down. In this case company 1 would react to the actual lower than average marginal cost of company 2, not to company 2’s expected average marginal cost. Consequently, company 1 will produce less to counter the price erosion and maximize its margins in this situation. Also, knowing the reaction of company 1, company 2 will produce more, also making higher profits than in the case of incomplete information. Correspondingly, a high-cost producer will suffer more if its cost structure becomes public, so it will try to keep this information secret.

Table 1 summarizes the Cournot-quantities, prices and profits in different cases of a Cournot duopoly, with companies 1 and 2, company 2 being the low- / high-cost producer who always knows company 1’s constant marginal cost. In the table I have used the following notation:

  • P = a − bQ > 0, where P is the unit price, and a and b are non-negative constants
  • Q = Q1 + Q2, where Q1 and Q2 are the quantities produced
  • c is the average marginal cost
  • ε is the difference between the low-/high-cost company’s marginal cost and the average marginal cost
  • c + ε > 0
  • a low-cost company 2 has ε < 0, a high-cost company 2 has ε > 0

I have also used P’, P’’ and similar expressions for the quantities to separate the cases of standard Cournot duopoly from those of incomplete and complete information with company 2 having lower or higher than average marginal cost.


From Table 1 it becomes clear that profits for company 2 increase (decrease) for negative (positive) values of ε when we move from the standard Cournot duopoly to a duopoly with different marginal costs between companies, first with incomplete and finally with complete information. Thus, a cost advantage is strictly profitable and making it public increases the profit. The profits of company 1 decrease (increase) for negative (positive) values of ε when moving from the standard Cournot duopoly to a duopoly with different marginal costs between companies with incomplete information. Thus, a cost disadvantage of one company is strictly profitable for the other company, whose profits increase with this cost disadvantage. With some algebra and the notation above, it can be shown that company 1’s profit under complete information differs from its profit under incomplete information by ε(a − c + 2ε)/(18b), which is negative (positive) for negative (positive) values of ε as long as a − c + 2ε > 0.
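The three cases can be recomputed numerically from the notation above. This is a sketch under my own assumptions, not a copy of the table: I use the textbook linear best responses, let company 1 produce the standard Cournot-quantity under incomplete information as described earlier, and pick the illustrative values a = 10, b = 1, c = 1 and ε = −3/5. The function names are hypothetical.

```python
from fractions import Fraction


def outcome(a, b, c1, c2, q1, q2):
    """Price and profits for given quantities under P = a - b(Q1 + Q2)."""
    price = a - b * (q1 + q2)
    return {
        "q1": q1,
        "q2": q2,
        "price": price,
        "profit1": (price - c1) * q1,
        "profit2": (price - c2) * q2,
    }


def standard(a, b, c):
    """Both companies have marginal cost c and produce (a - c) / (3b)."""
    q = (a - c) / (3 * b)
    return outcome(a, b, c, c, q, q)


def incomplete(a, b, c, eps):
    """Company 1 reacts to the average cost c; company 2 best-responds
    to that quantity using its true cost c + eps."""
    q1 = (a - c) / (3 * b)
    q2 = (a - (c + eps) - b * q1) / (2 * b)
    return outcome(a, b, c, c + eps, q1, q2)


def complete(a, b, c, eps):
    """Both companies know the costs c and c + eps (asymmetric Cournot)."""
    q1 = (a - c + eps) / (3 * b)
    q2 = (a - c - 2 * eps) / (3 * b)
    return outcome(a, b, c, c + eps, q1, q2)


a, b, c, eps = Fraction(10), Fraction(1), Fraction(1), Fraction(-3, 5)
cases = {
    "standard": standard(a, b, c),
    "incomplete": incomplete(a, b, c, eps),
    "complete": complete(a, b, c, eps),
}
for name, result in cases.items():
    print(name, {key: float(value) for key, value in result.items()})
```

For a negative ε, company 2's profit rises from case to case while company 1's profit falls; flipping the sign of ε reverses both directions.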

Information cascading and revelation

As argued previously, a low-cost producer in a Cournot duopoly has the incentive to make its cost structure public to maximize its benefits, i.e. profits. More precisely, the producer with the lowest marginal cost has this incentive, since non-disclosure would lead the competition to treat the company as an average-cost producer. Therefore, the producer with the lowest marginal cost will reveal its costs. Now, if there are more competitors in the market, the producer with the second lowest marginal cost also has the incentive to reveal its costs, although they are higher than those of the cost leader. If the producer with the second lowest marginal cost did not reveal its costs, it would now be treated as an average producer in the remaining group of companies: the lowest cost producer has now been excluded from the average, since its cost structure is known. But being treated as an average-cost producer is clearly sub-optimal, if a company’s marginal cost is below the average. Thus, the producer with the second lowest marginal cost will also reveal its costs. This logic can be followed right to the last company on the market, to the one with the highest marginal cost.

When one company, the one with the lowest marginal cost, reveals its costs, this leads to information cascading, since the companies with the next lowest costs now want to differentiate themselves from competitors with higher marginal cost, even if they have costs above the average over all companies in the market. This makes the marginal costs of all companies public, or at least their relation to one another. The last company is evidently going to be the one with the highest costs, since it wishes to stay hidden among the masses and has thus not yet revealed its costs. It follows that the last company does not have to reveal its costs; the competition will be able to infer that the last company has the highest marginal cost and will react accordingly, even without knowing the exact costs.
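The cascade can be sketched as a simple loop. This is a simplification under my own assumptions: a company reveals as soon as its marginal cost is below the average of the companies that are still hidden, the cost figures are invented, and the name `reveal_order` is hypothetical.

```python
def reveal_order(costs):
    """Return (order, hidden): the indices of the companies in the order
    they reveal their costs, and the indices that never reveal."""
    hidden = sorted(range(len(costs)), key=lambda i: costs[i])
    order = []
    while len(hidden) > 1:
        average = sum(costs[i] for i in hidden) / len(hidden)
        lowest = hidden[0]
        # With equal costs among the hidden companies nobody gains by revealing.
        if costs[lowest] >= average:
            break
        order.append(lowest)
        hidden = hidden[1:]
    return order, hidden


costs = [5, 3, 8, 4]  # hypothetical marginal costs of four companies
order, hidden = reveal_order(costs)
print(order, hidden)
```

In this example the company with cost 5 reveals even though its cost equals the average over all four companies, because it is below the average of the two companies still hidden at that point. Only the highest-cost company stays silent, and its position can be inferred from that silence.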

The dog didn’t bark – the value of undisclosed information

The previous exercise shows an important, broader real-world application of revealing information. The absence of evidence, or the absence of information, can itself be a substantial piece of information. If a company does not want to reveal its cost structure, this is likely due to its high (marginal) costs, at least in the context of the Cournot model. Not all markets are like the Cournot model, so the implications are not completely generalizable. Still, even in a market where all companies are price takers, revealing your costs might help you: not in gaining market share, since your produced quantities do not affect prices, but in keeping competitors from starting price wars. By revealing your costs, you can potentially indicate that you have the largest margin and would thus come out on top, should a competitor start a price war. This may keep your opponents at bay, since they know in advance that their chances of winning a price war are slim.

The difficulty in revealing costs and gaining the associated advantage is that it is not always evident which company has the lowest costs. Therefore, companies may refrain from revealing their costs, even if they would benefit more from their low costs if these became public. A company that reveals its costs without knowing that it is a low-cost producer risks putting itself at a disadvantage, should it in fact be a high-cost producer.

In the case of very similar consumables and bulk goods (e.g. oil, agricultural products), good estimates of the relative costs among competitors can be made, and thus a company’s own cost advantage can be made public to keep the competition from starting a price war. For highly specialized, small-series products with small markets, revealing a cost advantage would give the power to affect demand, prices and quantities produced, but finding out who is the low-cost producer might be more difficult. Specialty products often enjoy higher margins, so product prices are poor indicators of true costs.

In conclusion, revealing information to your rivals may give you an advantage, and not revealing information already conveys information about your situation. Furthermore, even if you are not in the best position in a group of competitors, being able to separate yourself from the even weaker ones may give you an advantage, and to do this you must credibly signal that you have an advantage over some of the competitors. This will then lead everyone with a relative advantage to reveal this information.

Signaling is yet another topic in game theory. It discusses how different types of players, e.g. good and bad workers, can be credibly identified: an employer has an incentive to hire the good and, presumably, more productive workers, and the good workers have an incentive to reveal themselves in the hope of a higher salary. But the bad workers also have an incentive to be taken for good workers, due to the potentially higher salary. Therefore, if the good workers are to earn more and employers are to pay “correct” wages to the two types of workers, we need a credible signal that the good worker can and will give, but the bad worker cannot or does not want to give. I will return to this topic in my next post.

Learning how to teach

At work we have weekly training sessions. Each Thursday one of our team members takes one hour to teach a topic to the rest of the team. The themes range from hands-on use of Excel templates through negotiation skills to creating a business case. The theme of each session depends on our annual goals, i.e. what we have agreed to learn, and on the interests of the trainers.

About a week ago I gave our team a training on the Monty Hall problem. The problem is easy to understand, yet the correct answer is somewhat unintuitive, and explaining it in an understandable way takes some effort. It also shows well how easy it is to get probabilities wrong. That is why I chose the Monty Hall problem as the day’s topic.

I first presented the problem, and to make it more interesting I asked for volunteers to play against me. This way I wanted to get some statistical data to back up my upcoming explanation of the probabilities in the game. Before starting I presented two questions that we would answer during the training:

  1. Is it profitable to play the game when participation costs 0.70 and the potential gain is 1.00?
  2. Which strategy maximizes the expected winnings? And why?

Here I must mention that I introduced a version of the game where there is always a prize behind exactly one door, the prize is never moved, and the host always opens exactly one empty door at random, but never the player’s chosen door.
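The rules just described can be written down as a small simulation, which is one way to collect far more statistical data than a few volunteers can provide. This is only a sketch: the function names `play_round` and `win_rate` are my own, three doors are assumed, and the player's first pick is assumed to be uniformly random.

```python
import random


def play_round(switch, rng):
    """Play one round; return True if the player wins the prize."""
    doors = [0, 1, 2]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens one empty door at random, never the player's door.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize


def win_rate(switch, rounds, seed=0):
    """Share of rounds won when always switching or always staying."""
    rng = random.Random(seed)
    wins = sum(play_round(switch, rng) for _ in range(rounds))
    return wins / rounds


for rounds in (10, 100_000):
    print(rounds, "rounds:",
          "stay", win_rate(False, rounds),
          "switch", win_rate(True, rounds))
```

Comparing the output for 10 and for 100,000 rounds also gives a sense of how noisy a handful of games is.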

Before we started playing, one team member asked me for the goal of the training. I replied that the goals were to:

  1. Understand the Monty Hall problem and the related probabilities in general.
  2. See that probabilities are not always intuitive, and that we should therefore be careful when making decisions, even if we think we have calculated the risks correctly.

I ended up playing the game with one colleague for ten rounds, so we did not get any statistically meaningful results, but these games already gave us a feel for the game: how it works and how it feels to be in the situation of having to choose between the doors. After the game I explained the probabilities involved by drawing on paper the three potential outcomes of a single game and arguing that changing the initial choice maximizes the chances of winning. This way it was quite easy to illustrate convincingly that sticking to the original door gives only a one in three chance of winning.
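The paper drawing of the three outcomes can be reproduced as an exact calculation, which also addresses the first question about the participation cost. This is a sketch under assumptions: I read the 0.70 as a fee paid up front in every round and the 1.00 as the gross payout on a win, and the name `win_probability` is my own.

```python
from fractions import Fraction

DOORS = (0, 1, 2)
COST = Fraction(7, 10)    # participation fee from question 1
PAYOUT = Fraction(1)      # assumed gross payout on a win


def win_probability(switch):
    """Exact chance of winning, enumerating where the prize can be."""
    pick = 0  # by symmetry the first choice can be fixed
    wins = 0
    for prize in DOORS:            # three equally likely outcomes
        if switch:
            # After the host opens an empty door, switching wins
            # exactly when the first pick was wrong.
            wins += pick != prize
        else:
            wins += pick == prize
    return Fraction(wins, len(DOORS))


for name, switch in (("stay", False), ("switch", True)):
    p = win_probability(switch)
    print(name, "P(win) =", p, "expected net =", p * PAYOUT - COST)
```

Under this reading, switching wins with probability 2/3 and staying with 1/3, so the expected net result per round is -1/30 when switching and -11/30 when staying. Switching is clearly the better strategy, but at a fee of 0.70 the game is still slightly unprofitable. If the 1.00 were instead meant as a net gain on top of a refunded fee, the conclusion would differ.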

I think I wasn’t able to convince all team members of the importance of understanding the Monty Hall problem or the importance of learning new things outside of your daily business. All in all, I was still pleased with how I managed to explain the problem and its essence in a practical and tangible way.


The weekend after the training, when I was again on one of my many walks in the woods, four things dawned on me regarding the lessons of the Monty Hall problem:

  1. When presenting the Monty Hall problem, it should be thought of not just as a game of chance, but as a representation of an action containing risk. The game can be thought of as an investment, where the costs and the payoffs determine whether the investment is profitable.
  2. Even more importantly, the problem teaches us to concentrate on the essential. The game has two stages: in the first stage the player chooses a door and the host opens an empty door. In the second stage the player may change his initial choice. The game can be represented as a one-stage game, where the player only decides between choosing one door or choosing two doors. Due to the rules of the game, the host’s opening of one empty door is just trickery used to distract the player. If we omit the door opening and give the player the option of choosing either one door or two doors, the intrinsic probabilities of winning become obvious. Thus, it is necessary to see what is relevant.
  3. Based on point 2, when teaching or learning, we should try to present the topic or problem from as many angles as possible. That way it is easier for more people to learn the subject matter. The multiple representations also enable us to recognize a potentially familiar pattern or situation in a completely different context. This way we can apply the different tools and models we learn. For example, when presented with an offer, we might see some similarities to the Monty Hall problem and thus know to be careful in making our choice, and also be equipped to analyze the offer correctly.
  4. When teaching, we should emphasize the importance of learning different things and models just to understand what the world might look like. As an analogy, a person with only a hammer is less useful on a construction site than a person who also has a saw, a grinder, a crowbar, and a pen and some paper in his pocket. When we have multiple tools and know how and when to use them, we are better able to act in different situations. With multiple models we are also better equipped to recognize the important things and act accordingly. Sometimes learning something does not provide obvious or immediate benefits, but it might prove very valuable later in life.
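The one-stage representation from point 2 can be checked mechanically. The following sketch, with function names of my own choosing and three doors assumed, goes through every combination of prize position, first pick and permitted host move. It confirms that "switch after the host opens a door" wins in exactly the same cases as "take both of the other doors".

```python
from itertools import product

DOORS = (0, 1, 2)


def switch_wins(prize, pick, opened):
    """Two-stage game: the player switches after the host opens a door."""
    final = next(d for d in DOORS if d not in (pick, opened))
    return final == prize


def two_doors_win(prize, pick):
    """One-stage game: the player takes both doors other than `pick`."""
    return prize in [d for d in DOORS if d != pick]


for prize, pick in product(DOORS, repeat=2):
    # Every door the host is allowed to open: empty and not the pick.
    for opened in (d for d in DOORS if d not in (pick, prize)):
        assert switch_wins(prize, pick, opened) == two_doors_win(prize, pick)

# Share of the nine equally likely (prize, pick) cases won with two doors.
two_door_share = sum(
    two_doors_win(prize, pick) for prize, pick in product(DOORS, repeat=2)
) / 9
print("share of cases won with two doors:", two_door_share)
```

Since the two games win in identical cases, the host's door opening adds nothing to the decision: the choice really is between one door and two.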