probability – Ben Lowery @ STOR-i

The Monty Hall problem and its generalisations: Part 2

ben-lowery — Fri, 01 Apr 2022 13:01:44 +0000

In the previous blog post we looked at the infamous Monty Hall problem and its controversial (but correct) solution. The main problem has been the talk of countless , , and ; providing a nice introduction to probability and Bayes’ theorem. And while it is fun to rehash the same story, it is worth looking at how to broaden the problem and seeing how the same core principles can be applied to a more general setting. With this in mind, we can explore some of these re-formulations, starting with expanding the number of doors in our game.

Monty Hall: Live from the !

Let’s envision the following fictitious scenario: after the success of his three-door final showdown, Monty and his team have been gifted a bigger budget to make a more elaborate show, and a new, possibly infinitely big studio. Monty uses this increased budget to order his producers to purchase more doors. Now that he possesses a studio filled to the brim with disused doors, Monty again places one car behind a door and keeps track of where it is placed. While a flood of goats trundle in and hide behind the rest, he asks a contestant, now slightly more intimidated than their predecessors (see the gif below), to pick a door. The contestant hesitantly chooses, giving way to our host opening every door but the contestant’s and one final door.

A contestant on the reboot of Let’s Make a Deal whose confidence clearly indicates that they read the last blog post on how to win.

Now we again pose the question: stick or switch? Given what we learned from the original Monty Hall situation, it makes sense to guess intuitively that switching will be in the contestant’s best interest. We can test this again using Bayes’ Theorem and some basic probability theory.

Like in the original incarnation, we define events and variables for the new version. Instead of 3 doors, we now possess $d$ doors. Each door is assigned the event that the prize is behind it, and we denote these events $D_1,...,D_d$ . We take the contestant’s pick to be door 1, and let $G$ be the event that the host opens all but doors 1 and $d$ , revealing a goat behind each. With this in mind, we can formulate the following Bayes equation for any particular door, say $i$ :

$\mathbb{P}[D_i|G]=\frac{\mathbb{P}[D_i]\mathbb{P}[G|D_i]}{\mathbb{P}[G]}.$

The individual probabilities for the right hand side are calculated as:

$\mathbb{P}[D_1]=...=\mathbb{P}[D_d]=1/d \\ \mathbb{P}[G|D_1]=\frac{1}{d-1} \\ \textrm{(If the prize is behind our own door, the host can leave any one of the other } d-1 \textrm{ doors closed, assumed equally likely)} \\ \mathbb{P}[G|D_d]=1 \\ \textrm{(As we are just restricted to opening every other door if the prize is here)} \\ \mathbb{P}[G|D_2]=...=\mathbb{P}[G|D_{d-1}]=0 \\ \textrm{(If the prize lies behind one of the doors we want to open, we clearly can’t open them).}$

Since $D_1$ to $D_d$ cannot occur simultaneously, and exactly one of them must occur, we can use another simple property, specifically the law of total probability, to express the probability of revealing a goat behind all but doors 1 and $d$ in this slightly long, but hopefully intuitive derivation:

$\mathbb{P}[G]=\sum_{i=1}^d \mathbb{P}[D_i]\mathbb{P}[G|D_i]\\= \mathbb{P}[D_1]\mathbb{P}[G|D_1]+ \mathbb{P}[D_d]\mathbb{P}[G|D_d]+\sum_{i=2}^{d-1}\mathbb{P}[D_i]\mathbb{P}[G|D_i]\\ =\frac{1}{d}\cdot \frac{1}{d-1}+\frac{1}{d}\cdot 1=\frac{1}{d}\left(\frac{1}{d-1}+1\right)=\frac{1}{d}\left(\frac{d}{d-1}\right).$

Remember we opened all but doors 1 and $d$ , so all doors in between have probability 0 of having the car behind them. Thus, we substitute the above derivations back into Bayes’ Theorem, but only for doors 1 and $d$ , giving

$\mathbb{P}[D_1|G]=\frac{1/d\cdot 1/(d-1)}{1/d\cdot d/(d-1)}=\frac{1}{d} \\ \mathbb{P}[D_d|G]=\frac{1/d\cdot 1}{1/d\cdot d/(d-1)}=\frac{d-1}{d}$

While tedious, this derivation is pivotal in allowing a generalisation of the problem. Generalisations are crucial in mathematics, allowing us to expand our problem from an initial set of constrained numbers (like only 3 available doors) to as many doors as we want, and all we need to do is plug that number into $d$ .

To test this – and provide a little sanity check – we can hark back to our original Monty Hall problem and see that substituting 3 doors gives us probabilities of switching and staying of 2/3 and 1/3 respectively. So now we have a rather straightforward way to show that, no matter how many doors we have, if the host opens all but the original door and one other, it is in our best interest to switch. In addition, although trivial to point out, we see that with more doors our likelihood of winning when switching increases. For example, with $d=7$ doors, plugging in our values gives the staying probability (i.e. door 1) as 1/7, and switching (to door $d$ ) as 6/7.
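As a quick check of the algebra, the Bayes calculation can be reproduced with exact fractions. This is only a sketch: the function name `posterior_after_reveal` is made up for illustration, and it assumes, as in the derivation, that the contestant holds door 1 and the host leaves door $d$ closed.

```python
from fractions import Fraction


def posterior_after_reveal(d):
    """Return (P[D_1 | G], P[D_d | G]) for a game with d doors."""
    if d < 3:
        raise ValueError("need at least 3 doors")
    prior = Fraction(1, d)  # P[D_i] is the same for every door
    # P[G | D_i]: zero for the opened doors 2..d-1
    likelihood = {i: Fraction(0) for i in range(1, d + 1)}
    likelihood[1] = Fraction(1, d - 1)  # host picks which other door stays closed
    likelihood[d] = Fraction(1)         # host is forced to leave door d closed
    # Law of total probability gives P[G]
    evidence = sum(prior * likelihood[i] for i in likelihood)
    stay = prior * likelihood[1] / evidence
    switch = prior * likelihood[d] / evidence
    return stay, switch


for d in (3, 7):
    stay, switch = posterior_after_reveal(d)
    print(f"d={d}: stay={stay}, switch={switch}")
```

Working in `Fraction` rather than floats means the results can be compared directly against $1/d$ and $(d-1)/d$ without rounding noise.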

To see this in practice let’s run some simulations for 3, 5, 7 and 9 doors, carrying out Monty Hall’s deal 3000 times for each.

We see the winning chance from switching increases as the number of doors increases, which is pretty good.
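For readers who want to try this themselves, a simulation along these lines might look like the following. It is a minimal sketch rather than the code used for the plot: the function name, the seed and the use of Python’s `random` module are all assumptions of mine.

```python
import random


def simulate_switch_rate(d, trials=3000, seed=0):
    """Estimate the chance of winning by switching in the d-door game."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = 0
    for _ in range(trials):
        car = rng.randrange(d)
        pick = rng.randrange(d)
        if pick == car:
            # Host may leave any one of the other (goat) doors closed
            other = rng.choice([door for door in range(d) if door != pick])
        else:
            # Host must leave the car's door closed
            other = car
        wins += (other == car)  # switching wins iff the other closed door hides the car
    return wins / trials


for d in (3, 5, 7, 9):
    print(f"d={d}: simulated={simulate_switch_rate(d):.3f}, theory={(d - 1) / d:.3f}")
```

With 3000 trials per setting, the simulated rates should sit within a couple of percentage points of $(d-1)/d$ .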

Winning isn’t everything

Our second generalisation of the Monty Hall problem is one in which we try to flip the odds back in favour of the host. Given how generous the game becomes when we expand the problem to $d$ doors, it is worth seeing whether limiting the number of doors Monty opens makes it more – or less – likely for the contestant to win when switching. If we think about this logically, now that we have a choice of which door to switch to, we should be less likely to get the prize by switching than in the previous scenario. But it is worth calculating how much of a detriment this new rule is to our contestant, and analysing how drastically our odds change as fewer and fewer doors are opened.

This time we will take a scheduled commercial break ~~out of laziness~~ to relieve ourselves of any more probability equations and focus solely on numerical computations. Consider an example where, given $d$ doors, we open $k$ of these. More specifically, let’s look at the case of 10 doors where we open a subset of them, counting the number of times we win if we switch, the number of times we win if we stay, and the new third case in which we win neither by staying nor by switching. This is seen in the following graph.

As we can see in the choice between 10 doors, between opening 8 (which is the maximum number of doors we can open) and opening 6 (in which case we have a choice of what to switch to), there is a significant dip in the probability of winning when switching, with it then being more likely to not win the game whatever we do. This is due to the truly random choice we now face in selecting the door to switch to. Despite all these changes, switching consistently gives better odds than staying. We can see this in the generalisation of opening $k$ doors from a set of $d$ . This is given by the following equation:

$\mathbb{P}[\textrm{Winning when switching}]=\left(\frac{d-1}{d}\right)\cdot\left(\frac{1}{d-k-1}\right)$

For those in dire need of a Bayesian derivation (as I normally am), one can refer . As a little test, we can take $d=10$ and $k=2$ and see that the probability of winning when switching is $\frac{9}{10}\cdot\frac{1}{7}\approx 0.129$ , which leaves our simulation pretty damn close to what we want.
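The formula can also be checked against a simulation of the partial-reveal game. The sketch below is not the code behind the graph. It assumes the host opens $k$ goat doors chosen uniformly at random (never the contestant’s door or the car’s), and that a switching contestant picks uniformly among the remaining closed doors. The function names are hypothetical.

```python
import random


def switch_win_probability(d, k):
    """Closed-form chance of winning by switching when k of d doors are opened."""
    return (d - 1) / d * 1 / (d - k - 1)


def simulate_partial_reveal(d, k, trials=3000, seed=0):
    """Return the estimated (switch win rate, stay win rate)."""
    if not 0 <= k <= d - 2:
        raise ValueError("the host can open between 0 and d-2 doors")
    rng = random.Random(seed)
    switch_wins = stay_wins = 0
    for _ in range(trials):
        car = rng.randrange(d)
        pick = rng.randrange(d)
        # Doors the host is allowed to open: goats only, never the pick
        goats = [door for door in range(d) if door != pick and door != car]
        opened = set(rng.sample(goats, k))
        # Closed doors the contestant could switch to
        options = [door for door in range(d) if door != pick and door not in opened]
        new_pick = rng.choice(options)
        switch_wins += (new_pick == car)
        stay_wins += (pick == car)
    return switch_wins / trials, stay_wins / trials


for k in (2, 6, 8):
    switch_rate, stay_rate = simulate_partial_reveal(10, k)
    neither = 1 - switch_rate - stay_rate  # the car was behind some third door
    print(f"k={k}: switch={switch_rate:.3f} (theory {switch_win_probability(10, k):.3f}), "
          f"stay={stay_rate:.3f}, neither={neither:.3f}")
```

Since the new pick is never the original door, the switch and stay wins are mutually exclusive, so the "neither" rate is simply what is left over.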

And with this we can conclude our investigation into suspected goat farmer Monty Hall and his mystery doors. But was this investigation as concrete as the numbers suggest?

Statistical stage fright

In these two blogs we’ve seen how applying some mathematical rigour allows us to understand, dissect and create advantages in a game of seemingly random luck. That being said, as is often the case when applying mathematics to the real world, our logical reasoning may still not be perfect, nor reveal the true solution to the problem. Arguments could be made that randomising the choices of contestants in the simulation and using conditional probability detracts from the human element in the game, in which the host, the atmosphere and the audience play a crucial role in the dilemma posed to the contestant, perhaps leading to a bias in the options available. This is something that probability and random simulations simply cannot account for. Hence it could even be contested that in reality, based on the host’s hints and approach towards the contestant, the probability of finding the winning car can range from 1/2 to 1. A paper on this dilemma of the human element can be seen .


Von-Neumann, risk, and birds: How to model rational behaviour

ben-lowery — Sun, 06 Feb 2022 21:48:27 +0000

The idea of expected utility is one that attempts to mathematically model the choices individuals make under uncertainty. These are often complex situations, with a key parameter being the risk appetite for the situation. Here, the term “risk appetite” pertains to the level of risk one is willing to take in order to obtain an objective and these could range from something as fundamental as getting enough food for survival, to trying to win big in the lottery. Expected utility settles itself within a wider cacophony of approaches towards decision-making under risk, and is arguably the most prominent and followed noise.

As early motivation, consider a decision maker who is faced with a set of outcomes, each with its own associated risk. Under expected utility, the main assumption is that the decision maker will aim to maximise the expected value of their utility over all possible outcomes. The function that assigns a value to each possible outcome is referred to as a utility function and is the fundamental concept when trying to explore expected utility, especially from a mathematical perspective.

It can be hard figuring out the right path to take.

This blog aims to follow rational decision making from a mathematical viewpoint, starting from its formalised origins in the 20th century, although the ideas date back to the . With this, the four essential components of what makes a decision maker rational are presented; this is then followed by a primer on utility functions based on a school of thought known as von Neumann-Morgenstern theory. To bring home understanding, we fly through an example that applies the theory to a problem involving the utility of food. Following this is an extension considering shifting rational behaviours, and a wrap-up with some further theory and practical applications.

Rationality through a mathematical scope

In its mathematical sphere of interpretation, the theory stemmed from work by prominent economist Oskar Morgenstern (also attached to this project was some little known ). Explored in the opening of Section 3.2 in Ken Binmore’s fantastic book, ““, it was said Morgenstern had approached von Neumann to help formalise ideas in another field known as Game Theory. Together they developed and released the “Theory of Games and Economic Behaviour” in 1944 which packaged and cemented some neat initial results in game theory.

As a precursor to all the mathematics presented, Morgenstern was irrationally persistent in wanting the exhilaratingly high-octane issue of cardinal utilities, which had been used fruitfully throughout the book, well defined and given as a basis upon which their game theory ideas were built. So, as one does, von Neumann invented a theory on the spot which measures a person’s preference for something based on the risk they are willing to take to attain it. Thus, we arrive at von Neumann-Morgenstern Theory (or VNM Theory from here forth).

Their ideas can be cemented through a set of four postulates. In a mathematical context, postulates can be thought of as statements that are accepted as true without necessarily having to be proved. Von Neumann and Morgenstern didn’t provide the most approachable explanation of these postulates, however they can be summarised in layman’s terms as follows.

Postulate 1.
(Completeness) There is a well defined set of preferences and the individual can clearly
decide between two alternatives.

Postulate 2.
(Transitivity) Builds on the completeness criterion and states that preferences are chosen
consistently: if A is preferred to B and B is preferred to C, then A is preferred to C.

Postulate 3.
(Independence) An individual does not care about a new independent outcome if they are
indifferent about itself and the one it is replacing.

Postulate 4.
(Continuity) Small changes in outcomes only lead to small changes in preference.

These four postulates when satisfied constitute what is believed to be a rational decision maker. And from this, their preferences can be modelled by what is known as a utility function. But what is this?

Navigating the rabbit hole of utility functions

A key part of the theory, a utility function models a preference relation between utility and some measurable quantity (wealth, food, etc.). Assuming the wrong utility function is something that often occurs, and leads many to come up with incorrect, and sometimes nonsensical, musings on rational behaviour. A good example not explored here is the , which can help the reader understand what occurs when incorrect rational behaviour is assumed. Utility functions are essentially ways in which we can express preferences as a set of ordered rankings, to each of which we assign a utility metric known as utils. With this, we aim to find the expected value and subsequently to assess and explain the risk preferences of individuals.

Utility functions are required to satisfy the four postulates from VNM Theory, and can be used to measure and model risk appetite. Under VNM Theory, these attitudes to risk are consistent throughout, with the three categories being risk-averse, risk-neutral, and risk-seeking behaviour. The first of these is represented by a function whose growth slows as $x$ increases (formally, a concave function) and expresses a preference for low-variance outcomes. A risk-neutral individual is indifferent to risk and can be represented by a function that is just a straight line (affine). Finally, risk-seeking individuals gain more and more utility from achieving a higher level of outcome, so their function increases quicker and quicker as $x$ increases (a convex function). These functions are expressed in terms of $U(x)$ , which maps some quantity $x$ , such as food or wealth, to utils, the aforementioned unit of utility.

Utility functions and examples of how they can be modelled using standard, well known, functions.

A very important addendum to all this is that these functions, and the values they produce, cannot be compared like regular numbers. Utility in this setting only says that if one value of utility is higher than another, the corresponding outcome is preferred, not that it is preferred so many times more. Take for example: if for events $x_1$ and $x_2$ , $U(x_1) = 42$ and $U(x_2) = 3$ , then $x_1$ is preferred to $x_2$ , but it is not 14 times more preferred.

The Birds!

We can consider a quick example to see how different risk appetites arise, and how either a fixed or a variable utility function can be applied to the situation. To do this, we consider the choice a bird makes in the face of hunger.

To get rid of the idea you may have that it’s not particularly feasible for a bird to be making rational human decisions, consider some Kafkaesque-light bird with some human psyche ingrained within its internal cognitive function. I.e. this is not your senseless ; this is a rational bird, making rational decisions.

Now, this bird is fairly hungry, and needs to scavenge for some food, say an amount $x$ over an arbitrary period of time. The bird has the choice of whether to go beyond what it is usually able to gather and risk dying in the process (obviously returning with nothing, as it’s dead) or to settle for an amount it knows is safe and can be collected rather quickly.

The rational choices this bird can make, and the utility it attains from such, can be partitioned into a few scenarios:

Scenario 1: (The bird is starving): In this scenario, a risk-seeking approach is needed to attain enough food for survival. For this to be reflected in its utility function, the bird must value the expected utility of the amount of food it finds above the utility of the average amount of food it’s expected to find, and thus gamble on foraging for more, even if this is more than should be expected.

Scenario 2: (The bird likely has the means to attain ample amounts of food under both options): Here, the bird is indifferent between the utility of the amount of food it’s expected to get and the expected utility of what it actually finds. Therefore, it values taking the risk and potentially gaining a greater payoff no more than it does going for a safe amount.

Scenario 3: (The bird likely only needs a small amount of extra food): Finally, here the bird only requires a small amount of extra food and is willing to take a safe payoff for this, even if it’s less than the amount they usually expect to get.

Now, the first time I grappled with this avian madness and the overarching concepts as a whole, I found myself descending into a tangle trying to fully understand how this represents our decision making in risky situations, with the scenes of my brain trying to explain this to me at the time roughly amounting to this:

Therefore, I often found it best to run through some numbers and to check the scenario descriptions against the graphs of the different utility functions we saw earlier. So let us do just that:

Evaluating Scenario 3:
Here, the bird has enough food to survive, thus it is said to be risk-averse, as explained in the scenario setting. From this, its utility function for the food quantity $x$ could be something like $\ln(x)$ , which is just the example function given in the graphs of the previous section. From the worded description of the scenario, if we were to think of this in terms of the expected value of the amount of food we expect to attain, what we are saying is that the utility of the expected value should be worth more than the expected utility. Specifically, the amount of food we are expected to attain is a safe bet for us to take.

You can see how this is reflected in the chosen utility function for risk-averse behaviour by plugging in some values. Say, for example, we have an equal chance of attaining 2 or 7 units of food (whether that be kg, grams or some other metric is irrelevant here). Then the utility of the expected amount is just the mean of these numbers plugged into the utility function, or,

$U\left(\frac{7+2}{2}\right)=U(9/2)=\ln\left(\frac{9}{2}\right).$

Writing $X$ for the random amount of food attained, the expected value of the utility is analogously calculated as the mean of the two utilities:

$\mathbb{E}\left[U(X)\right]=\frac{U(7)+U(2)}{2}=\frac{\ln(7)+\ln(2)}{2}=\frac{\ln(14)}{2}$

Using a calculator, it should be found that $\ln(9/2)\approx 1.50 > \ln(14)/2\approx 1.32$ , so taking the safe option of the expected amount has more utility to us in a risk-averse scenario, which is what was described earlier. Checks can also be done for the first two scenarios, where we instead expect the expected utility to be valued more than, and equally to, the utility of the expected amount, respectively.
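The same check can be run for all three risk attitudes in a few lines of Python. This is a rough sketch: $\ln(x)$ is the risk-averse function used above, while the affine function $3x+1$ and the convex function $x^2$ are stand-ins I have picked purely for illustration, and `utility_gap` is a hypothetical helper name.

```python
import math


def utility_gap(utility, outcomes):
    """Return U(E[X]) - E[U(X)] for equally likely outcomes."""
    mean_outcome = sum(outcomes) / len(outcomes)
    expected_utility = sum(utility(x) for x in outcomes) / len(outcomes)
    return utility(mean_outcome) - expected_utility


outcomes = [2, 7]  # equal chance of 2 or 7 units of food

# Risk-averse (concave): the safe expected amount is worth more, gap > 0
print("ln(x): ", utility_gap(math.log, outcomes))
# Risk-neutral (affine, illustrative choice): no difference, gap == 0
print("3x + 1:", utility_gap(lambda x: 3 * x + 1, outcomes))
# Risk-seeking (convex, illustrative choice): the gamble is worth more, gap < 0
print("x^2:   ", utility_gap(lambda x: x ** 2, outcomes))
```

The sign of the gap is what separates the three scenarios: positive for Scenario 3, zero for Scenario 2 and negative for Scenario 1.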

Conclusion

This is all well and good for modelling the bird as having a strict concept of rational behaviour and a consistent approach to the matter. But what if the situation changes and evolves as we attain more food? What if the bird no longer feels the need to value food as much as it once did? Or what if disaster strikes and food suddenly becomes a luxurious scarcity? The suspense is palpable!

We can answer such intriguing questions in a future blog post!
