# Explaining Away: 'Who's In The Bathroom?'

## Keywords

**Collider Pattern, Explaining Away,** Logical OR and XOR, Priors, Directed Acyclic Graph (DAG), Generative Probabilistic Model, Joint Probability Mass Function (JPMF), Conditional Probability Mass Function (CPMF), Bayesian Inference, Inverse Probability, Bernoulli Random Variable, Bernoulli Distribution, WebPPL, Infer(...), flip(...)

## Scenario, Probabilistic Graphical Model, and Mathematical Specification

**1. Background and Motivation**

We model OR and XOR relationships hidden in a mini detective story as a collider pattern embedded in a generative probabilistic graphical model. Explaining away on the basis of evidence is realized by Bayesian inference. This supports simple conclusions of the form "Who is where, given what I know?".

**2. Scenario**

The scenario is described in Example 1.4 of Barber (2012, 3/e, p.10f; 2017) as follows: *Consider a household of three people, Alice, Bob, and Cecil. Cecil wants to go to the bathroom but finds it occupied. He then goes to Alice's room and sees she is there. Since Cecil knows that only either Alice or Bob can be in the bathroom, from this he infers that Bob must be in the bathroom.* Barber finds the scenario remarkable because: *This example is interesting since we are not required to make a full probabilistic model in this case thanks to the limiting nature of the probabilities (we don't need to specify P(A, B)). The situation is common in limiting situations of probabilities being either 0 or 1, corresponding to traditional logic systems* (2012, 3/e, p.11; 2017).

**3. Modelling Approach**

Deviating from Barber's approach, we specify a fully generative probabilistic graphical model, for two reasons. First, the probability parameters of the model can be adjusted step by step to make the model more realistic for everyday situations such as a shared flat. Second, a probabilistic model can represent how certainty about the situation grows with every new piece of relevant information.

**3.1 Modelling Step: DAG of Generative Probabilistic Model**

First, we develop a model that largely follows Barber's original scenario. Whether the bathroom is occupied depends on whether none, one, or more of the three people are in the room, and this in turn depends on their behaviour. Each person is modelled by an independent Bernoulli variable. In Barber's example, the values of the Bernoulli variables are 'X is in her bedroom' / 'X is not in her bedroom'. We have changed the values in our model to the coarser variant 'X is in the bathroom' / 'X is not in the bathroom', because Barber's value 'X is not in her bedroom' leaves considerable uncertainty about where the person is: in the bathroom, in another person's bedroom, in the kitchen, or in the living room.

We model the causal structure of the scenario as a simple OR gate with the collider pattern (Pearl et al., 2016; Fig. 1-5).

Then we limit the number of people who can be in the bathroom at the same time: only one person or no one at all is allowed. If two people want to enter the room at the same time, they have to go back to the living room and resolve their conflict. During this time the bathroom is empty. This second scenario is a modification of Barber's scenario. It has to be modelled with an *XOR gate*, which is also embedded in a *collider* (Fig. 6-10).

It is assumed that the behavior of the persons is independent of each other and does not depend on external influences. So no arrow goes into person nodes and there are no arrows between them.

**3.2 Modelling Step: Decoration of DAG with Numerical Parameters**

Since we have no information about people's preference for the bathroom, we assume that each person is in this room with probability 1/2. This is probably too high for a normal shared flat. Probabilities between 15% and 25% for each person per day are likely more realistic.

In the conditional probability table $$P(BathOccupied | AliceIsInBath, BobIsInBath, CecilIsInBath)$$ we enter conditional probabilities that take only the extreme values 0 (= false) or 1 (= true). There are 8 conditional probability distributions $$P(BaOcc | \ldots)$$, one per column of the table. This number results from the size of the combined sample space $$|AlInB \times BoInB \times CeInB| = 2^3 = 8$$. Each column sums to 1.
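The deterministic table can be written out mechanically. The following Python sketch (used here only for illustration, it is not the WebPPL code of this chapter) enumerates the 8 parent configurations and builds the table for the OR gate and for the "at most one person" variant that the text calls XOR gate. The names `or_gate`, `xor_gate` and `cpt` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def or_gate(al, bo, ce):
    # Occupied as soon as at least one person is in the bathroom.
    return al or bo or ce

def xor_gate(al, bo, ce):
    # "XOR" in the sense of this text: occupied only if exactly one person is inside.
    return (al + bo + ce) == 1

def cpt(gate):
    # Map each parent configuration (al, bo, ce) to its column
    # {True: P(Ba=true | ...), False: P(Ba=false | ...)} with entries 0 or 1.
    table = {}
    for parents in product([True, False], repeat=3):
        occupied = gate(*parents)
        table[parents] = {
            True: Fraction(int(occupied)),
            False: Fraction(int(not occupied)),
        }
    return table

or_cpt = cpt(or_gate)
xor_cpt = cpt(xor_gate)

# One line per column: parent configuration, P(Ba=true | ...), P(Ba=false | ...)
for parents, column in or_cpt.items():
    print(parents, column[True], column[False])
```

Because every column holds a single 1 and a single 0, the column sums are 1 by construction. The two gates differ only in which of the 8 columns put the 1 on `Ba = true`.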

**3.3 Modelling Step: Mathematical Specification**

First we specify the model as an OR-gate collider (Fig. 1-5). This model stands for a somewhat strange community without privacy, as it allows 0, 1, 2, or 3 people in the bathroom at the same time. Then we specify an alternative model as an XOR-gate collider (Fig. 6-10), in which at most one person is allowed in the bathroom.

**3.3.1 Joint Probability Mass Function (JPMF) P(Ba, Al, Bo, Ce)**

The JPMF is (Fig. 2) $$P(Ba, Al, Bo, Ce) = P(Ba |Al, Bo, Ce) \cdot P(Al) \cdot P(Bo) \cdot P(Ce) $$

**3.3.2 OR-Gate Collider**

**3.3.2.1 Prior beliefs about the Marginal PMF P(Ba)**

The marginal probability of bathroom occupancy *P(Ba)* is (Fig. 2)

$$P(Ba) = \sum_{Al} \sum_{Bo} \sum_{Ce} P(Ba, Al, Bo, Ce) = \sum_{Al} \sum_{Bo} \sum_{Ce} P(Ba |Al, Bo, Ce) \cdot P(Al) \cdot P(Bo) \cdot P(Ce) $$

$$ = \sum_{Al} P(Al) \sum_{Bo} P(Bo) \sum_{Ce} P(Ce) \cdot P(Ba |Al, Bo, Ce) = \left(\begin{array}{r} \frac{7}{8} \\ \frac{1}{8} \end{array}\right) $$

This means that the bathroom is occupied most of the time, because

$$P(Ba) = \left(\begin{array}{r} \frac{7}{8} \\ \frac{1}{8} \end{array}\right) = \left(\begin{array}{r} 0.875 \\ 0.125 \end{array}\right) $$.
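The marginalization can be checked by brute-force enumeration. The following Python sketch builds the joint distribution from the factorization in 3.3.1 and sums out the three persons. It assumes the priors of 1/2 and the deterministic OR gate described above; the names `prior`, `p_ba_given`, `joint` and `p_ba` are illustrative.

```python
from fractions import Fraction
from itertools import product

P_IN_BATH = Fraction(1, 2)  # prior P(X is in the bathroom), as assumed in the text

def prior(in_bath):
    return P_IN_BATH if in_bath else 1 - P_IN_BATH

def p_ba_given(ba, al, bo, ce):
    # Deterministic OR gate: probability 1 if ba matches (al or bo or ce), else 0.
    return Fraction(int(ba == (al or bo or ce)))

# Joint P(Ba, Al, Bo, Ce) = P(Ba | Al, Bo, Ce) * P(Al) * P(Bo) * P(Ce)
joint = {
    (ba, al, bo, ce): p_ba_given(ba, al, bo, ce) * prior(al) * prior(bo) * prior(ce)
    for ba, al, bo, ce in product([True, False], repeat=4)
}

# Marginalize the three persons out to obtain P(Ba).
p_ba = {
    ba: sum(p for (b, _, _, _), p in joint.items() if b == ba)
    for ba in (True, False)
}

print(p_ba[True], p_ba[False])  # 7/8 1/8
```

Exact fractions are used instead of floats so that the result can be compared directly with the values in the text.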

**3.3.2.2 Inferences P(Hypothesis|Evidence)**

Now we modify Barber's story by introducing a guest. He likes to visit because the flatmates have old vinyl records and a record player. When he plays the old records loudly in the living room, everyone retreats to their rooms and closes the doors so that nobody can be seen. At some point he wants to go to the toilet, which is at the end of the hall. When he arrives at the bathroom, he notices that this door is also closed. Who could be behind the door?

We can approach the solution via one *multivariate* and three *univariate marginal conditional* queries (Fig. 3).

The *conditional multivariate* query is $$P(Al, Bo, Ce | Ba = true) = ? $$

and the *conditional univariate* ones:

$$P(Al | Ba = true) = \sum_{Bo, Ce} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{4}{7} \\ \frac{3}{7} \end{array}\right) = \left(\begin{array}{r} 0.571 \\ 0.429 \end{array}\right) $$

$$P(Bo | Ba = true) = \sum_{Al, Ce} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{4}{7} \\ \frac{3}{7} \end{array}\right) = \left(\begin{array}{r} 0.571 \\ 0.429 \end{array}\right) $$

$$P(Ce | Ba = true) = \sum_{Al, Bo} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{4}{7} \\ \frac{3}{7} \end{array}\right) = \left(\begin{array}{r} 0.571 \\ 0.429 \end{array}\right) $$

This means that the evidence *bathroom door closed* raises each person's probability of being in the bathroom from the prior 1/2 (= 0.500) to the posterior 4/7 (= 0.571). The posterior probability 4/7 is the same for all three persons, so the evidence does not discriminate between them.
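The multivariate query, whose value is left open above, can be answered by enumeration as well. The following Python sketch assumes the priors of 1/2 and the OR gate. It keeps the parent configurations that are compatible with `Ba = true`, normalizes their weights, and then sums out two persons at a time. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

prior = Fraction(1, 2)  # P(X is in the bathroom) for each person

# Unnormalized weights P(Ba=true, Al, Bo, Ce): the OR gate keeps every
# configuration with at least one person in the bathroom.
weights = {}
for al, bo, ce in product([True, False], repeat=3):
    p_parents = Fraction(1)
    for in_bath in (al, bo, ce):
        p_parents *= prior if in_bath else 1 - prior
    weights[(al, bo, ce)] = p_parents if (al or bo or ce) else Fraction(0)

evidence = sum(weights.values())  # P(Ba=true)
posterior = {k: w / evidence for k, w in weights.items()}  # P(Al, Bo, Ce | Ba=true)

# Univariate marginals P(X=true | Ba=true), indexed 0=Alice, 1=Bob, 2=Cecil.
marginals = [sum(p for k, p in posterior.items() if k[i]) for i in range(3)]

print(posterior[(True, True, True)])  # 1/7
print(marginals)
```

Under these assumptions each of the seven configurations with at least one person inside receives posterior probability 1/7, and each person appears in four of them, which gives the marginal 4/7.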

Now, *Cecil wants to go to the bathroom but finds it occupied.* Since Cecil is standing outside the door, he is not in the bathroom himself (Ce = false).

This time we have one *multivariate* and two *univariate marginal* conditional queries (Fig. 4):

$$P(Al, Bo | Ba = true, Ce = false) = ? $$

and

$$P(Al | Ba = true, Ce = false) = \sum_{Bo} P(Al, Bo | Ba = true, Ce = false) = \left(\begin{array}{r} \frac{2}{3} \\ \frac{1}{3} \end{array}\right) = \left(\begin{array}{r} 0.667 \\ 0.333 \end{array}\right) $$

$$P(Bo | Ba = true, Ce = false) = \sum_{Al} P(Al, Bo | Ba = true, Ce = false) = \left(\begin{array}{r} \frac{2}{3} \\ \frac{1}{3} \end{array}\right) = \left(\begin{array}{r} 0.667 \\ 0.333 \end{array}\right) $$

This means that the additional evidence *Cecil is not in the bathroom* raises the probabilities for Alice and Bob from 4/7 (= 0.571) to 2/3 (= 0.667). The posterior probability 2/3 is the same for the two remaining persons, Alice and Bob, so this evidence is not discriminative either.
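The value 2/3 can also be checked by hand with Bayes' rule. As a sketch under the model assumptions above (independent priors of 1/2 and a deterministic OR gate):

$$P(Al = true | Ba = true, Ce = false) = \frac{P(Ba = true | Al = true, Ce = false) \cdot P(Al = true)}{P(Ba = true | Ce = false)} = \frac{1 \cdot \frac{1}{2}}{1 - \frac{1}{2} \cdot \frac{1}{2}} = \frac{\frac{1}{2}}{\frac{3}{4}} = \frac{2}{3}$$

Here $$P(Al | Ce) = P(Al)$$ because the persons are a priori independent, and $$P(Ba = true | Ce = false) = 1 - P(Al = false) \cdot P(Bo = false) = \frac{3}{4}$$. By symmetry the same holds for Bob.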

*Cecil then goes to Alice's room and sees she is there.*

This time we have only *one univariate marginal* conditional query (Fig. 5):

$$P(Bo | Ba = true, Ce = false, Al = false) = \left(\begin{array}{r} \frac{1}{1} \\ \frac{0}{1} \end{array}\right) = \left(\begin{array}{r} 1.000 \\ 0.000 \end{array}\right) $$
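The three inference steps of the OR-gate model can be replayed as one chain of updates for Bob. The following Python sketch assumes priors of 1/2 and the OR gate. The helper names `worlds` and `p_bob` and the dictionary encoding of the evidence are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

PRIOR = Fraction(1, 2)  # P(X is in the bathroom), as assumed in the text

def worlds():
    # Yield every world with non-zero probability together with its joint
    # probability; under the OR gate, Ba is fixed by the three persons.
    for al, bo, ce in product([True, False], repeat=3):
        p = Fraction(1)
        for in_bath in (al, bo, ce):
            p *= PRIOR if in_bath else 1 - PRIOR
        yield {"Ba": al or bo or ce, "Al": al, "Bo": bo, "Ce": ce}, p

def p_bob(evidence):
    # P(Bo=true | evidence) by summing joint probabilities of consistent worlds.
    consistent = [
        (w, p) for w, p in worlds()
        if all(w[name] == value for name, value in evidence.items())
    ]
    total = sum(p for _, p in consistent)
    return sum(p for w, p in consistent if w["Bo"]) / total

steps = [
    {},                                      # prior, no evidence yet
    {"Ba": True},                            # the door is closed
    {"Ba": True, "Ce": False},               # Cecil stands outside
    {"Ba": True, "Ce": False, "Al": False},  # Alice is in her room
]
chain = [p_bob(e) for e in steps]
print(" -> ".join(str(p) for p in chain))  # 1/2 -> 4/7 -> 2/3 -> 1
```

Each excluded alternative explanation for the closed door shifts probability mass to the remaining candidates, until Bob is the only explanation left.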

**3.3.3 XOR-Gate Collider**

Now, we model the causal structure of the scenario as a simple XOR gate with the collider pattern (Pearl et al., 2016; Fig. 6-10).

We set a limit to the number of people in the bathroom at the same time. Only one person or no one at all is allowed. If two people want to enter the room at the same time, they have to go back to the living room and resolve their conflict. During this time the bathroom is empty.

It is also assumed that the behavior of the persons is independent of each other and does not depend on external influences. So no arrow goes into person nodes and there are no arrows between them.

**3.3.3.1 Prior beliefs about the Marginal PMF P(Ba)**

The marginal probability of bathroom occupancy *P(Ba)* is (Fig. 7)

$$P(Ba) = \sum_{Al} \sum_{Bo} \sum_{Ce} P(Ba, Al, Bo, Ce) = \sum_{Al} \sum_{Bo} \sum_{Ce} P(Ba |Al, Bo, Ce) \cdot P(Al) \cdot P(Bo) \cdot P(Ce) $$

$$ = \sum_{Al} P(Al) \sum_{Bo} P(Bo) \sum_{Ce} P(Ce) \cdot P(Ba |Al, Bo, Ce) = \left(\begin{array}{r} \frac{3}{8} \\ \frac{5}{8} \end{array}\right) $$

This means that the bathroom is occupied only 37.5% of the time, because

$$P(Ba) = \left(\begin{array}{r} \frac{3}{8} \\ \frac{5}{8} \end{array}\right) = \left(\begin{array}{r} 0.375 \\ 0.625 \end{array}\right) $$.
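The value 3/8 can be verified by counting. With the XOR gate as used here, the bathroom is occupied exactly when one of the three persons is inside, and each of the 8 parent configurations has probability $$(\frac{1}{2})^3$$:

$$P(Ba = true) = P(Al, \neg Bo, \neg Ce) + P(\neg Al, Bo, \neg Ce) + P(\neg Al, \neg Bo, Ce) = 3 \cdot \left(\frac{1}{2}\right)^3 = \frac{3}{8}$$

Note that this "exactly one" reading differs from a three-input parity XOR, which would also be true when all three persons are inside and would give 4/8 instead.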

We can approach the solution via one *multivariate* and three *univariate marginal conditional* queries (Fig. 8).

The *conditional multivariate* query is $$P(Al, Bo, Ce | Ba = true) = ? $$

and the *conditional univariate* ones:

$$P(Al | Ba = true) = \sum_{Bo, Ce} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{1}{3} \\ \frac{2}{3} \end{array}\right) = \left(\begin{array}{r} 0.333 \\ 0.667 \end{array}\right) $$

$$P(Bo | Ba = true) = \sum_{Al, Ce} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{1}{3} \\ \frac{2}{3} \end{array}\right) = \left(\begin{array}{r} 0.333 \\ 0.667 \end{array}\right) $$

$$P(Ce | Ba = true) = \sum_{Al, Bo} P(Al, Bo, Ce | Ba = true) = \left(\begin{array}{r} \frac{1}{3} \\ \frac{2}{3} \end{array}\right) = \left(\begin{array}{r} 0.333 \\ 0.667 \end{array}\right) $$

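This means that the evidence *bathroom door closed* lowers each person's probability of being in the bathroom from the prior 1/2 (= 0.500) to the posterior 1/3 (= 0.333). Under the XOR gate an occupied bathroom holds exactly one person, so each of the three is the one inside with probability 1/3. The posterior probability 1/3 is the same for all three persons, so the evidence does not discriminate between them.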
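The posterior 1/3 can be derived by hand with Bayes' rule. As a sketch under the model assumptions (independent priors of 1/2, bathroom occupied exactly when one person is inside):

$$P(Al = true | Ba = true) = \frac{P(Ba = true | Al = true) \cdot P(Al = true)}{P(Ba = true)} = \frac{\frac{1}{4} \cdot \frac{1}{2}}{\frac{3}{8}} = \frac{1}{3}$$

Here $$P(Ba = true | Al = true) = P(Bo = false) \cdot P(Ce = false) = \frac{1}{4}$$, because with Alice inside the bathroom counts as occupied only if both Bob and Cecil stay out.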

Now, *Cecil wants to go to the bathroom but finds it occupied.* Since Cecil is standing outside the door, he is not in the bathroom himself (Ce = false).

This time we have one *multivariate* and two *univariate marginal* conditional queries (Fig. 9):

$$P(Al, Bo | Ba = true, Ce = false) = ? $$

and

$$P(Al | Ba = true, Ce = false) = \sum_{Bo} P(Al, Bo | Ba = true, Ce = false) = \left(\begin{array}{r} \frac{1}{2} \\ \frac{1}{2} \end{array}\right) = \left(\begin{array}{r} 0.500 \\ 0.500 \end{array}\right) $$

$$P(Bo | Ba = true, Ce = false) = \sum_{Al} P(Al, Bo | Ba = true, Ce = false) = \left(\begin{array}{r} \frac{1}{2} \\ \frac{1}{2} \end{array}\right) = \left(\begin{array}{r} 0.500 \\ 0.500 \end{array}\right)$$

This means that the additional evidence *Cecil is not in the bathroom* raises the probabilities for Alice and Bob from 1/3 (= 0.333) to 1/2 (= 0.500). The posterior probability 1/2 is the same for the two remaining persons, Alice and Bob, so this evidence is not discriminative either.

*Cecil then goes to Alice's room and sees she is there.*

This time we have only *one univariate marginal* conditional query (Fig. 10):

$$P(Bo | Ba = true, Ce = false, Al = false) = \left(\begin{array}{r} \frac{1}{1} \\ \frac{0}{1} \end{array}\right) = \left(\begin{array}{r} 1.000 \\ 0.000 \end{array}\right) $$
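For the XOR-gate model the whole chain of updates can be reproduced by simple counting. The following Python sketch relies on the assumption that all priors are 1/2, so that all 8 configurations are equally likely and conditioning reduces to counting the configurations consistent with the evidence. This shortcut would not be valid for other priors. The names `occupied`, `p_bob` and `xor_chain` are illustrative.

```python
from fractions import Fraction
from itertools import product

def occupied(al, bo, ce):
    # XOR gate as used in this text: exactly one person is in the bathroom.
    return (al + bo + ce) == 1

# All 8 configurations (al, bo, ce); equally likely because every prior is 1/2.
configs = list(product([True, False], repeat=3))

def p_bob(keep):
    # Fraction of the configurations accepted by `keep` in which Bob is inside.
    consistent = [c for c in configs if keep(*c)]
    return Fraction(sum(1 for c in consistent if c[1]), len(consistent))

xor_chain = [
    p_bob(lambda al, bo, ce: True),                                        # prior
    p_bob(lambda al, bo, ce: occupied(al, bo, ce)),                        # door is closed
    p_bob(lambda al, bo, ce: occupied(al, bo, ce) and not ce),             # Cecil outside
    p_bob(lambda al, bo, ce: occupied(al, bo, ce) and not ce and not al),  # Alice in her room
]
print(" -> ".join(str(p) for p in xor_chain))  # 1/2 -> 1/3 -> 1/2 -> 1
```

Unlike in the OR-gate model, the first piece of evidence lowers Bob's probability before the later observations raise it again to certainty.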

**4. Summary**

The modelled situation resembles a mini detective story. The generative probabilistic model generates room-occupancy data. The occupancy of the bathroom depends on the activities of three people. This was modelled by an OR gate and an XOR gate, depending on how many persons may be in the room at the same time. The detective infers the current whereabouts of the persons from the evidence of a closed door. This can be modelled by conditional inference in Bayesian style.

It can be shown how the initial uncertainty about the location of a certain person (here: Bob) is reduced, within the framework of the model assumptions, to perfect certainty.

## WebPPL-Code

## Output

## References and Further Reading

Barber, David. *Bayesian Reasoning and Machine Learning*, 2012, Cambridge University Press, ISBN: 978-0-521-51814-7

Barber, David. *Bayesian Reasoning and Machine Learning*, 2016; http://web4.cs.ucl.ac.uk/staff/D.Barber/textbook/091117.pdf (visited 2019/01/19)

Pearl, Judea; Glymour, Madelyn; and Jewell, Nicholas P. *Causal Inference in Statistics: A Primer*, 2016, Wiley, ISBN: 9781119186847

Pishro-Nik, Hossein. *Introduction to Probability, Statistics, and Random Processes*, 2014, Kappa Research, LLC, ISBN-13: 978-0990637202

Pishro-Nik, Hossein. *Introduction to Probability, Statistics, and Random Processes*, https://www.probabilitycourse.com/ (visited 2019/01/20)