Visualizing Independent Events for Probabilities: No More Venn Diagrams

Brandon Singleton
Feb 20, 2020

One of the most interesting aspects of science is determining how events are related. Did the events happen together by cause, or did they happen independently? Does low atmospheric pressure cause rain? Does playing outside in the cold lead you to catch a cold? Am I at higher risk for diabetes if my father has diabetes?

One way to answer these questions is with probabilities. Probabilities help us measure how likely it is that two events occur together. If two events are closely related, then their probabilities will show it. As soon as one event occurs, the other becomes much more likely (or perhaps much less likely). If two events are completely unrelated, their probabilities will also show it. It won't matter whether the first event has occurred or not occurred; the chances of the second event remain exactly the same in either case.

I'm going to explain the basic mathematics of using probabilities to test whether two events are independent, and I will use a visual diagram to make the concepts clear. Most people use Venn diagrams for probability problems, but Venn diagrams do not illustrate independent events very clearly. So I designed a modified version of the Venn diagram that uses squares and rectangles instead of circles. It lets you see at a single glance whether two events are independent. Let's take a look!

Caution: Don't Use Venn Diagrams

Before we get to my model, I'll explain why I don't use Venn diagrams.

Circles are difficult to draw symmetrically without a compass. Rectangles are easier, especially on graph paper. The same is true of graphing with technology—rectangles are easier than circles.
Venn diagrams are hard to construct in correct proportion to the probabilities they represent. If circle A needs to represent 1/3 and circle B needs to represent 1/2, how big should their radii be? That's an annoying math problem getting in the way of our focus problem. Let's steer clear of irrational numbers.
The intersection of two circles makes a lens shape, which is hard to compare to the complete circles. So, even if a Venn diagram is accurately drawn to scale, it's hard to see the proportions we are interested in.
One last point is not unique to Venn diagrams but we need to get it out of the way. If event A and event B are independent and unrelated, shouldn't the circles be apart with no overlap? Nope! Disjoint events are events that can never happen together. (Example: You can't have your cake and eat it too.) Those events have disjoint circles. Independent events are those that sometimes happen together (hence the overlap) but they do so randomly rather than due to a relationship.

Determining if events are independent by simply looking at a Venn diagram is impossible. You have to calculate and compare ratios between the sections in the diagram. I've devised a new visual model that indicates immediately whether two events are independent.

Setting Things Up

In our model, we are going to represent probabilities as geometric areas. The highest probability an event can have is 100% or 1, so we'll construct a reference square with area 1. Let's put it on the coordinate plane with one corner on the origin. The opposite corner is at (1,1).

Now we need to imagine two events, A and B. Let's say A is the event that I eat an apple during my meal, and B is the event that I brush my teeth after the meal. We need to insert each of these events into the square model. We're avoiding circles, remember. After some trial and error, I realized that the easiest way to do it is to assign A to the x-axis and B to the y-axis. If the probability of A is 1/4 (I eat an apple during one fourth of my meals), then we'll mark 1/4 on the x-axis. We'll draw a vertical line segment at x = 1/4 and shade the rectangle to its left for event A.

Now, let's say I brush my teeth after one third of my meals, so the probability of B is 1/3. This time we'll go to 1/3 on the y-axis and draw a horizontal line, shading the bar produced for event B.

Now combine both events in one diagram. In the lower left corner we see the purple overlap in shading that marks the intersection of A and B. The area of that rectangle measures how likely it is that I eat an apple during my meal and brush my teeth afterward.

Calculating Probabilities

Using our area model, we can talk about probabilities of various scenarios. We already know that A has probability 1/4, and we see that A shades 1/4 of the original square. The square represents 100% of the outcomes (all meals) and A represents the 25% of meals during which I eat an apple. Similarly, B represents the 1/3 of meals after which I brush my teeth. How about the white rectangle, where I neither eat an apple nor brush my teeth? Its sides measure 3/4 horizontally and 2/3 vertically. The product is 3/4 ⋅ 2/3 = 1/2. In half of my meals, I neither eat an apple nor brush my teeth (my dentist would be disappointed).
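If you would like to check the area arithmetic, here is a minimal Python sketch. It assumes the independent layout described so far, where every region is a plain rectangle; the variable names and region labels are my own choices for illustration.

```python
from fractions import Fraction

# Probabilities taken from the apple and tooth-brushing example.
p_a = Fraction(1, 4)  # eat an apple: width of the vertical bar
p_b = Fraction(1, 3)  # brush teeth: height of the horizontal bar

# In this layout each region is a rectangle, so its area is width times height.
regions = {
    "apple and brush": p_a * p_b,
    "apple only": p_a * (1 - p_b),
    "brush only": (1 - p_a) * p_b,
    "neither": (1 - p_a) * (1 - p_b),
}

for name, area in regions.items():
    print(f"{name}: {area}")

# The four regions tile the unit square, so their areas should add up to 1.
total_area = sum(regions.values())
print("total:", total_area)
```

Using exact fractions instead of floating-point numbers keeps the results in the same form as the hand calculations.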

We can calculate other probabilities too. Let's disregard all the meals in which I eat apples by blackening out region A. Now we have a new probability space with a reduced total size of 3/4. In this reduced space, how often do I brush my teeth? The remaining area of B is 3/4 of what it used to be (which was 1/3), so 3/4 ⋅ 1/3 = 1/4. A probability is calculated just like any fraction: the part you are interested in goes in the numerator, and the total goes in the denominator. That means the probability of B within the non-apple meals is (1/4) ÷ (3/4), or 0.25 / 0.75, which is 1/3.

What if we switch to meals when I did eat an apple? We can blacken out the non-apple meals, and again look at teeth brushing. The total area of apple-eating meals is 1/4 and within that, the teeth brushing region has area 1/4 ⋅ 1/3 = 1/12. The probability of brushing my teeth given that I ate an apple is (1/12) ÷ (1/4) = 1/3.

The process of blackening out part of the sample space and calculating probabilities in a reduced sample space is called conditional probability. We've now seen the reasoning behind the algebraic definition. The meaning of P(B | A) is the probability of B given the condition that A has occurred, as shown in the picture directly above. The algebraic definition is

P(B | A) = P(B ∩ A) / P(A).

Can you relate the parts of this expression to the picture?
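As a hedged sketch of the same definition in code, the helper below divides the joint area by the area of the condition. The function name `conditional` and the variable names are hypothetical, and the inputs are the areas read off the independent diagram.

```python
from fractions import Fraction

def conditional(p_joint, p_given):
    """Return P(B | A) = P(B ∩ A) / P(A) for exact fractional inputs."""
    if p_given == 0:
        raise ValueError("cannot condition on an event with probability 0")
    return Fraction(p_joint) / Fraction(p_given)

p_a = Fraction(1, 4)            # apple meals (the part we keep or black out)
p_b_and_a = Fraction(1, 12)     # brush and apple: the overlap rectangle
p_b_and_not_a = Fraction(1, 4)  # brush and no apple: the rest of bar B

# Keep only apple meals, then only non-apple meals.
p_b_given_a = conditional(p_b_and_a, p_a)
p_b_given_not_a = conditional(p_b_and_not_a, 1 - p_a)

print("P(B | A)     =", p_b_given_a)
print("P(B | not A) =", p_b_given_not_a)
```

The denominator plays the role of the part of the square that is not blackened out, and the numerator is the piece of B that survives inside it.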

Independence in our Model

Have you noticed something interesting? The probability of brushing teeth has been 1/3 no matter whether we consider all meals, non-apple meals, or apple meals! This makes sense, because all we are doing is blackening out vertical bars. It doesn't matter how wide or narrow the black bar is; whatever remains is still partitioned horizontally, at exactly 1/3 of the height. Visually, we can tell that the teeth-brushing portion will be 1/3 of the whole.

We could do the same analysis, but this time restricting the meals to brushing and non-brushing subsets using horizontal black bars. Try it! The probability of eating an apple should remain constant, at 1/4.

Now it's time to tip my hand. By setting up our visual aid the way we have, we have actually forced events A and B to be independent. By definition, A and B are independent if the occurrence of A has no effect on the probability of B, and vice versa. Algebraically, P(B | A) = P(B) and P(A | B) = P(A).

How does this all work? When we assigned event A to the x-axis as a blue-shaded vertical bar, and B to the y-axis as a red-shaded horizontal bar, we automatically ensured that the definition of independent events would be met. Because, as you saw, blackening out vertical bars has no effect on the ratios of the horizontal partitions. Likewise, blackening out horizontal bars has no effect on the ratios of vertical partitions. The red portion is always 1/3 of every vertical slice, and the blue portion is always 1/4 of every horizontal slice. Cool!
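The claim about vertical slices can be checked with a small sketch. It assumes the independent layout, where the brushing bar has the same height across the whole square; the function `brushing_share` and the sample strips are made up for illustration.

```python
from fractions import Fraction

p_b = Fraction(1, 3)  # height of the brushing bar in the independent layout

def brushing_share(x_left, x_right, bar_height=p_b):
    """Share of the vertical strip [x_left, x_right] covered by the brushing bar."""
    width = Fraction(x_right) - Fraction(x_left)
    if width <= 0:
        raise ValueError("strip must have positive width")
    strip_area = width * 1             # the strip spans the full unit height
    brushed_area = width * bar_height  # the bar is equally tall everywhere
    return brushed_area / strip_area

# Apple meals, non-apple meals, an arbitrary strip, and the whole square.
strips = [
    (0, Fraction(1, 4)),
    (Fraction(1, 4), 1),
    (Fraction(1, 10), Fraction(9, 10)),
    (0, 1),
]
shares = [brushing_share(left, right) for left, right in strips]
print(shares)
```

The width cancels in the ratio, which is the algebraic version of saying that it does not matter how wide the black bar is.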

Testing for Independence

We are finally ready to talk about how to verify whether events A and B are, in fact, independent. We pretended they were when we constructed our model above. So, let's continue to pretend that they are independent for a moment. If that is true, then what is the probability of my eating an apple and brushing my teeth on a given meal? We need to calculate the area of intersection. The area model makes this step quite clear. We multiply the two probabilities together. We are finding 1/3 of 1/4 of the whole square, or 1/4 of 1/3 of the square; either order works. Thus, the probability of A and B together is 1/12.

P(A ∩ B) = 1/12.

Our model has helped us understand an important theorem in probability. If two events, A and B, are independent, then P(A ∩ B) = P(A) ⋅ P(B). This is called the multiplication rule (see Zwanch, 2019 for a proof and discussion).

Many people are taught the multiplication rule of calculating independent events. It works nicely with dice rolls—the chances of rolling an odd number and then a four are 1/2 ⋅ 1/6 = 1/12. But it isn't always clear why we multiply or how the multiplication relates to the definition of independent events. The reasons are demonstrated in our model. We derived the model by forcing A and B to be independent, putting them on different axes so that the occurrence of each has no effect on the probability of the other. The model then shows that their intersection is a fractional part of a fractional part, found using multiplication. (You can think of it as a rectangular area with length and width, too.)
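The dice example can be verified by brute force. This sketch assumes two fair six-sided dice rolled in order and simply counts outcomes; the helper `probability` is a hypothetical name, not a library function.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first roll, second roll) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def probability(event):
    """Fraction of outcomes for which the predicate `event` is true."""
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))

p_odd_first = probability(lambda o: o[0] % 2 == 1)
p_four_second = probability(lambda o: o[1] == 4)
p_both = probability(lambda o: o[0] % 2 == 1 and o[1] == 4)

print("P(odd first)    =", p_odd_first)
print("P(four second)  =", p_four_second)
print("P(both)         =", p_both)
print("product matches:", p_both == p_odd_first * p_four_second)
```

Counting the pairs directly gives the same answer as multiplying, which is what the area model predicts when the two rolls sit on different axes.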

All of this modeling has applied to independent events only. What if A and B are not actually independent? The only way to know for certain is to measure empirically how often A and B occur together.

So, let's say that I collect my data on apple eating and tooth brushing. I eat apples at 1/4 of my meals, as we said before, and I brush my teeth after 1/3 of my meals, but the fraction of meals at which I do both of those things is 1/24.

Remember, if apple eating and teeth brushing were entirely independent (randomly distributed), then we'd expect to do both at 1/12 of meals (as we saw in the model). But in reality I only do so during 1/24 of my meals. So apple eating and tooth brushing are not independent. Since the true value of 1/24 is smaller than the model's predicted value of 1/12, we could say that the events are negatively related. Knowing that I've eaten an apple makes it less likely than usual that I'll brush my teeth, and vice versa. There's a systematic pattern that makes the two events occur together less often than we would expect from the chance of each event happening alone.

So the way to test if events are independent is to pretend that they are, calculate the likelihood of their occurring together (with the multiplication rule), and then compare it against the actual measure of how often they occur together. If P(A ∩ B) ≠ P(A) ⋅ P(B) then the events are not independent.
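Here is a minimal sketch of that comparison, assuming the three probabilities are known exactly as fractions. The function name `compare_with_independence` and its labels are my own. With sampled data the two numbers will rarely match exactly, so a tolerance or a statistical test would be needed, which this sketch does not attempt.

```python
from fractions import Fraction

def compare_with_independence(p_a, p_b, p_joint):
    """Compare the observed P(A ∩ B) with the product P(A) * P(B)."""
    expected = p_a * p_b  # what the independent model predicts
    if p_joint == expected:
        label = "independent"
    elif p_joint > expected:
        label = "positively related"
    else:
        label = "negatively related"
    return expected, label

# Observed values from the apple and tooth-brushing data.
expected, label = compare_with_independence(
    Fraction(1, 4), Fraction(1, 3), Fraction(1, 24)
)
print("expected if independent:", expected)
print("verdict:", label)
```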

As dutiful scientists, we can conjecture why or how the events are related. Maybe apples are convenient lunch snacks and tooth brushing is less convenient during lunch at work, so they only occur together in the rare instances that I use my toothbrush at work or I eat an apple for dinner. Or maybe weekend meals have something to do with it.

Modeling Events that are Not Independent

It's easiest to model independent events using the method above. But now that we know A and B are not independent, we can still adjust the model to indicate the true relationship. We'll pick one event, A, to leave alone and adjust B.

We need to modify the diagram so that the intersection of A and B is 1/24 instead of 1/12. Pretend that the vertical line at x = 1/4 cuts region B into two sections. We're going to shrink the left section from 1/12 down to 1/24 in area, a loss of 1/24 (it ends up half as large). To compensate, we'll raise the right section until it gains that 1/24 back, growing from 1/4 to 7/24 in area. Now we have accurately represented the relationship between A and B.

We can calculate some probabilities again. The overall likelihood of brushing teeth is 1/24 + 7/24 = 1/3, the same as before. But the amount of tooth-brushing that overlaps apple eating is less because we adjusted and compensated. Let's calculate conditional probabilities. First, suppose we know that we have eaten an apple (vertical blue+purple bar). We imagine blackening out the rest. The probability of brushing our teeth is (1/24) ÷ (1/4) = 1/6. Now suppose we know we have not eaten an apple. We blacken the correct section, and the probability of brushing our teeth is (7/24) ÷ (3/4) = 7/18. That's nearly 40% of the time, more than twice as often as when we do eat apples. So, we see how the two events influence one another in the conditional probabilities.
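To double-check these numbers, here is a hedged sketch that recomputes the two column heights of the adjusted diagram from the three measured values. The variable names are my own; the height of each column is its area divided by its width, which is exactly the conditional probability.

```python
from fractions import Fraction

p_a = Fraction(1, 4)         # apple meals: width of the left column
p_b = Fraction(1, 3)         # brushing overall
p_a_and_b = Fraction(1, 24)  # measured overlap

# Whatever brushing is not in the apple column must be in the other column.
p_b_and_not_a = p_b - p_a_and_b

# Column height = area / width = conditional probability of brushing.
height_left = p_a_and_b / p_a             # P(B | A)
height_right = p_b_and_not_a / (1 - p_a)  # P(B | not A)

print("brush and no apple:", p_b_and_not_a)
print("P(B | A)     =", height_left)
print("P(B | not A) =", height_right)
```

Running it prints exact fractions, so the results can be compared directly with the hand calculation.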

Summary

By modifying the Venn diagram representation into rectangles on the unit-square of the coordinate plane, we are able to assign each event to a unique axis and model them as independent events. The model implies the multiplication rule for calculating the probability of independent events occurring together. We test whether two events are independent by checking if the empirical value of P(A ∩ B) equals the derived value P(A) ⋅ P(B) from the model. If the values do not match, we conclude that the events are not independent. Either the value of P(A ∩ B) is higher than predicted, in which case the events are positively related (each makes the other more likely), or the value is lower than predicted and the events are negatively related (each makes the other less likely).

The square model is much easier to read than the Venn diagram. We can instantly see how strong or extreme the relationship is between the events. Event A cuts event B into a left and right column. If the discrepancy between rectangle heights in each column is very large, the events have a strong relationship. If the discrepancy is small, the relationship is weak and the events are nearly independent. If the heights are exactly the same, then A and B are independent. All of this can be determined in a single glance!
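One way to put a number on that discrepancy is to subtract the right column's height from the left column's. This is only a sketch of one possible measure, not a standard statistic from the article; the function name `height_gap` is hypothetical.

```python
from fractions import Fraction

def height_gap(p_a, p_b, p_a_and_b):
    """Left column height minus right column height: P(B | A) - P(B | not A)."""
    if not 0 < p_a < 1:
        raise ValueError("P(A) must be strictly between 0 and 1")
    left = p_a_and_b / p_a
    right = (p_b - p_a_and_b) / (1 - p_a)
    return left - right

# The independent model versus the measured overlap of 1/24.
independent_gap = height_gap(Fraction(1, 4), Fraction(1, 3), Fraction(1, 12))
observed_gap = height_gap(Fraction(1, 4), Fraction(1, 3), Fraction(1, 24))

print("gap if independent:", independent_gap)
print("gap as measured:   ", observed_gap)
```

A gap of zero means the heights match and the events are independent. A negative gap means the left column is shorter, so the events are negatively related, and a positive gap means the opposite.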

Two-way tables are another common way to represent probabilities of events. The spatial layout of a two-way table matches up nicely with our area model. A student could use the two-way table and some graph paper to draw the square model.
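As a rough illustration of that correspondence, the sketch below builds the four cells of a two-way table from P(A), P(B), and P(A ∩ B). The function name `two_way_table` and the row and column labels are assumptions made for this example.

```python
from fractions import Fraction

def two_way_table(p_a, p_b, p_a_and_b):
    """Joint probabilities as a nested dict: rows are B / not B, columns are A / not A."""
    return {
        "brush": {
            "apple": p_a_and_b,
            "no apple": p_b - p_a_and_b,
        },
        "no brush": {
            "apple": p_a - p_a_and_b,
            "no apple": 1 - p_a - p_b + p_a_and_b,
        },
    }

table = two_way_table(Fraction(1, 4), Fraction(1, 3), Fraction(1, 24))

# Print the table with the same left/right split as the square model.
columns = ["apple", "no apple"]
print(f"{'':10}" + "".join(f"{c:>10}" for c in columns))
for row_name, row in table.items():
    print(f"{row_name:10}" + "".join(f"{str(row[c]):>10}" for c in columns))

grand_total = sum(cell for row in table.values() for cell in row.values())
```

Each cell is the area of one rectangle in the square model, and the column totals give back the widths 1/4 and 3/4.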

I searched the web far and wide for a similar visual model of probabilistic events and could not find it. I therefore have no idea what to call the model. A Venn diagram but with squares… Svenn diagram?

(Photo by Arseny Togulev on Unsplash)

References

Zwanch, K. (2019). A preliminary genetic decomposition of probabilistic independence. The Mathematics Educator, 28(1), 3-26. Retrieved from http://tme.coe.uga.edu