Introduction
In the last section, we established the five basic rules of probability, including two restricted rules: the Addition Rule for Disjoint Events and the Multiplication Rule for Independent Events. We also established a General Addition Rule, for which the events need not be disjoint. To complete our set of rules, we still need a General Multiplication Rule, for which the events need not be independent. To establish such a rule, however, we first need to understand the important concept of conditional probability.

This section will be organized as follows: We’ll first introduce the idea of conditional probability, and use it to formalize our definition of independent events, which in the first module was presented only in an intuitive way. We will then develop the General Multiplication Rule, a rule that will tell us how to find P(A and B) in cases when the events A and B are not necessarily independent. We’ll conclude with a discussion of probability trees, a method of displaying conditional probability visually that is very helpful in solving problems.
Another way to visualize conditional probability is using a Venn diagram:

In both the two-way table and the Venn diagram, the reduced sample space (consisting only of males) is shaded light green, and within this sample space, the event of interest (having ears pierced) is shaded darker green. The two-way table illustrates the idea via counts, while the Venn diagram converts the counts to probabilities, which are presented as regions rather than cells.
We may work with counts, as presented in the two-way table, to write
P(E | M) = 36/180.
Or we can work with probabilities, as presented in the Venn diagram, by writing
P(E | M) = (36/500) / (180/500).
We want, however, to write our formal expression for conditional probabilities in terms of other, ordinary probabilities, so the definition of conditional probability will grow out of the Venn diagram.
Notice that
P(E | M) = (36/500) / (180/500) = P(M and E) / P(M).
Generalizing, we have a formal definition of conditional probability:
The conditional probability of event B, given event A, is P(B | A) = P(A and B) / P(A).
Comments
1. Note that when we evaluate the conditional probability, we always divide by the probability of the given event. The probability of both goes in the numerator.
2. The above formula holds as long as P(A) > 0, since we cannot divide by 0. In other words, we should not seek the probability of an event given that an impossible event has occurred.
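The definition and the two comments above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `conditional_probability` and the variable names are assumptions made for this example, and exact fractions are used so that the result matches the hand calculation 36/180.

```python
from fractions import Fraction


def conditional_probability(p_a_and_b, p_a):
    """Return P(B | A) = P(A and B) / P(A).

    The probability of the given event A goes in the denominator,
    so the result is undefined when P(A) = 0.
    """
    if p_a == 0:
        raise ValueError("P(A) must be greater than 0")
    return p_a_and_b / p_a


# Counts from the ear-piercing example: 500 students, 180 males,
# 36 males with pierced ears.
total = 500
p_m = Fraction(180, total)       # P(M)
p_m_and_e = Fraction(36, total)  # P(M and E)

p_e_given_m = conditional_probability(p_m_and_e, p_m)
print(p_e_given_m, float(p_e_given_m))  # 1/5 0.2
```

Working with probabilities (36/500 divided by 180/500) gives the same answer as working with counts (36/180), because the common total of 500 cancels.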
Let’s see how we can use this formula in practice:
Example
On the “Information for the Patient” label of a certain antidepressant, it is claimed that based on some clinical trials, there is a 14% chance of experiencing sleeping problems known as insomnia (denote this event by I), there is a 26% chance of experiencing headache (denote this event by H), and there is a 5% chance of experiencing both side effects (I and H).
(a) Suppose that the patient experiences insomnia; what is the probability that the patient will also experience headache?
Since we know (or it is given) that the patient experienced insomnia, we are looking for P(H | I). According to the definition of conditional probability:
P(H | I) = P(H and I) / P(I) = 0.05/0.14 ≈ 0.357.
(b) Suppose the drug induces headache in a patient; what is the probability that it also induces insomnia?
Here, we are given that the patient experienced headache, so we are looking for P(I | H).
Using the definition, P(I | H) = P(I and H) / P(H) = 0.05/0.26 ≈ 0.1923.
Comment
Note that the answers to (a) and (b) above are different. In general, P(A | B) does not equal P(B | A). We’ll come back and illustrate this point later in this module.
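As a quick check of parts (a) and (b), the two conditional probabilities can be computed side by side. This is a minimal sketch using the label's figures; the variable names are chosen here for illustration only.

```python
# Probabilities taken from the antidepressant label example.
p_i = 0.14        # P(I): insomnia
p_h = 0.26        # P(H): headache
p_i_and_h = 0.05  # P(I and H): both side effects

# Same numerator in both cases; only the given event (the denominator) changes.
p_h_given_i = p_i_and_h / p_i  # P(H | I)
p_i_given_h = p_i_and_h / p_h  # P(I | H)

print(round(p_h_given_i, 4))  # 0.3571
print(round(p_i_given_h, 4))  # 0.1923
```

The numerator P(I and H) is shared, so the two answers differ exactly because P(I) and P(H) differ.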
The purpose of the following activity is to give you guided practice in using the definition of conditional probability, and teach you how the Complement Rule works with conditional probability.
Compare P(B | A) and P(B)
As we saw in the Exploratory Data Analysis section, whenever a situation involves more than one variable, it is generally of interest to determine whether or not the variables are related. In probability, we talk about independent events, and in the first module we said that two events A and B are independent if event A occurring does not affect the probability that event B will occur. Now that we’ve introduced conditional probability, we can formalize the definition of independence of events and develop four simple ways to check whether two events are independent or not. We will introduce these “independence checks” using examples, and then summarize.
Consider again the two-way table for all 500 students in a particular high school, classified according to gender and whether or not they have one or both ears pierced.
| Gender | Pierced | Not Pierced | Total |
|---|---|---|---|
| Male | 36 | 144 | 180 |
| Female | 288 | 32 | 320 |
| Total | 324 | 176 | 500 |
Would you expect those two variables to be related? That is, would you expect having pierced ears to depend on whether the student is male or female? Or, to put it yet another way, would knowing a student’s gender affect the probability that the student’s ears are pierced? To answer this, we may compare the overall probability of having pierced ears to the conditional probability of having pierced ears, given that a student is male. Our intuition would tell us that the latter should be lower: male students tend not to have their ears pierced, whereas female students do. Indeed, for students in general, the probability of having pierced ears (event E) is P(E) = 324/500 = 0.648. But the probability of having pierced ears given that a student is male is only P(E | M) = 36/180 = 0.20.
As we anticipated, P(E | M) is lower than P(E). The probability of a student having pierced ears changes (in this case, gets lower) when we know that the student is male, and therefore the events E and M are dependent. (If E and M were independent, knowing or not knowing that the student is male would not have made a difference … but it did.)
This example illustrates that one method for determining whether two events are independent is to compare P(B | A) and P(B).
If the two are equal (i.e., knowing or not knowing whether A has occurred has no effect on the probability of B occurring) then the two events are independent. Otherwise, if the probability changes depending on whether we know that A has occurred or not, then the two events are not independent. Similarly, using the same reasoning, we can compare P(A | B) and P(A).
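The comparison of P(B | A) with P(B) can be sketched directly from the counts in the two-way table. The dictionary layout and variable names below are assumptions made for this illustration; exact fractions are used so that the equality test is not affected by floating-point rounding.

```python
from fractions import Fraction

# Counts from the two-way table of 500 students.
counts = {
    ("male", "pierced"): 36,
    ("male", "not pierced"): 144,
    ("female", "pierced"): 288,
    ("female", "not pierced"): 32,
}
total = sum(counts.values())  # 500

pierced = counts[("male", "pierced")] + counts[("female", "pierced")]  # 324
males = counts[("male", "pierced")] + counts[("male", "not pierced")]  # 180

p_e = Fraction(pierced, total)                              # P(E)
p_e_given_m = Fraction(counts[("male", "pierced")], males)  # P(E | M)

# E and M are independent only if knowing M leaves P(E) unchanged.
independent = p_e_given_m == p_e
print(float(p_e), float(p_e_given_m), independent)  # 0.648 0.2 False
```

With real sample data, an exact equality test is usually too strict; here it is appropriate because the table is treated as the whole population of 500 students.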
The General Multiplication Rule: Defined
Now that we have an understanding of conditional probabilities and can express them with concise notation, and have a more formal understanding of what it means for two events to be independent, we can finally establish the General Multiplication Rule, a formal rule for finding P(A and B) that applies to any two events, whether they are independent or dependent.
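Rearranging the definition of conditional probability given earlier, P(B | A) = P(A and B) / P(A), yields P(A and B) = P(A) * P(B | A) whenever P(A) > 0. The sketch below assumes this rearranged form is the rule in question and checks it against the ear-piercing data; the variable names are illustrative.

```python
from fractions import Fraction

# Values from the ear-piercing example.
p_m = Fraction(180, 500)         # P(M)
p_e_given_m = Fraction(36, 180)  # P(E | M)

# Assumed form of the rule: P(M and E) = P(M) * P(E | M).
p_m_and_e = p_m * p_e_given_m

# The product agrees with the probability read directly from the table.
print(p_m_and_e == Fraction(36, 500))  # True
```

Note that the calculation does not require M and E to be independent, which is what distinguishes it from the restricted Multiplication Rule for Independent Events.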
Probability Trees: Defined
So far, when two categorical variables are involved, we have displayed counts or probabilities for various events with two-way tables and with Venn diagrams. Another display tool, called a probability tree, is particularly useful for showing probabilities when the events occur in stages and conditional probabilities are involved.
Example
A sales representative tells his friend that the probability of landing a major contract by the end of the week, resulting in a large commission, is 0.4. If the commission comes through, the probability that he will indulge in a weekend vacation in Bermuda is 0.9. Even if the commission doesn’t come through, he may still go to Bermuda, but only with probability 0.3.
First, let’s identify the given probabilities for events involving C (the commission comes through) and V (the sales rep takes a Bermuda vacation):
P(C) = 0.4 [and so P(not C) = 0.6],
P(V | C) = 0.9 [and so P(not V | C) = 0.1], and
P(V | not C) = 0.3 [and so P(not V | not C) = 0.7].
There are two stages in the problem. First, the sales rep will either get the commission or not.
Second, based on what happened in the first stage, the sales rep will either take the Bermuda vacation or not.
We follow exactly the same reasoning when we build the probability tree.
There are two important things to note here:
1. The probabilities in the first branch-off are non-conditional probabilities: P(C) = 0.4, P(not C) = 0.6. However, the probabilities that appear in the second branch-off are conditional probabilities. The top two branches assume that C occurred: P(V | C) = 0.9, P(not V | C) = 0.1. The bottom two branches assume that not C occurred: P(V | not C) = 0.3, P(not V | not C) = 0.7.

2. The second thing to note is that probabilities of branches that branch out from the same point always add up to one.
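The two-stage structure of the tree can be sketched in code. The example below is a hedged illustration, not part of the original lesson: it assumes that the probability of each complete path is found by multiplying along its branches, P(A and B) = P(A) * P(B | A), and the dictionary names are hypothetical.

```python
from fractions import Fraction

# First branch-off: non-conditional probabilities.
p_c = Fraction(4, 10)
p_first = {"C": p_c, "not C": 1 - p_c}

# Second branch-off: conditional probabilities of V given the first stage.
p_v_given = {"C": Fraction(9, 10), "not C": Fraction(3, 10)}

# Multiply along each path of the tree (assumed multiplication rule).
paths = {}
for first, p1 in p_first.items():
    p_v = p_v_given[first]
    paths[(first, "V")] = p1 * p_v            # vacation on this branch
    paths[(first, "not V")] = p1 * (1 - p_v)  # complement on the same branch

for path, p in paths.items():
    print(path, float(p))

# The four paths cover every outcome, so their probabilities sum to 1.
print(sum(paths.values()) == 1)  # True

# Overall chance of the vacation: add the two paths that end in V.
p_vacation = paths[("C", "V")] + paths[("not C", "V")]
print(float(p_vacation))  # 0.54
```

Under these assumptions, the path through C and V has probability 0.4 * 0.9 = 0.36 and the path through not C and V has probability 0.6 * 0.3 = 0.18, which together give 0.54.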
