# Conditional Probability

12:28   |   Jan 21, 2014

### Transcription

• You've tested positive for a rare and deadly cancer that afflicts 1 out of 1000 people,
• based on a test that is 99% accurate. What are the chances that you actually have the
• cancer? By the end of this video, you'll be able to answer this question!
• This video is part of the Probability and Statistics video series. Many natural and
• social phenomena are probabilistic in nature. Engineers, scientists, and policymakers often
• use probability to model and predict system behavior.
• Hi, my name is Sam Watson, and I'm a graduate student in mathematics at MIT.
• Before watching this video, you should be familiar with basic probability vocabulary
• and the definition of conditional probability.
• After watching this video, you'll be able to calculate the conditional probability
• of a given event using tables and trees, and understand how conditional probability can
• be used to interpret medical diagnoses.
• Suppose that in front of you are two bowls, labeled A and B. Each bowl contains five marbles.
• Bowl A has 1 blue and 4 yellow marbles. Bowl B has 3 blue and 2 yellow marbles.
• Now choose a bowl at random and draw a marble uniformly at random from it. Based on your
• existing knowledge of probability, how likely is it that you pick a blue marble? How about
• a yellow marble?
• Out of the 10 marbles you could choose from, 4 are blue. So the probability of choosing a blue
• marble is 4 out of 10.
• There are 6 yellow marbles out of 10 total, so the probability of choosing yellow is 6
• out of 10.
• When the number of possible outcomes is finite, and all outcomes are equally likely, the probability
• of an event happening is the number of favorable outcomes divided by the total number of possible
• outcomes.
• What if you must draw from Bowl A? What's the probability of drawing a blue marble,
• given that you draw from Bowl A?
• Let's go back to the table and consider only Bowl A. Bowl A contains 5 marbles of which
• 1 is blue, so the probability of picking a blue one is 1 in 5.
• Notice the probability has changed. In the first scenario, the sample space consists
• of all 10 marbles, because we are free to draw from both bowls.
• In the second scenario, we are restricted to Bowl A. Our new sample space consists of
• only the five marbles in Bowl A. We ignore the marbles in Bowl B.
• Restricting our attention to a specific set of outcomes changes the sample space, and
• can also change the probability of an event. This new probability is what we call a conditional probability.
• In the previous example, we calculated the conditional probability of drawing a blue
• marble, given that we draw from Bowl A.
• In standard notation, this conditional probability is written P(blue | Bowl A). The vertical bar ( | ) is read
• as "given." The probability we are looking for precedes the bar, and the condition follows
• the bar.
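The two-bowl counts described so far can be checked with a short Python sketch. The dictionary layout and variable names are illustrative assumptions, not something shown in the video. Counting all 10 marbles as equally likely works here because both bowls hold the same number of marbles.

```python
from fractions import Fraction

# Marble counts from the transcript: Bowl A has 1 blue and 4 yellow,
# Bowl B has 3 blue and 2 yellow.
counts = {
    ("A", "blue"): 1, ("A", "yellow"): 4,
    ("B", "blue"): 3, ("B", "yellow"): 2,
}

total = sum(counts.values())  # 10 marbles in the full sample space

# Unconditional probability: blue marbles out of all 10.
p_blue = Fraction(counts[("A", "blue")] + counts[("B", "blue")], total)

# Conditional probability: restrict the sample space to Bowl A only.
bowl_a_total = counts[("A", "blue")] + counts[("A", "yellow")]
p_blue_given_a = Fraction(counts[("A", "blue")], bowl_a_total)

print(p_blue)          # 2/5, i.e. 4 out of 10
print(p_blue_given_a)  # 1/5
```

Restricting the denominator to Bowl A's five marbles is the whole difference between the two results.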
• Now let's flip things around. Suppose someone picks a marble at random from either bowl
• A or bowl B and reveals to you that the marble drawn was blue. What is the probability that
• the blue marble came from Bowl A?
• In other words, what's the conditional probability that the marble was drawn from Bowl A, given
• that it is blue? Pause the video and try to work this out.
• Going back to the table, because we are dealing with the condition that the marble is blue,
• the sample space is restricted to the four blue marbles.
• Of these four blue marbles, one is in Bowl A, and each is equally likely to be drawn.
• Thus, the conditional probability is 1 out of 4.
• Notice that the probability of picking a blue marble given that the marble came from Bowl
• A is NOT equal to the probability that the marble came from Bowl A given that the marble
• was blue. Each has a different condition, so be careful not to mix them up!
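To see how swapping the event and the condition changes the answer, here is a minimal sketch that counts outcomes directly. The helper `conditional` and the predicate names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

# One (bowl, color) pair per marble, using the counts from the transcript.
marbles = (
    [("A", "blue")] * 1 + [("A", "yellow")] * 4
    + [("B", "blue")] * 3 + [("B", "yellow")] * 2
)

def conditional(event, given, outcomes):
    """P(event | given) for equally likely outcomes, by counting."""
    restricted = [o for o in outcomes if given(o)]  # the new sample space
    favorable = [o for o in restricted if event(o)]
    return Fraction(len(favorable), len(restricted))

def is_blue(marble):
    return marble[1] == "blue"

def from_bowl_a(marble):
    return marble[0] == "A"

p_blue_given_a = conditional(is_blue, from_bowl_a, marbles)
p_a_given_blue = conditional(from_bowl_a, is_blue, marbles)

print(p_blue_given_a)  # 1/5
print(p_a_given_blue)  # 1/4
```

The same function gives both answers; only the order of the two predicates changes.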
• We've seen how tables can help us organize our data and visualize changes in the sample
• space.
• Let's look at another tool that is useful for understanding conditional probabilities
• - a tree diagram.
• Suppose we have a jar containing 5 marbles; 2 are blue and 3 are yellow. If we draw any
• one marble at random, the probability of drawing a blue marble is 2/5.
• Now, without replacing the first marble, draw a second marble from the jar. Given that the
• first marble is blue, is the probability of drawing a second blue marble still 2/5?
• NO, it isn't. Our sample space has changed. If a blue marble is drawn first, you are left
• with 4 marbles; 1 blue and 3 yellow.
• In other words, if a blue marble is selected first, the probability that you draw blue
• second is 1/4. And the probability you draw yellow second is 3/4.
• Now pause the video and determine the probabilities if the yellow marble is selected first instead.
• If a yellow marble is selected first, you are left with 2 yellow and 2 blue marbles.
• There is now a 2/4 chance of drawing a blue marble and a 2/4 chance of drawing a yellow
• marble.
• What we have drawn here is called a tree diagram. The probability assigned to the second branch
• denotes the conditional probability given that the first happened.
• Tree diagrams help us to visualize our sample space and reason out probabilities.
• We can answer questions like "What is the probability of drawing 2 blue marbles in a
• row?" In other words, what is the probability of drawing a blue marble first AND a blue
• marble second?
• This event is represented by these two branches in the tree diagram.
• We have a 2/5 chance followed by a 1/4 chance. We multiply these to get 2/20, or 1/10. The
• probability of drawing two blue marbles in a row is 1/10.
• Now you do it. Use the tree diagram to calculate the probabilities of the other possibilities:
• blue, yellow; yellow, blue; and yellow, yellow.
• The probabilities each work out to 3/10. The four probabilities add up to a total of 1,
• as they should.
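The tree diagram for two draws without replacement can be sketched in Python by multiplying along each path. The function name `draw_paths` and its parameters are assumptions made for this example.

```python
from fractions import Fraction

def draw_paths(blue, yellow):
    """Return {(first, second): probability} for two draws without replacement."""
    paths = {}
    total = blue + yellow
    for first, n_first in (("blue", blue), ("yellow", yellow)):
        p_first = Fraction(n_first, total)
        # Remove the first marble before computing the second-branch probabilities.
        remaining = {"blue": blue, "yellow": yellow}
        remaining[first] -= 1
        for second, n_second in remaining.items():
            p_second_given_first = Fraction(n_second, total - 1)
            # Multiply along the path: P(first) * P(second | first).
            paths[(first, second)] = p_first * p_second_given_first
    return paths

paths = draw_paths(blue=2, yellow=3)
for path, p in paths.items():
    print(path, p)
print(sum(paths.values()))  # 1
```

With 2 blue and 3 yellow marbles, the (blue, blue) path comes out to 1/10 and each of the other three paths to 3/10, matching the transcript.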
• What if we don't care about the first marble? We just want to determine the probability
• that the second marble is yellow.
• Because it does not matter whether the first marble is blue or yellow, we consider both
• the "blue, yellow" and the "yellow, yellow" paths. Adding the probabilities gives us 3/10 + 3/10,
• which works out to 3/5.
• Here's another interesting question. What is the probability that the first marble drawn
• is blue, given that the second marble drawn is yellow?
• Intuitively, this seems tricky. Pause the video and reason through the probability tree
• with a friend.
• Because we are conditioning on the event that the second marble drawn is yellow, our sample
• space is restricted to these two paths: (blue, yellow) and (yellow, yellow).
• Of these two paths, only the top one meets our criteria - that the blue marble is drawn
• first.
• We represent the probability as the ratio of the favorable path's probability to the total probability of the possible paths. Hence,
• the probability that the first marble drawn is blue, given that the second marble drawn
• is yellow, is 3/10 divided by (3/10 + 3/10), which works out to 1/2.
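As a cross-check on the tree, the same two answers can be reached by listing every ordered pair of draws and counting. This is a different technique from the tree diagram in the video, and the marble labels (`B1`, `Y1`, and so on) are made up for the sketch.

```python
from fractions import Fraction
from itertools import permutations

# Label each marble so the ordered draws are distinguishable.
marbles = ["B1", "B2", "Y1", "Y2", "Y3"]

# All 20 ordered (first, second) draws without replacement, equally likely.
draws = list(permutations(marbles, 2))

# Condition: the second marble is yellow.
second_yellow = [d for d in draws if d[1].startswith("Y")]

# Within that restricted sample space, keep draws whose first marble is blue.
first_blue = [d for d in second_yellow if d[0].startswith("B")]

p_second_yellow = Fraction(len(second_yellow), len(draws))
p_first_blue_given_second_yellow = Fraction(len(first_blue), len(second_yellow))

print(p_second_yellow)                   # 3/5
print(p_first_blue_given_second_yellow)  # 1/2
```

Of the 20 ordered draws, 12 end in yellow, and 6 of those start with blue, which gives the same 1/2 as the tree.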
• I hope you appreciate that tree diagrams and tables make these types of probability problems
• doable without having to memorize any formulas!
• Let's return to our opening question. Recall that you've tested positive for a cancer that
• afflicts 1 out of 1000 people, based on a test that is 99% accurate.
• More precisely, out of 100 test results, we expect about 99 correct results and only 1
• incorrect result.
• Since the test is highly accurate, you might conclude that the test is unlikely to be wrong,
• and that you most likely have cancer.
• But wait! Let's first use conditional probability to make sense of our seemingly gloomy diagnosis.
• Now pause the video and determine the probability that you have the cancer, given that you test
• positive.
• Let's use a tree diagram to help with our calculations.
• The first branch of the tree represents the likelihood of cancer in the general population.
• The probability of having the rare cancer is 1 in 1000, or 0.001. The probability of
• having no cancer is 0.999.
• Let's extend the tree diagram to illustrate the possible results of the medical test that
• is 99% accurate.
• In the cancer population, 99% will test positive (correctly), but 1% will test negative (incorrectly).
• These incorrect results are called false negatives.
• In the cancer-free population, 99% will test negative (correctly), but 1% will test positive
• (incorrectly). These incorrect results are called false positives.
• Given that you test positive, our sample space is now restricted to only the population that
• test positive. This is represented by these two paths.
• The top path shows the probability you have the cancer AND test positive. The lower path
• shows the probability that you don't have cancer AND still test positive.
• The probability that you actually do have the cancer, given that you test positive, is (0.001 × 0.99) / ((0.001 × 0.99) + (0.999 × 0.01)),
• which works out to about 0.09 - less than 10%!
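The calculation for the medical test can be reproduced with a small Python sketch. The function name `posterior` and the parameter names `sensitivity` and `false_positive_rate` are labels chosen for this example. The transcript simply describes the test as 99% accurate in both groups.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(cancer | positive test), from the two 'positive' paths of the tree."""
    true_positive = prior * sensitivity                 # cancer AND positive
    false_positive = (1 - prior) * false_positive_rate  # no cancer AND positive
    return true_positive / (true_positive + false_positive)

# Values from the transcript: 1 in 1000 prevalence, 99% accurate test.
p = posterior(prior=0.001, sensitivity=0.99, false_positive_rate=0.01)
print(round(p, 4))  # 0.0902
```

In counts, out of 100,000 people about 100 have the cancer and 99 of them test positive, while 999 of the 99,900 cancer-free people also test positive. That makes 99 true positives out of 1,098 positive results.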
• The error rate of the test is only 1 percent, but the chance that a positive result is wrong is more than
• 90%! Chances are pretty good that you do not actually have cancer, despite the rather accurate
• test. Why is this so?
• The accuracy of the test actually reflects the conditional probability that one tests
• positive, given that one has cancer.
• But in practice, what you want to know is the conditional probability that you have
• cancer, given that you test positive! These probabilities are NOT the same!
• Whenever we take medical tests, or perform experiments, it is important to understand
• what events our results are conditioned on, and how that might affect the accuracy of
• our conclusions.
• In this video, you've seen that conditional probability must be used to understand and
• predict the outcomes of many events. You've also learned to evaluate and manage conditional
• probabilities using tables and trees.
• We hope that you will now think more carefully about the probabilities you encounter, and
• consider how conditioning affects their interpretation.

### Description

MIT RES.TLL-004 Concept Vignettes
View the complete course: http://ocw.mit.edu/RES-TLL-004F13
Instructor: Sam Watson

This video provides an introduction to conditional probability and its calculations, as well as how it can be used to interpret medical diagnoses.