# 6. Monte Carlo Simulation

May 19, 2017

### Transcription

• The following content is provided under a Creative Commons license.
• Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free.
• To make a donation or to view additional materials
• from hundreds of MIT courses, visit MIT OpenCourseWare
• at ocw.mit.edu.
• JOHN GUTTAG: Welcome to Lecture 6.
• As usual, I want to start by posting some relevant reading.
• For those who don't know, this lovely picture
• is of the Casino at Monte Carlo, and shortly you'll
• see why we're talking about casinos and gambling today.
• Not because I want to encourage you to gamble your life
• savings away.
• A little history about Monte Carlo simulation,
• which is the topic of today's lecture.
• The concept was invented by the Polish American mathematician,
• Stanislaw Ulam.
• Probably more well known for his work on thermonuclear weapons
• than on mathematics, but he did do
• a lot of very important mathematics
• earlier in his life.
• The story here starts that he was ill,
• recovering from some serious illness,
• and was home and was bored and was
• playing a lot of games of solitaire, a game I
• suspect you've all played.
• Being a mathematician, he naturally wondered,
• what's the probability of my winning this stupid game which
• I keep losing?
• And so he actually spent quite a lot of time
• trying to work out the combinatorics,
• so that he could actually compute the probability.
• And despite being a really amazing mathematician,
• he failed.
• The combinatorics were just too complicated.
• So he thought, well suppose I just play lots of hands
• and count the number I win, divide by the number
• of hands I played.
• Well then he thought about it and said,
• well, I've already played a lot of hands and I haven't won yet.
• So it probably will take me years
• to play enough hands to actually get a good estimate,
• and I don't want to do that.
• So he said, well, suppose instead of playing the game,
• I just simulate the game on a computer.
• He had no idea how to use a computer,
• but he had friends in high places.
• And actually talked to John von Neumann,
• who is often viewed as the inventor of the stored program
• computer.
• And said, John, could you do this on your fancy new ENIAC
• machine?
• And on the lower right here, you'll
• see a picture of the ENIAC.
• It was a very large machine.
• It filled a room.
• And von Neumann said, sure, we could probably
• do it in only a few hours of computation.
• Today we would think of a few microseconds,
• but those machines were slow.
• Hence was born Monte Carlo simulation,
• and then they actually used it in the design of the hydrogen
• bomb.
• So it turned out to be not just useful for cards.
• So what is Monte Carlo simulation?
• It's a method of estimating the values
• of an unknown quantity using what is
• called inferential statistics.
• And we've been using inferential statistics
• for the last several lectures.
• The key concepts-- and I want to be careful about these things,
• because we'll be coming back to them--
• are the population.
• So think of the population as the universe
• of possible examples.
• So in the case of solitaire, it's
• a universe of all possible games of solitaire
• that you could possibly play.
• I have no idea how big that is, but it's really big.
• Then we take that universe, that population,
• and we sample it by drawing a proper subset.
• Proper means not the whole thing.
• Usually more than one sample to be useful.
• Certainly more than 0.
• And then we make an inference about the population
• based upon some set of statistics we do on the sample.
• So the population is typically a very large set of examples,
• and the sample is a smaller set of examples.
• And the key fact that makes them work
• is that if we choose the sample at random,
• the sample will tend to exhibit the same properties
• as the population from which it is drawn.
• And that's exactly what we did with the random walk, right?
• There were a very large number of different random walks
• you could take of say, 10,000 steps.
• We didn't look at all possible random walks of 10,000 steps.
• We drew a small sample of, say 100 such walks,
• computed the mean of those 100, and said,
• we think that's probably a good expectation
• of what the mean would be of all the possible walks of 10,000
• steps.
• So we were depending upon this principle.
• And of course the key fact here is that the sample
• has to be random.
• If you start drawing the sample and it's not random,
• then there's no reason to expect it
• to have the same properties as that of the population.
• And we'll go on throughout the term,
• and talk about the various ways you can get fooled and think
• you have a random sample when in fact you don't.
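The sampling idea described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population (100,000 uniform values), the sample size, and the helper name `sample_mean_estimate` are all hypothetical and do not come from the lecture.

```python
import random

def sample_mean_estimate(population, sample_size, rng):
    """Estimate the population mean from a simple random sample."""
    sample = rng.sample(population, sample_size)  # drawn at random, without replacement
    return sum(sample) / len(sample)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 values drawn uniformly from [0, 100).
population = [rng.uniform(0, 100) for _ in range(100_000)]
true_mean = sum(population) / len(population)

# A random sample of 1,000 items, which is 1% of the population.
estimate = sample_mean_estimate(population, 1_000, rng)
print(f"population mean = {true_mean:.2f}, sample mean = {estimate:.2f}")
```

Because the sample is drawn at random, its mean should land close to the population mean; a non-random choice (say, the 1,000 largest values) would carry no such expectation.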
• All right, let's look at a very simple example.
• People like to use flipping coins because coins are easy.
• So let's assume we have some coin.
• All right, so I brought two coins slightly larger
• than the usual coin.
• And I can flip it.
• Flip it once, and let's consider one flip,
• and let's assume it came out heads.
• I have to say the coin I flipped is not actually a \$20 gold
• piece, in case any of you were thinking of stealing it.
• All right, so we've got one flip, and it came up heads.
• And now I can ask you the question--
• if I were to flip the same coin an infinite number of times,
• how confident would you be that all infinite flips would be heads?
• Or even if I were to flip it once more,
• how confident would you be that the next flip would be heads?
• And the answer is not very.
• Well, suppose I flip the coin twice,
• and both times it came up heads.
• And I'll ask you the same question--
• do you think that the next flip is likely to be heads?
• Well, maybe you would be more inclined to say yes
• than having only seen one flip, but you wouldn't really be sure.
• On the other hand, if I flipped it 100 times and all 100 flips
• came up heads, well, you might be suspicious
• that my coin only has a head on both sides, for example.
• Or is weighted in some funny way that it mostly comes up heads.
• And so a lot of people, maybe even me, if you said,
• I flipped it 100 times and it came up heads.
• What do you think the next one will be?
• My best guess would be probably heads.
• So here I've simulated 100 flips,
• and we have 50 heads here, two heads here, and 48 tails.
• And now if I said, do you think that the probability
• of the next flip coming up heads--
• is it 52 out of 100?
• Well, if you had to guess, that should be the guess you make.
• Based upon the available evidence,
• that's the best guess you should probably make.
• You have no reason to believe it's a fair coin.
• It could well be weighted.
• We don't see it with coins, but we see weighted dice
• all the time.
• We shouldn't, but they exist.
• You can buy them on the internet.
• So typically our best guess is what we've seen,
• but we really shouldn't have very much confidence
• in that guess.
• Because well, could've just been an accident.
• Highly unlikely even if the coin is fair
• that you'd get 50-50, right?
• So why, when we see 100 samples and they all come up heads,
• do we feel better about guessing heads than we did when we saw two samples?
• And why don't we feel so good about guessing 52 out of 100
• when we've seen a hundred flips that came out 52 and 48?
• And the answer is something called variance.
• I got the same answer all the time.
• And so there was no variability, and that intuitively--
• and in fact, mathematically-- should make us feel confident
• that, OK, maybe that's really the way the world is.
• On the other hand, when almost half are heads and almost half
• are tails, there's a lot of variance.
• Right, it's hard to predict what the next one will be.
• And so we should have very little confidence
• that it isn't an accident that it happened
• to be 52-48 in one direction.
• So as the variance grows, we need larger samples
• to have the same amount of confidence.
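A small experiment can make the link between variance and sample size concrete. The sketch below is not from the lecture; the helper names and the choice of 20 repeated experiments are assumptions. It compares a two-headed coin (no variance) with a fair coin (maximum variance) and shows how widely the estimated probability of heads ranges at each sample size.

```python
import random

def fraction_heads(num_flips, p_heads, rng):
    """Flip a coin num_flips times and return the observed fraction of heads."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < p_heads)
    return heads / num_flips

def spread_of_estimates(num_flips, p_heads, num_trials, rng):
    """Return (min, max) of the estimated P(heads) over repeated experiments."""
    estimates = [fraction_heads(num_flips, p_heads, rng) for _ in range(num_trials)]
    return min(estimates), max(estimates)

rng = random.Random(0)
for p in (1.0, 0.5):            # a two-headed coin versus a fair coin
    for n in (2, 100, 10_000):  # flips per experiment
        low, high = spread_of_estimates(n, p, 20, rng)
        print(f"p={p}, flips={n}: estimates range from {low:.3f} to {high:.3f}")
```

With the two-headed coin every experiment gives the same estimate regardless of size, while the fair coin needs many more flips before the estimates cluster tightly.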
• All right, let's look at that with a detailed example.
• We'll look at roulette in keeping with the theme of Monte
• Carlo simulation.
• This is a roulette wheel that could well be at Monte Carlo.
• There's no need to simulate roulette, by the way.
• It's a very simple game, but as we've
• seen with our earlier examples, it's
• nice when we're learning about simulations to simulate things
• where we actually can know what the actual answer is
• so that we can then understand our simulation better.
• For those of you who don't know how roulette is played--
• is there anyone here who doesn't know how roulette is played?
• Good for you.
• You grew up virtuous.
• All right, so-- well all right.
• Maybe I won't go there.
• So you have a wheel that spins around,
• and in the middle are a bunch of pockets.
• Each pocket has a number and a color.
• You bet in advance on what number
• you think is going to come up, or what color you
• think is going to come up.
• Then somebody drops a ball in that wheel, gives it a spin.
• And through centrifugal force, the ball
• stays on the outside for a while.
• But as the wheel slows down, the ball heads towards the middle,
• and eventually settles in one of those pockets.
• And you win or you lose.
• Now you can bet on it, and so let's look
• at an example of that.
• So here is a roulette game.
• I've called it fair roulette, because it's
• set up in such a way that in principle, if you bet,
• your expected value should be 0.
• You'll win some, you'll lose some,
• but it's fair in the sense that it's not either
• a negative or positive sum game.
• So as always, we have an underbar underbar init.
• Well we're setting up the wheel with 36 pockets on it,
• so you can bet on the numbers 1 through 36.
• That's the way range works, you'll recall.
• Initially, we don't know where the ball is,
• so we'll say it's None.
• And here's the key thing is, if you make a bet,
• this tells you what your odds are.
• That if you bet on a pocket and you win,
• you get len of pockets minus 1.
• So this is why it's a fair game, right?
• You bet \$1.
• If you win, you get \$36, your dollar plus \$35 back.
• If you lose, you lose.
• All right, self dot spin will be random dot
• choice among the pockets.
• And then there is simply bet, where you just
• can choose an amount to bet and the pocket you want to bet on.
• I've simplified it.
• I'm not allowing you to bet here on colors.
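The class described above might look roughly like the following. This is a reconstruction from the spoken description, not the code on the slide: the attribute and method names (`pockets`, `ball`, `pocket_odds`, `bet_pocket`) are assumptions, and so is the choice that a losing bet returns minus the stake, which is what makes the expected value zero.

```python
import random

class FairRoulette:
    """A 36-pocket wheel with no house edge (a sketch, names assumed)."""

    def __init__(self):
        self.pockets = list(range(1, 37))          # range stops before 37, so 1..36
        self.ball = None                           # unknown until the first spin
        self.pocket_odds = len(self.pockets) - 1   # a winning bet pays 35 to 1

    def spin(self):
        self.ball = random.choice(self.pockets)

    def bet_pocket(self, pocket, amount):
        """Return the net gain: the winnings on a hit, minus the stake on a miss."""
        if pocket == self.ball:
            return amount * self.pocket_odds
        return -amount

    def __str__(self):
        return "Fair Roulette"

game = FairRoulette()
game.spin()
print(f"ball landed on {game.ball}; betting 1 on pocket 2 nets {game.bet_pocket(2, 1)}")
```

With 36 pockets, a $1 bet wins $35 with probability 1/36 and loses $1 with probability 35/36, so the two sides cancel.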
• All right, so then we can play it.
• So here is play roulette.
• I've made game, the class, a parameter,
• because later we'll look at other kinds of roulette games.
• You tell it how many spins.
• What pocket you want to bet on.
• For simplicity, I'm going to bet on this same pocket
• all the time.
• Pick your favorite lucky number and how much you want to bet,
• and then we'll have a simulation just like the ones
• we've already looked at.
• So the number you get right starts at 0.
• For i in range number of spins, we'll do a spin.
• And then tot pocket plus equals game dot bet pocket.
• And it will come back either minus your bet if you've lost,
• or 35 times your bet if you've won.
• And then we'll just print the results.
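A minimal sketch of the simulation loop just described follows. Function and parameter names are assumptions, the wheel class is a compact stand-in for the one described earlier, and a losing bet is assumed to cost the stake. The demo uses 100 and 100,000 spins to keep the run short; the lecture's run uses a million.

```python
import random

class FairRoulette:
    """Compact stand-in for the 36-pocket wheel (names assumed)."""
    def __init__(self):
        self.pockets = list(range(1, 37))
        self.ball = None
        self.pocket_odds = len(self.pockets) - 1
    def spin(self):
        self.ball = random.choice(self.pockets)
    def bet_pocket(self, pocket, amount):
        return amount * self.pocket_odds if pocket == self.ball else -amount
    def __str__(self):
        return "Fair Roulette"

def play_roulette(game, num_spins, pocket, bet, to_print=True):
    """Bet the same amount on the same pocket every spin; return the average return."""
    tot_pocket = 0
    for _ in range(num_spins):
        game.spin()
        tot_pocket += game.bet_pocket(pocket, bet)
    avg_return = tot_pocket / (num_spins * bet)   # net gain per unit wagered
    if to_print:
        print(f"{num_spins} spins of {game}: "
              f"return betting {pocket} = {100 * avg_return:.2f}%")
    return avg_return

random.seed(0)  # fixed seed so repeated runs print the same numbers
for num_spins in (100, 100_000):
    for _ in range(3):
        play_roulette(FairRoulette(), num_spins, 2, 1)
```

The short runs can swing anywhere from losing everything to a sizable gain, while the long runs stay near zero.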
• So we can do it.
• In fact, let's run it.
• So here it is.
• I guess I'm doing a million games here, so quite a few.
• Actually I'm going to do two.
• What happens when you spin it 100 times?
• What happens when you spin it a million times?
• And we'll see what we get.
• So what we see here is that we do 100 spins.
• The first time I did it my expected return was minus 100%.
• I lost everything I bet.
• Not so unlikely, given that the odds
• are pretty long that you could do 100 times without winning.
• Next time I did a 100, my return was a positive 44%, and then
• a positive 28%.
• So you can see, for 100 spins it's highly variable what
• the expected return is.
• That's one of the things that makes
• gambling attractive to people.
• If you go to a casino, 100 spins would be a pretty long night
• at the table.
• And maybe you'd won 44%, and you'd
• feel pretty good about it.
• What about a million spins?
• Well people aren't interested in that, but the casino is, right?
• They don't really care what happens with 100 spins.
• They care what happens with a million spins.
• What happens when everybody comes every night to play.
• And there what we see is--
• you'll notice much less variance.
• Happens to be minus 0.04, plus 0.6, plus 0.79.
• So it's still not 0, but it's certainly,
• these are all closer to 0 than any of these are.
• We know it should be 0, but it doesn't
• happen to be in these examples.
• But not only are they closer to 0, they're closer together.
• There is much less variance in the results, right?
• So here I show you these three numbers,
• and ask what do you expect to happen?
• You have no clue, right?
• So I don't know, maybe I'll win a lot.
• Maybe I'll lose everything.
• I show you these three numbers, you're going to look at it
• and say, well you know, I'm going
• to be somewhere between around 0 and maybe 1%.
• But you're never going to guess it's
• going to be radically different from that.
• And if I were to change this number to be even higher,
• it would go even closer to 0.
• But we won't bother.
• OK, so these are the numbers we just
• looked at, because I set the seed to be the same.
• So what's going on here is something
• called the law of large numbers, or sometimes Bernoulli's law.
• This is a picture of Bernoulli on the stamp.
• It's one of the two most important theorems in all
• of statistics, and we'll come to the second most important
• theorem in the next lecture.
• Here it says, "in repeated independent tests
• with the same actual probability p of a particular outcome in each test, the chance
• that the fraction of times that outcome occurs differs
• from p converges to 0 as the number of trials
• goes to infinity."
• So this says if I were to spin this fair roulette
• wheel an infinite number of times,
• the expected-- the return would be 0.
• The real true probability from the mathematics.
• Well, infinite is a lot, but a million
• is getting closer to infinite.
• And what this says is the closer I get to infinite,
• the closer it will be to the true probability.
• So that's why we did better with a million than with a hundred.
• And if I did a 100 million, we'd do way better
• than I did with a million.
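In symbols, one common way to write this statement is shown below. The notation is an assumption of this note rather than something from the slide: $X_i$ is 1 if the outcome occurs on trial $i$ and 0 otherwise, $p$ is the true probability, and $\varepsilon$ is any positive tolerance.

```latex
\lim_{n \to \infty} \Pr\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - p \right| > \varepsilon \right) = 0
\qquad \text{for every } \varepsilon > 0
```

Note that the law speaks about the fraction of outcomes over many trials; it makes no promise about any individual future trial.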
• I want to take a minute to talk about a way this law is
• often misunderstood.
• This is something called the gambler's fallacy.
• And all you have to do is say, let's
• go watch a sporting event.
• And you'll watch a batter strike out
• for the sixth consecutive time.
• The next time they come to the plate,
• the idiot announcer says, well he struck out six times
• in a row.
• He's due for a hit this time, because he's usually
• a pretty good hitter.
• Well that's nonsense.
• It says, people somehow believe that if deviations
• from expected occur, they'll be evened out in the future.
• And we'll see something similar to this that is true,
• but this is not true.
• And there is a great story about it.
• This is told in a book by [INAUDIBLE] and [INAUDIBLE].
• And this truly happened in Monte Carlo, with Roulette.
• And you could either bet on black or red.
• Black came up 26 times in a row.
• Highly unlikely, right?
• 2 to the 26th is a giant number.
• And what happened is, word got out on the casino floor
• that black had kept coming up way too often.
• And people more or less panicked to rush to the table
• to bet on red, saying, well it can't keep coming up black.
• Surely the next one will be red.
• And as it happened when the casino totaled up its winnings,
• it was a record night for the casino.
• Millions of francs got bet, because people were
• sure it would have to even out.
• Well if we think about it, probability
• of 26 consecutive reds is that.
• A pretty small number.
• But the probability of 26 consecutive reds
• when the previous 25 rolls were red is what?
• No, that.
• AUDIENCE: Oh, I thought you meant [INAUDIBLE].
• JOHN GUTTAG: No, if you had 25 reds and then
• you spun the wheel once more, the probability
• of it having 26 reds is now 0.5, because these
• are independent events.
• Unless of course the wheel is rigged, and we're assuming
• it's not.
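The independence argument can be checked numerically. The sketch below assumes an idealized wheel with only red and black pockets, each with probability 0.5; the function name and the streak length of 5 are illustrative choices, not from the lecture.

```python
import random

p_red = 0.5  # idealized wheel: red and black only, equally likely

# Probability of 26 reds in a row before any spin has happened (about 1.5e-8).
print(f"P(26 reds in a row) = {p_red ** 26:.3e}")

def red_after_streak(streak_len, num_spins, rng):
    """Fraction of spins that come up red right after at least streak_len reds."""
    run = 0             # length of the current run of consecutive reds
    followers = 0       # spins that immediately followed a long-enough run
    red_followers = 0   # how many of those spins were red
    for _ in range(num_spins):
        is_red = rng.random() < p_red
        if run >= streak_len:
            followers += 1
            red_followers += is_red
        run = run + 1 if is_red else 0
    return red_followers / followers

frac = red_after_streak(5, 1_000_000, random.Random(0))
print(f"P(red | previous 5 spins were red) is roughly {frac:.3f}")
```

The long streak is rare, but once it has happened the next spin is still a coin toss, so the second number should sit near 0.5.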
• People have a hard time accepting this,
• and I know it seems funny.
• But I guarantee there will be some point in the next month
• or so when you will find yourself thinking this way,
• that something has to even out.
• I did so badly on the midterm, I will
• have to do better on the final.
• That was mean, I'm sorry.
• All right, speaking of means--
• see?
• Professor [? Grimm's ?] not the only one
• who can make bad jokes.
• There is something-- it's not the gambler's fallacy--
• that's often confused with it, and that's
• called regression to the mean.
• This term was coined in 1885 by Francis Galton
• in a paper, a page of which I've shown you here.
• And the basic conclusion here was--
• what this table says is if somebody's parents are
• both taller than average, it's likely
• that the child will be smaller than the parents.
• Conversely, if the parents are shorter than average,
• it's likely that the child will be taller than the parents.
• That's not what he did.
• He just looked at a bunch of data,
• and the data actually supported this.
• And this led him to this notion of regression to the mean.
• And here's what it is, and here's
• the way in which it is subtly different from the gambler's
• fallacy.
• What he said here is, following an extreme event--
• parents being unusually tall--
• the next random event is likely to be less extreme.
• He didn't know much about genetics,
• and he kind of assumed the height of people was random.
• But we'll ignore that.
• OK, but the idea is here that it will be less extreme.
• So let's look at it in roulette.
• If I spin a fair roulette wheel 10 times and get 10 reds,
• that's an extreme event.
• Right, that has a probability of basically 1 over 1,024.
• Now the gambler's fallacy says, if I
• were to spin it another 10 times,
• it would need to even out.
• As in I should get more blacks than you would usually
• get to make up for these excess reds.
• What regression to the mean says is different.
• It says, it's likely that in the next 10 spins,
• you will get fewer than 10 reds.
• You will get a less extreme event.
• Now it doesn't have to be 10.
• If I'd gotten 7 reds instead of 5, you'd consider that extreme,
• and you would bet that the next 10 would have fewer than 7.
• But you wouldn't bet that it would have fewer than 5.
• Because of this, if you now look at the average of the 20 spins,
• it will be closer to the mean of 50% reds
• than you got from the extreme first spins.
• So that's why it's called regression to the mean.
• The more samples you take, the more likely
• you'll get to the mean.
• Yes?
• AUDIENCE: So, roulette wheel spins
• are supposed to be independent.
• JOHN GUTTAG: Yes.
• AUDIENCE: So it seems like the second 10--
• JOHN GUTTAG: Pardon?
• AUDIENCE: It seems like the second 10 times
• that you spin it,
• that shouldn't have to [INAUDIBLE].
• JOHN GUTTAG: Has nothing to do with the first one.
• AUDIENCE: But you said it's likely [INAUDIBLE].
• JOHN GUTTAG: Right, because you have an extreme event, which
• was unlikely.
• And now if you have another event,
• it's likely to be closer to the average
• than the extreme was to the average.
• Precisely because it is independent.
• That makes sense to everybody?
• Yeah?
• AUDIENCE: Isn't that the same as the gambler's fallacy, then?
• By saying that, because this was super unlikely,
• the next one [INAUDIBLE].
• JOHN GUTTAG: No, the gambler's fallacy here--
• and it's a good question, and indeed people often
• do get these things confused.
• The gambler's fallacy would say that the second 10
• spins would--
• we would expect to have fewer than 5 reds,
• because you're trying to even out the unusual number of reds
• in the first 10 spins.
• Whereas here we're not saying we would have fewer than 5.
• We're saying we'd probably have fewer than 10.
• That it'll be closer to the mean,
• not that it would be below the mean.
• Whereas the gambler's fallacy would say
• it should be below that mean to quote, even out, the first 10.
• Does that make sense?
• OK, great questions.
• Thank you.
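The difference between the two ideas in this exchange can be seen in a simulation. The sketch below is an illustration under assumptions: a red/black-only wheel with probability 0.5 each, "extreme" defined as 9 or more reds in 10 spins, and hypothetical helper names.

```python
import random

def reds_in(num_spins, rng):
    """Number of reds in num_spins spins of an idealized red/black wheel."""
    return sum(rng.random() < 0.5 for _ in range(num_spins))

def second_block_after_extreme(threshold, num_pairs, rng):
    """Mean reds in the second 10 spins, given at least `threshold` reds in the first 10."""
    second_counts = []
    for _ in range(num_pairs):
        first = reds_in(10, rng)
        second = reds_in(10, rng)
        if first >= threshold:          # keep only pairs with an extreme first block
            second_counts.append(second)
    return sum(second_counts) / len(second_counts), len(second_counts)

avg_second, num_extreme = second_block_after_extreme(9, 100_000, random.Random(0))
print(f"{num_extreme} extreme first blocks; "
      f"mean reds in the following 10 spins = {avg_second:.2f}")
```

Regression to the mean predicts the second block averages about 5 reds, which is less extreme than 9 or 10. The gambler's fallacy would predict an average below 5 to "make up" for the first block, and the simulation should not show that.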
• All right, now you may not know this,
• but casinos are not in the business of being fair.
• And the way they manage that in Europe is that
• the pockets are not all red and black.
• They sneak in one green.
• And so now if you bet red, well sometimes
• it isn't always red or black.
• And furthermore, there is this 0.
• They index from 0 rather than from one, and so
• you don't get a full payoff.
• In American roulette, they manage to sneak in two greens.
• They have a 0 and a double 0.
• Tilting the odds even more in favor of the casino.
• So we can do that in our simulation.
• We'll look at European roulette as a subclass of fair roulette.
• I've just added this extra pocket, 0.
• And notice I have not changed the odds.
• So what you get if you get your number is no higher,
• but you're a little bit less likely to get it
• because we snuck in that 0.
• Then American roulette is a subclass of European roulette
• in which I add yet another pocket.
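The subclassing described here could be sketched as follows. Class and attribute names are assumptions, and the base class is a compact stand-in for the fair wheel. The key point the sketch preserves is that the payout is fixed at 35 to 1 before the extra pockets are added.

```python
import random

class FairRoulette:
    """Compact stand-in for the 36-pocket wheel (names assumed)."""
    def __init__(self):
        self.pockets = list(range(1, 37))
        self.ball = None
        self.pocket_odds = len(self.pockets) - 1   # 35, computed from 36 pockets
    def spin(self):
        self.ball = random.choice(self.pockets)
    def bet_pocket(self, pocket, amount):
        return amount * self.pocket_odds if pocket == self.ball else -amount
    def __str__(self):
        return "Fair Roulette"

class EuRoulette(FairRoulette):
    def __init__(self):
        super().__init__()          # odds are set to 35 here...
        self.pockets.append("0")    # ...and the extra pocket is added afterwards
    def __str__(self):
        return "European Roulette"

class AmRoulette(EuRoulette):
    def __init__(self):
        super().__init__()
        self.pockets.append("00")   # a second green pocket
    def __str__(self):
        return "American Roulette"

for game in (FairRoulette(), EuRoulette(), AmRoulette()):
    print(f"{game}: {len(game.pockets)} pockets, pays {game.pocket_odds} to 1")
```

Because each subclass only appends a pocket, the `spin` and `bet_pocket` methods are inherited unchanged.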
• All right, we can simulate those.
• Again, nice thing about simulations,
• we can play these games.
• So I've simulated 20 trials of 1,000 spins, 10,000 spins,
• 100,000, and a million.
• And what do we see as we look at this?
• Well, right away we can see that fair roulette is usually
• a much better bet than either of the other two.
• That even with only 1,000 spins the return is negative.
• And as we do more and more spins, up to a million,
• it starts to look much closer to 0.
• And these, we have reason to believe at least,
• are much closer to true expectation
• saying that, while you break even in fair roulette,
• you'll lose 2.7% in Europe and over 5% in Las Vegas,
• or soon in Massachusetts.
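The 2.7% and "over 5%" figures can be derived exactly rather than simulated. The sketch below assumes a single-pocket bet that pays 35 to 1 and costs the stake on a loss; the function name is hypothetical.

```python
from fractions import Fraction

def expected_return(num_pockets, payout=35):
    """Exact expected net gain per unit bet on a single pocket."""
    p_win = Fraction(1, num_pockets)
    return p_win * payout - (1 - p_win) * 1   # win the payout, or lose the stake

for name, n in (("fair", 36), ("European", 37), ("American", 38)):
    r = expected_return(n)
    print(f"{name} ({n} pockets): {r} = {float(r):.2%}")
```

The results are 0 for the fair wheel, -1/37 (about -2.70%) for the European wheel, and -1/19 (about -5.26%) for the American wheel, which matches the values the simulation approaches.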
• All right, we're sampling, right?
• That's why the results will change,
• and if I ran a different simulation
• with a different seed I'd get different numbers.
• Whenever you're sampling, you can't be guaranteed
• to get perfect accuracy.
• It's always possible you get a weird sample.
• That's not to say that you won't get exactly the right answer.
• I might have spun the wheel twice
• and happened to get the exact right answer of the return.
• Actually not twice, because the math
• doesn't work out, but 35 times and gotten exactly the right answer.
• But that's not the point.
• We need to be able to differentiate
• between what happens to be true and what we actually know,
• in a rigorous sense, is true.
• Or maybe don't know it, but have real good reason
• to believe it's true.
• So it's not just a question of faith.
• And that gets us to what's in some sense
• the fundamental question of all computational statistics,
• is how many samples do we need to look
• at before we can have real, justifiable confidence in our answer?
• As we've just seen--
• not just, a few minutes ago-- with the coins,
• our intuition tells us that it depends
• upon the variability in the underlying possibilities.
• So let's look at that more carefully.
• We have to look at the variation in the data.
• So let's look at first something called variance.
• So this is variance of x.
• Think of x as just a list of data examples, data items.
• And for the variance, we first compute the average
• value, that's mu.
• So mu is for the mean.
• For each little x in big X, we compute the difference
• between that and the mean.
• How far is it from the mean?
• And we square the difference, and then we just sum them.
• So this takes, how far is everything from the mean?
• We just add them all up.
• And then we end up dividing by the size of the set,
• the number of examples.
• Why do we have to do this division?
• Well, because we don't want to say something has high variance
• just because it has many members, right?
• So this sort of normalizes it by the number of members,
• and this just sums how different the members are from the mean.
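Written out, the definition being described is the following, where $X$ is the collection of data items, $|X|$ is its size, and $\mu$ is its mean. The symbols are chosen here to match the spoken description.

```latex
\mu = \frac{1}{|X|} \sum_{x \in X} x,
\qquad
\operatorname{variance}(X) = \frac{\sum_{x \in X} (x - \mu)^2}{|X|}
```

If every member of $X$ equals the mean, each term in the sum is zero, so the variance is zero.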
• So if everything is the same value,
• what's the variance going to be?
• If I have a set of 1,000 6's, what's the variance?
• Yes?
• AUDIENCE: 0.
• JOHN GUTTAG: 0.
• You think this is going to be hard, but I came prepared.
• I was hoping this would happen.
• Look out, I don't know where this is going to go.
• [FIRES SLINGSHOT]
• AUDIENCE: [LAUGHTER]
• JOHN GUTTAG: All right, maybe it isn't the best technology.
• I'll go home and practice.
• And then the thing you're more familiar
• with is the standard deviation.
• And if you look at what the standard deviation is,
• it's simply the square root of the variance.
• Now, let's understand this a little bit
• and first ask, why am I squaring this here,
• especially because later on I'm just going
• to take a square root anyway?
• Well squaring it has one virtue, which
• is that it means I don't care whether the difference is
• positive or negative.
• And I shouldn't, right?
• I don't care which side of the mean it's on,
• I just care it's not near the mean.
• But if that's all I wanted to do I
• could take the absolute value.
• The other thing we see with squaring
• is it gives the outliers extra emphasis, because I'm
• squaring that distance.
• Now you can think that's good or bad,
• but it's worth knowing it's a fact.
• The more important thing to think about
• is standard deviation all by itself is a meaningless number.
• You always have to think about it in the context of the mean.
• If I tell you the standard deviation is 100,
• you then say, well-- and I ask you whether it's big or small,
• you have no idea.
• If the mean is 100 and the standard deviation is 100,
• it's pretty big.
• If the mean is a billion and the standard deviation is 100,
• it's pretty small.
• So you should never want to look at just the standard deviation.
• All right, here is just some code
• to compute those, easy enough.
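A minimal sketch of such code, assuming the population form of the variance described earlier (divide by the number of items); the function name is an assumption:

```python
def get_mean_and_std(X):
    """Return the mean and the (population) standard deviation of a list of numbers."""
    mean = sum(X) / len(X)
    tot = 0.0
    for x in X:
        tot += (x - mean) ** 2       # squared distance from the mean
    std = (tot / len(X)) ** 0.5      # square root of the variance
    return mean, std

print(get_mean_and_std([6] * 1000))   # (6.0, 0.0)
print(get_mean_and_std([1, 2, 3, 4]))
```

The first call confirms that 1,000 identical values have zero spread. The second gives a mean of 2.5 and a standard deviation that is the square root of 1.25.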
• Why am I doing this?
• Because we're now getting to the punch line.
• We often try and estimate values just by giving the mean.
• So we might report on an exam that the mean grade was 80.
• It's better instead of trying to describe
• an unknown value by it--
• an unknown parameter by a single value,
• say the expected return on betting a roulette wheel,
• to provide a confidence interval.
• So what a confidence interval is is
• a range that's likely to contain the unknown value,
• and a confidence that the unknown value is
• within that range.
• So I might say on a fair roulette
• wheel I expect that your return will be between minus 1%
• and plus 1%, and I expect that to be true 95% of the time
• you play the game if you play 100 rolls, spins.
• If you take 100 spins of the roulette wheel,
• I expect that 95% of the time your return
• will be between this and that.
• So here, we're saying the return on betting a pocket 10 times,
• 10,000 times in European roulette is minus 3.3%.
• I think that was the number we just saw.
• And now I'm going to add to that this margin of error,
• which is plus or minus 3.5% with a 95% level of confidence.
• What does this mean?
• If I were to conduct an infinite number of trials
• of 10,000 bets each, my expected average return
• would indeed be minus 3.3%, and it
• would be between these values 95% of the time.
• I've just subtracted and added this 3.5,
• saying nothing about what would happen
• in the other 5% of the time.
• How far away I might be from this,
• this is totally silent on that subject.
• Yes?
• AUDIENCE: I think you want 0.2 not 9.2.
• JOHN GUTTAG: Oh, let's see.
• Yep, I do.
• Thank you.
• We'll fix it on the spot.
• This is why you have to come to lecture
• rather than just reading the slides,
• because I make mistakes.
• Thank you, Eric.
• All right, so it's telling me that, and that's all it means.
• And it's amazing how often people don't quite
• know what this means.
• For example, when they look at a political poll
• and they see how many votes somebody is expected to get.
• And they see this confidence interval and say,
• what does that really mean?
• Most people don't know.
• But it does have a very precise meaning, and this is it.
• How do we compute confidence intervals?
• Most of the time we compute them using something
• called the empirical rule.
• Under some assumptions, which I'll get to a little bit later,
• the empirical rule says that if I take the data, find the mean,
• compute the standard deviation as we've just seen,
• 68% of the data will be within one standard deviation in front
• of or behind the mean.
• Within one standard deviation of the mean.
• 95% will be within 1.96 standard deviations.
• And that's what people usually use.
• Usually when people talk about confidence intervals,
• they're talking about the 95% confidence interval.
• And they use this 1.96 number.
• And 99.7% of the data will be within three
• standard deviations.
• So you can see if you are outside the third standard
• deviation, you are a pretty rare bird,
• for better or worse depending upon which side.
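The three percentages can be checked empirically by drawing from a normal distribution. This is a sketch with assumed parameters (mean 0, standard deviation 1, 100,000 samples) and a hypothetical helper name.

```python
import random

def fraction_within(samples, mean, std, num_stds):
    """Fraction of samples lying within num_stds standard deviations of the mean."""
    inside = sum(1 for s in samples if abs(s - mean) <= num_stds * std)
    return inside / len(samples)

rng = random.Random(0)
mu, sigma = 0.0, 1.0   # assumed parameters for the illustration
samples = [rng.gauss(mu, sigma) for _ in range(100_000)]

for k in (1, 1.96, 3):
    print(f"within {k} standard deviations: {fraction_within(samples, mu, sigma, k):.3f}")
```

The printed fractions should be close to 0.68, 0.95, and 0.997.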
• All right, so let's apply the empirical rule
• to our roulette game.
• So I've got my three roulette games as before.
• I'm going to run a simple simulation.
• And the key thing to notice is really
• this print statement here.
• Right, that I'll print the mean, which I'm rounding.
• And then I'm going to give the confidence intervals,
• plus or minus, and I'll just take the standard deviation
• times 1.96, times 100. Why times 100?
• Because I'm showing you percentages.
• All right so again, very straightforward code.
• Just simulation, just like the ones we've been looking at.
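Putting the pieces together, the simulation described here might look like the sketch below. It is a reconstruction under assumptions, not the lecture's code: all names are hypothetical, a losing bet is assumed to cost the stake, and the demo uses 1,000 and 10,000 spins per trial to keep the run short.

```python
import random

class FairRoulette:
    def __init__(self):
        self.pockets = list(range(1, 37))
        self.ball = None
        self.pocket_odds = len(self.pockets) - 1
    def spin(self):
        self.ball = random.choice(self.pockets)
    def bet_pocket(self, pocket, amount):
        return amount * self.pocket_odds if pocket == self.ball else -amount
    def __str__(self):
        return "Fair Roulette"

class EuRoulette(FairRoulette):
    def __init__(self):
        super().__init__()
        self.pockets.append("0")
    def __str__(self):
        return "European Roulette"

class AmRoulette(EuRoulette):
    def __init__(self):
        super().__init__()
        self.pockets.append("00")
    def __str__(self):
        return "American Roulette"

def play_roulette(game, num_spins, pocket, bet):
    """Average return per unit wagered over num_spins identical bets."""
    tot_pocket = 0
    for _ in range(num_spins):
        game.spin()
        tot_pocket += game.bet_pocket(pocket, bet)
    return tot_pocket / (num_spins * bet)

def get_mean_and_std(X):
    mean = sum(X) / len(X)
    std = (sum((x - mean) ** 2 for x in X) / len(X)) ** 0.5
    return mean, std

def find_pocket_return(game, num_trials, trial_size):
    """Run num_trials independent trials and collect the return from each."""
    return [play_roulette(game, trial_size, 2, 1) for _ in range(num_trials)]

random.seed(0)
num_trials = 20
for num_spins in (1_000, 10_000):
    print(f"Simulate betting a pocket for {num_trials} trials of {num_spins} spins each")
    for game_class in (FairRoulette, EuRoulette, AmRoulette):
        game = game_class()
        returns = find_pocket_return(game, num_trials, num_spins)
        mean, std = get_mean_and_std(returns)
        # 1.96 standard deviations gives the 95% interval; times 100 for percentages
        print(f"  Exp. return for {game} = {100 * mean:.3f}%, "
              f"+/- {100 * 1.96 * std:.3f}% with 95% confidence")
```

The interval printed for each game should narrow as the number of spins per trial grows.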
• And well, I'm just going--
• I don't think I'll bother running it for you
• in the interest of time.
• You can run it yourself.
• But here's what I got when I ran it.
• So when I simulated betting a pocket for 20 trials,
• we see that the--
• of 1,000 spins each, for 1,000 spins
• the expected return for fair roulette happened to be 3.68%.
• A bit high.
• But you'll notice the confidence interval plus or minus
• 27 includes the actual answer, which is 0.
• And we have very large confidence intervals
• for the other two games.
• If you go way down to the bottom where I've spun the wheel
• many more times, what we'll see is
• that my expected return for fair roulette is much closer to 0
• than it was here.
• But more importantly, my confidence interval
• is much smaller, 0.8.
• So now I really have constrained it pretty well.
• Similarly, for the other two games you will see--
• maybe it's more accurate, maybe it's less accurate,
• but importantly the confidence interval is smaller.
• So I have good reason to believe that the mean I'm computing
• is close to the true mean, because my confidence
• interval has shrunk.
• So that's the really important concept here,
• is that we don't just guess--
• compute the value in the simulation.
• We use, in this case, the empirical rule
• to tell us how much faith we should have in that value.
• All right, the empirical rule doesn't always work.
• There are a couple of assumptions.
• One is that the mean estimation error is 0.
• What is that saying?
• That I'm just as likely to guess high as guess low.
• In most experiments of this sort, most simulations,
• that's a very fair assumption.
• There's no reason to guess I'd be systematically off
• in one direction or another.
• It's different when you use this in a laboratory experiment,
• where in fact, depending upon your laboratory technique,
• there may be a bias in your results in one direction.
• So we have to assume that there's no bias in our errors.
• And we have to assume that the distribution of errors
• is normal.
• And we'll come back to this in just a second.
• But this is a normal distribution,
• also called the Gaussian.
• Under those two assumptions the empirical rule
• will always hold.
• All right, let's talk about distributions,
• since I just introduced one.
• We've been using a probability distribution.
• And this captures the notion of the relative frequency
• with which some random variable takes on different values.
• There are two kinds, discrete and continuous. Discrete is when the values are
• drawn from a finite set of values.
• So when I flip these coins, there
• are only two possible values, head or tails.
• And so if we look at the distribution of heads
• and tails, it's pretty simple.
• We just list the probability of heads.
• We list the probability of tails.
• We know that those two probabilities must add up to 1,
• and that fully describes our distribution.
• Continuous random variables are a bit trickier.
• They're drawn from a set of reals between two numbers.
• For the sake of argument, let's say
• those two numbers are 0 and 1.
• Well, we can't just enumerate the probability
• for each number.
• How many real numbers are there between 0 and 1?
• An infinite number, right?
• And so I can't say, for each of these infinite numbers, what's
• the probability of it occurring?
• Actually the probability is close to 0 for each of them.
• Is 0, if they're truly infinite.
• So I need to do something else, and what
• I use is what's called the probability density function.
• This is a different kind of PDF than the one Adobe sells.
• So there, we don't give the probability
• of the random variable taking on a specific value.
• We give the probability of it lying
• somewhere between two values.
• And then we define a curve, which shows how it works.
• So let's look at an example.
• So we'll go back to normal distributions.
• This is-- for the continuous normal distribution,
• it's described by this function.
• And for those of you who don't know about the magic number e,
• this is one of many ways to define it.
• But I really don't care whether you remember this.
• I don't care whether you know what e is.
• I don't care if you know what this is.
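For reference, the standard form of the normal probability density function with mean $\mu$ and standard deviation $\sigma$, together with one common series definition of $e$, is given below. This is the textbook formula and is assumed to match what the slide shows.

```latex
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}},
\qquad
e = \sum_{n=0}^{\infty} \frac{1}{n!}
```

The exponent is never positive and equals zero only at $x = \mu$, which is why the curve peaks at the mean and is symmetric around it.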
• What we really want to say is, it looks like this.
• In this case, the mean is 0.
• It doesn't have to be 0.
• I've [INAUDIBLE] a mean of 0 and a standard deviation of 1.
• This is the so-called standard normal distribution.
• But it's symmetric around the mean.
• And that gets back to, it's equally likely
• that our errors are in either direction, right?
• So it peaks at the mean.
• The peak is always at the mean.
• That's the most probable value, and it's
• So if we look at it, for example, and I say,
• what's the probability of the number being between 0 and 1?
• I can look at it here and say, all right,
• let's draw a line here, and a line here.
• And then I can integrate the curve under here.
• And that tells me the probability
• of this random variable being between 0 and 1.
• If I want to know between minus 1 and 1.
• I just do this and then I integrate over that area.
• All right, so the area under the curve in this case
• defines the likelihood.
• Now I have to divide and normalize to actually get
• the answer between 0 and 1.
• So the question is, what fraction
• of the area under the curve is between minus 1 and 1?
• And that will tell me the probability.
• So what does the empirical rule tell us?
• What fraction is between minus 1 and 1, roughly?
• Yeah?
• 68%, right?
• So that tells me 68% of the area under this curve
• is between minus 1 and 1, because my standard deviation
• is 1, roughly 68%.
• And maybe your eyes will convince you
• that's a reasonable guess.
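The area under the curve can be computed numerically to check the 68% figure. The sketch below uses a simple trapezoid rule, which is one possible choice of integration method, and compares it with the closed form based on `math.erf`; the function names are assumptions.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of the normal distribution at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def integrate(f, a, b, num_steps=10_000):
    """Trapezoid-rule approximation of the integral of f from a to b."""
    width = (b - a) / num_steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, num_steps):
        total += f(a + i * width)
    return total * width

for k in (1, 1.96, 3):
    area = integrate(normal_pdf, -k, k)
    exact = math.erf(k / math.sqrt(2))   # closed form for a standard normal
    print(f"area between -{k} and {k}: {area:.4f} (closed form {exact:.4f})")
```

Since the density is already normalized so that its total area is 1, the area between two points is directly the probability of landing between them.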
• OK, we'll come back and look at this in a bit more detail
• on Monday of next week.
• And also look at the question of,
• why does this work in so many cases
• where we don't actually have a normal distribution

### Description

MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
View the complete course: http://ocw.mit.edu/6-0002F16
Instructor: John Guttag

Prof. Guttag discusses the Monte Carlo simulation, Roulette