# 3. Insertion Sort, Merge Sort

### Transcription

• The following content is provided
• under a Creative Commons license.
• Your support will help MIT OpenCourseWare continue
• to offer high-quality educational resources for free.
• To make a donation or view additional materials
• from hundreds of MIT courses, visit MIT OpenCourseWare
• at ocw.mit.edu.
• PROFESSOR: So today's lecture is on sorting.
• We'll be talking about specific sorting algorithms today.
• I want to start by motivating why
• we're interested in sorting, which should be fairly easy.
• Then I want to discuss a particular sorting
• algorithm that's called insertion sort.
• That's probably the simplest sorting algorithm
• you can write, it's five lines of code.
• It's not the best sorting algorithm
• that's out there and so we'll try and improve it.
• We'll also talk about merge sort, which is a divide
• and conquer algorithm and that's going
• to motivate the last thing that I want to spend time on,
• which is recurrences and how you solve recurrences.
• Typically the recurrences that we'll
• be looking at in 6.006 are going to come from divide
• and conquer problems like merge sort
• but you'll see this over and over.
• So let's talk about why we're interested in sorting.
• There's some fairly obvious applications
• like if you want to maintain a phone book,
• you've got a bunch of names and numbers corresponding
• to a telephone directory and you want
• to keep them in sorted order so it's
• easy to search, mp3 organizers, spreadsheets, et cetera.
• So there's lots of obvious applications.
• There's also some interesting problems
• that become easy once items are sorted.
• One example of that is finding a median.
• So let's say that you have a bunch of items
• in an array A[0..n], and A[0..n]
• contains n numbers and they're not sorted.
• When you sort, you turn this into B[0..n],
• where if it's just numbers, then
• you may sort them in increasing order or decreasing order.
• Let's just call it increasing order for now.
• Or if they're records, and they're not numbers,
• then you have to provide a comparison function
• to determine which record is smaller than another record.
• And that's another input that you
• have to have in order to do the sorting.
• So it doesn't really matter what the items are
• as long as you have the comparison function.
• Think of it as less than or equal to.
• And if you have that, it's straightforward,
• obviously, to check that 3 is less than 4, et cetera.
• But it may be a little more complicated
• for more sophisticated sorting applications.
• But the bottom line is that if you have your algorithm that
• takes a comparison function as an input,
• you're going to be able to, after a certain amount of time,
• get B[0..n].
• Now if you wanted to find the median of the set of numbers
• that were originally in the array A,
• what would you do once you have the sorted array B?
• AUDIENCE: Isn't there a more efficient algorithm for median?
• PROFESSOR: Absolutely.
• But this is sort of a side effect of having a sorted list.
• If you happen to have a sorted list,
• there's many ways that you could imagine
• building up a sorted list.
• One way is you have something that's completely unsorted
• and you run insertion sort or merge sort.
• Another way would be to maintain a sorted list as you're
• getting items put into the list.
• So if you happened to have a sorted list
• and you need to have this sorted list for some reason,
• the point I'm making here is that finding the median
• is easy.
• And it's easy because all you have to do
• is look at-- depending on whether n is odd
• or even-- look at B[n/2].
• That would give you the median, because you'd
• have a bunch of numbers that are less than that
• and an equally sized set of numbers that are greater than that,
• which is the definition of the median.
• So this is not necessarily the best way, as you pointed out,
• of finding the median.
• But it's constant time if you have a sorted list.
• That's the point I wanted to make.
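
As a quick illustration of that constant-time lookup, here is a minimal Python sketch (the function name and the tie-breaking choice for even n are illustrative assumptions, not from the lecture):

```python
def median_via_sort(A):
    """Sort first (Theta(n log n)); then the median is a
    constant-time index into the sorted array B."""
    B = sorted(A)                    # B is A in increasing order
    n = len(B)
    if n % 2 == 1:
        return B[n // 2]             # odd n: the unique middle element
    # Even n: one common convention is to average the two middle elements.
    return (B[n // 2 - 1] + B[n // 2]) / 2
```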
• There are other things that you could do.
• And this came up in Erik's lecture,
• which is the notion of binary search-- finding
• an element in an array-- a specific element.
• You have a list of items-- again, A[0..n].
• And you're looking for a specific number or item.
• You could, obviously, scan the array,
• and that would take you linear time to find this item.
• If the array happened to be sorted,
• then you can find this in logarithmic time
• using what's called binary search.
• Let's say you're looking for a specific item.
• Let's call it k.
• Binary search, roughly speaking, would
• work like this-- you compare k to, again, B[n/2],
• and decide, given that B is sorted,
• which half of the array you get to look at.
• If B[n/2] is exactly k, you're done.
• Otherwise, you look at the left or the right half,
• depending on the comparison.
• And you can do this in logarithmic time.
• So keep this in mind, because binary search
• is going to come up in today's lecture
• and again in other lectures.
• It's really a great paradigm of divide
• and conquer-- probably the simplest.
• And it, essentially, takes something
• that's linear-- a linear search--
• and turns it into logarithmic search.
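
A minimal iterative sketch of binary search in Python (the names are mine; the lecture only describes the idea):

```python
def binary_search(B, k):
    """Return an index i with B[i] == k, or None if k is not present.
    Assumes B is sorted in increasing order; Theta(log n) compares."""
    lo, hi = 0, len(B) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if B[mid] == k:
            return mid               # found it
        elif B[mid] < k:
            lo = mid + 1             # k can only be in the right half
        else:
            hi = mid - 1             # k can only be in the left half
    return None
```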
• So those are a couple of problems
• that become easy if you have a sorted list.
• And there's some not so obvious applications
• of sorting-- for example, data compression.
• If you wanted to compress a file,
• one of the things that you could do is to--
• and it's a set of items-- you could sort the items.
• And that automatically finds duplicates.
• And you could say, if I have 100 items that are all identical,
• I'm going to compress the file by representing the item once
• and, then, having a number associated
• with the frequency of that item-- similar to what
• document distance does.
• Document distance can be viewed as a way
• of compressing your initial input.
• Obviously, you lose the works of Shakespeare or whatever it was.
• And it becomes a bunch of words and frequencies.
• But it is something that compresses the input
• and gives you a different representation.
• And so people use sorting as a subroutine in data compression.
• Computer graphics uses sorting.
• Most of the time, when you render
• scenes in computer graphics, you have many layers
• corresponding to the scenes.
• It turns out that, in computer graphics,
• most of the time you're actually rendering
• front to back because, when you have a big opaque
• object in front, you want to render that first,
• so you don't have to worry about everything that's
• occluded by this big opaque object.
• And that makes things more efficient.
• And so you keep things sorted front to back,
• most of the time, in computer graphics rendering.
• But some of the time, if you're worried about transparency,
• you have to render things back to front.
• So typically, you have sorted lists
• corresponding to the different objects in both orders--
• both increasing order and decreasing order.
• And you're maintaining that.
• So sorting is a real important subroutine
• in pretty much any sophisticated application you look at.
• So it's worthwhile to look at the variety of sorting
• algorithms that are out there.
• And we're going to do some simple ones, today.
• But if you go and look at Wikipedia
• and do a Google search, there's all sorts
• of sorts like cocktail sort, and bitonic sort,
• and what have you.
• And there's reasons why each of these sorting algorithms exist.
• Because in specific cases, they end up
• winning on types of inputs or types of problems.
• So let's take a look at our first sorting algorithm.
• I'm not going to write code but it will be in the notes.
• And it is in your document distance Python files.
• But I'll just give you pseudocode here
• and walk through what insertion sort looks like
• because the purpose of describing
• this algorithm to you is to analyze its complexity.
• We need to do some counting here,
• with respect to this algorithm, to figure out
• how fast it's going to run in and what the worst case
• complexity is.
• So what is insertion sort?
• For i equals 1, 2, through n, given an input to be sorted,
• what we're going to do is we're going to insert A[i]
• in the right position.
• And we're going to assume that we
• are sort of midway through the sorting process, where
• we have sorted A[0..i-1].
• And we're going to expand this sorted array
• to have i plus 1 elements.
• And A[i] is going to get inserted
• into the correct position.
• And we're going to do this by pairwise swaps
• down to the correct position for the number that is initially
• in A[i].
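
The course's actual code is in the document distance files; a minimal Python version matching the pseudocode above (pairwise swaps down to the correct position) might look like this:

```python
def insertion_sort(A):
    """Sort A in place, in increasing order.
    Invariant: at the start of iteration i, A[0..i-1] is sorted."""
    for i in range(1, len(A)):
        j = i
        # Swap A[j] down past every larger element to its left.
        while j > 0 and A[j - 1] > A[j]:
            A[j - 1], A[j] = A[j], A[j - 1]
            j -= 1
    return A
```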
• So let's go through an example of this.
• We're going to sort in increasing order.
• Just have six numbers.
• And initially, we have 5, 2, 4, 6, 1, 3.
• And we're going to take a look at this.
• And you start with the index 1, or the second element,
• because the very first element-- it's a single element
• and it's already sorted by definition.
• But you start from here.
• And this is what we call our key.
• And that's essentially a pointer to where we're at, right now.
• And the key keeps moving to the right
• as we go through the different steps of the algorithm.
• And so what you do is you look at this
• and you have-- this is A[i].
• And you have A[0..0], which is 5.
• And since we want to sort in increasing order,
• this is not sorted.
• And so we do a swap.
• So what this would do in this step is to do a swap.
• And we would go obtain 2, 5, 4, 6, 1, 3.
• So all that's happened here, in this step-- in the very
• first step where the key is in the second position--
• is one swap happened.
• Now, your key is here, at item 4.
• Again, you need to put 4 into the right spot.
• And so you do pairwise swaps.
• And in this case, you have to do one swap.
• And you get 2, 4, 5.
• And you're done with this iteration.
• So what happens here is you have 2, 4, 5, 6, 1, 3.
• And now, the key is over here, at 6.
• Now, at this point, things are kind of easy,
• in the sense that you look at it and you say, well, I
• know this part is already sorted.
• 6 is greater than 5.
• So you have to do nothing.
• So there's no swaps that happen in this step.
• So all that happens here is you're
• going to move the key to one step to the right.
• So you have 2, 4, 5, 6, 1, 3.
• And your key is now at 1.
• Here, you have to do more work.
• Now, you see one aspect of the complexity of this algorithm--
• given that you're doing pairwise swaps-- the way
• this algorithm was defined, in pseudocode, out there, was I'm
• going to use pairwise swaps to find the correct position.
• So what you're going to do is you're
• going to have to swap first 1 and 6.
• And then you'll swap-- 1 is over here.
• So you'll swap this position and that position.
• And then you'll swap-- essentially,
• do 4 swaps to get to the point where you have
• 1, 2, 4, 5, 6, 3.
• So this is the result.
• 1, 2, 4, 5, 6, 3.
• And the important thing to understand, here,
• is that you've done four swaps to get 1
• to the correct position.
• Now, you could imagine a different data structure
• where you move this over there and you shift them
• all to the right.
• But in fact, that shifting of these four elements
• is going to be computed in our model as four
• operations, or four steps, anyway.
• So there's no getting away from the fact
• that you have to do four things here.
• And the way the code that we have for insertion sort
• does this is by using pairwise swaps.
• So we're almost done.
• Now, we have the key at 3.
• And now, 3 needs to get put into the correct position.
• And so you've got to do a few swaps.
• This is the last step.
• And what happens here is 3 is going to get swapped with 6.
• And then 3 needs to get swapped with 5.
• And then 3 needs to get swapped with 4.
• And then, since 3 is greater than 2, you're done.
• So you have 1, 2, 3, 4, 5, 6.
• And that's it.
• So, analysis.
• How many steps do I have?
• AUDIENCE: n squared?
• PROFESSOR: No, how many steps do I have?
• I guess that wasn't a good question.
• If I think of a step as being a movement of the key,
• how many steps do I have?
• I have theta n steps.
• And in this case, you can think of it as n minus 1 steps,
• since you started with 2.
• But let's just call it theta n steps,
• in terms of key positions.
• And you're right.
• It is n square because, at any given step,
• it's quite possible that I have to do theta n work.
• And one example is this one, right here,
• where I had to do four swaps.
• And in general, you can construct a scenario
• where, towards the end of the algorithm,
• you'd have to do theta n work.
• For example, if you had a list that was reverse sorted,
• you would essentially have to do, on average, n/2 swaps
• as you go through each of the steps.
• And that's theta n.
• So each step is theta n swaps.
• And when I say swaps, I could also
• say each step is theta n compares and swaps.
• And this is going to be important
• because I'm going to ask you an interesting question
• in a minute.
• But let me summarize.
• What I have here is a theta n squared algorithm.
• The reason this is a theta n squared
• algorithm is because I have theta n steps
• and each step is theta n.
• When I'm counting, what am I counting
• in terms of operations?
• The assumption here-- unspoken assumption--
• has been that an operation is a compare and a swap
• and they're, essentially, equal in cost.
• And in most computers, that's true.
• You have a single instruction in, say, the x86
• or the MIPS architecture that can do a compare,
• and the same thing for swapping registers.
• So it's a perfectly reasonable assumption
• that compares and swaps for numbers
• have exactly the same cost.
• But if you had a record and you were comparing records,
• and the comparison function that you used for the records was
• in itself a method call or a subroutine,
• it's quite possible that all you're doing
• is swapping pointers or references to do the swap,
• but the comparison could be substantially more expensive.
• Most of the time-- and we'll differentiate
• if it becomes necessary-- we're going
• to be counting comparisons in the sorting algorithms
• that we'll be putting out.
• And we'll be assuming either that compares and swaps are
• roughly the same cost or-- and we'll
• say which one, of course-- that compares
• are substantially more expensive than swaps.
• So if you had either of those cases for insertion sort,
• you have a theta n squared algorithm.
• You have theta n squared compares
• and theta n squared swaps.
• Now, here's a question.
• Let's say that compares are more expensive than swaps.
• And so, I'm concerned about the theta
• n squared comparison cost.
• I'm not as concerned, because of the constant factors involved,
• with the theta n squared swap cost.
• This is a cushion question.
• What's a simple fix-- change to this algorithm that
• would give me a better complexity in the case
• where compares are more expensive,
• or I'm only looking at the complexity of compares.
• So the theta whatever of compares.
• Anyone?
• Yeah, back there.
• AUDIENCE: [INAUDIBLE]
• PROFESSOR: You could compare with the middle.
• What did I call it?
• I called it something.
• What you just said, I called it something.
• AUDIENCE: Binary search.
• PROFESSOR: Binary search.
• That's right.
• Two cushions for this one.
• So you pick them up after lecture.
• So you're exactly right.
• You got it right.
• I called it binary search, up here.
• And so you can take insertion sort
• and you can sort of trivially turn it into a theta n log n
• algorithm, if what we are talking about
• is the number of compares.
• And all you have to do to do that is to say,
• you know what, I'm going to replace
• this with binary search.
• And you can do that-- and that was the key observation--
• because A of 0 through i minus 1 is already sorted.
• And so you can do binary search on that part of the array.
• So let me just write that down.
• Do a binary search on A[0..i-1].
• And essentially, you can think of it as theta log i time
• for each of those steps.
• And so then you get your theta n log n
• in terms of compares.
• Does this help the swaps for an array data structure?
• No, because after the binary search you still require insertion
• into A[0..i-1].
• So here's the problem.
• Why don't we have a full-fledged theta n log n algorithm,
• regardless of the cost of compares or swaps?
• We don't quite have that.
• We don't quite have that because we need to insert our A[i]
• into the right position in A[0..i-1].
• If you do that with an array structure,
• the insertion point might be in the middle.
• And you have to shift things over to the right.
• And when you shift things over to the right,
• in the worst case, you may be shifting a lot of things
• over to the right.
• And that gets back to worst case complexity of theta n.
• So a binary search in insertion sort
• gives you theta n log n for compares.
• But it's still theta n squared for swaps.
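
A sketch of binary insertion sort in Python, using the standard-library bisect module for the binary search (one way to realize the idea; the lecture doesn't give code). The compares drop to theta n log n, but the list insertion still shifts elements, so moves remain theta n squared in the worst case:

```python
import bisect

def binary_insertion_sort(A):
    """Theta(n log n) compares, but still Theta(n^2) element moves."""
    for i in range(1, len(A)):
        key = A[i]
        # Binary search for key's position in the sorted prefix A[0..i-1]:
        # Theta(log i) compares.
        pos = bisect.bisect_right(A, key, 0, i)
        # Inserting shifts A[pos..i-1] one slot right: Theta(n) worst case.
        del A[i]
        A.insert(pos, key)
    return A
```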
• So as you can see, there's many varieties
• of sorting algorithms.
• We just looked at a couple of them.
• And they were both insertion sort.
• The second one that I just put up
• is, I guess, technically called binary insertion sort
• because it does binary search.
• And the vanilla insertion sort is
• the one that you have the code for in the doc dist program,
• or at least one of the doc dist files.
• So let's move on and talk about a different algorithm.
• So what we'd like to do now-- in this class,
• we're never happy.
• We always want to do a little bit better.
• And eventually, once we run out of room
• from an asymptotic standpoint, you
• take these other classes where you try and improve
• constant factors and get 10%, and 5%, and 1%,
• and so on, and so forth.
• But we'll stick to improving asymptotic complexity.
• And we're not quite happy with binary insertion sort
• because, in the case of numbers, our binary insertion sort
• has theta n squared complexity, if you look at swaps.
• So we'd like to go find an algorithm that is theta n log n.
• And I guess, eventually, we'll have to stop.
• But Erik will take care of that.
• There's a reason to stop.
• It's when you can prove that you can't do any better.
• And so we'll get to that, eventually.
• So merge sort is also something that you've probably seen.
• But there probably will be a couple
• of subtleties that come out as I describe this algorithm that,
• hopefully, will be interesting to those of you who already
• know merge sort.
• And for those of you who don't, it's a very pretty algorithm.
• It's a standard recursion algorithm-- recursive
• algorithm-- similar to a binary search.
• What we do, here, is we have an array, A. We split it
• into two parts, L and R. And essentially, we kind of
• do no work, really.
• In terms of L and R, in the sense that we just recursively call:
• we keep splitting, splitting, splitting.
• And all the work is done down at the bottom
• in this routine called merge, where we are merging
• a pair of elements at the leaves.
• And then, we merge two pairs and get four elements.
• And then we merge four tuples of elements, et cetera,
• and go all the way up.
• So while I'm just saying L turns into L prime, out here,
• there's no real explicit code that you
• can see that turns L into L prime.
• It happens really later.
• There's no real sorting code, here.
• It happens in the merge routine.
• And you'll see that quite clearly
• when we run through an example.
• So you have L and R turn into L prime and R prime.
• And what we end up getting is a sorted array, A.
• And we have what's called a merge routine that
• takes L prime and R prime and merges them
• into the sorted array.
• So at the top level, what you see is split into two,
• and do a merge, and get to the sorted array.
• The input is of size n.
• You have two arrays of size n over 2.
• These are two sorted arrays of size n over 2.
• And then, finally, you have a sorted array of size n.
• So if you want to follow the recursive execution
• of this in a small example, then you'll
• be able to see how this works.
• And we'll do a fairly straightforward example
• with 8 elements.
• So at the top level-- before we get there, merge
• is going to assume that you have two sorted arrays,
• and merge them together.
• That's the invariant in merge sort, or for the merge routine.
• It assumes the inputs are sorted-- L and R. Actually
• I should say, L prime and R prime.
• So let's say you have 20, 13, 7, and 2.
• You have 12, 11, 9, and 1.
• And this could be L prime.
• And this could be R prime.
• What you have is what we call a two finger algorithm.
• And so you've got two fingers and each of them
• point to something.
• And in this case, my left finger is pointing
• to L prime, or some element in L prime.
• My right finger is pointing to some element in R prime.
• And I'm going to compare the two elements
• that my fingers are pointing to.
• And I'm going to choose, in this case,
• the smaller of those elements.
• And I'm going to put them into the sorted array.
• So start out here.
• Look at that and that.
• And I compared 2 and 1.
• And which is smaller?
• 1 is smaller.
• So I'm going to write 1 down.
• This is a two finger algo for merge.
• And I put 1 down.
• When I put 1 down, I had to cross out 1.
• So effectively, what happens is-- let
• me just circle that instead of crossing it out.
• And my finger moves up to 9.
• So now I'm pointing at 2 and 9.
• And I repeat this step.
• So now, in this case, 2 is smaller.
• So I'm going to go ahead and write 2 down.
• And I can cross out 2 and move my finger up to 7.
• And so that's it.
• I won't bore you with the rest of the steps.
• It's essentially walking up.
• You have a couple of pointers and you're
• walking up these two arrays.
• And you're writing down 1, 2, 7, 9, 11, 12, 13, 20.
• And that's your merge routine.
• And all of the work, really, is done in the merge routine
• because, other than that, the body is simply
• a recursive call.
• You have to, obviously, split the array.
• But that's fairly straightforward.
• If you have an array, A[0..n]-- and depending on
• whether n is odd or even-- you could
• imagine that you set L to be A[0..n/2 - 1],
• and R similarly.
• And so you just split it halfway in the middle.
• I'll talk about that a little bit more.
• There's a subtlety associated with that
• that we'll get to in a few minutes.
• But to finish up in terms of the computation of merge sort.
• This is it.
• The merge routine is doing most, if not all, of the work.
• And this two finger algorithm is going
• to be able to take two sorted arrays
• and put them into a single sorted array
• by interspersing, or interleaving, these elements.
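
Putting the pieces together, here is a sketch of the two-finger merge and the recursive wrapper in Python (the standard textbook formulation of what the lecture describes, not necessarily the exact course code):

```python
def merge(Lp, Rp):
    """Two-finger merge of two sorted lists; Theta(n) for n total items."""
    result, i, j = [], 0, 0
    while i < len(Lp) and j < len(Rp):
        if Lp[i] <= Rp[j]:           # pick the smaller of the two fingers
            result.append(Lp[i]); i += 1
        else:
            result.append(Rp[j]); j += 1
    result.extend(Lp[i:])            # one side is exhausted;
    result.extend(Rp[j:])            # copy whatever remains of the other
    return result

def merge_sort(A):
    """Divide and conquer: T(n) = 2T(n/2) + Theta(n)."""
    if len(A) <= 1:
        return A                     # one element is sorted by definition
    mid = len(A) // 2
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))
```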
• And what's the complexity of merge
• if I have two arrays of size n over 2, here?
• What do I have?
• AUDIENCE: n.
• PROFESSOR: n.
• We'll give you a cushion, too.
• Theta n complexity.
• So far so good.
• I know you know the answer as to what
• the complexity of merge sort is.
• But I'm guessing that most of you
• won't be able to prove it to me because I'm kind of a hard guy
• to prove something to.
• And I could always say, no, I don't believe you
• or I don't understand.
• The complexity-- and you've said this before, in class,
• and I think Erik's mentioned it--
• the overall complexity of this algorithm is theta n log n.
• And where does that come from?
• How do you prove that?
• And so what we'll do, now, is take a look at merge sort.
• And we'll look at the recursion tree.
• And we'll try and-- there are many ways
• of proving that merge sort is theta n log n.
• The way we're going to do this is
• what's called proof by picture.
• And it's not an established proof technique,
• but it's something that is very helpful
• to get an intuition behind the proof
• and why the result is true.
• And you can always take that and you
• can formalize it and make this something
• that everyone believes.
• And we'll also look at substitution, possibly
• in section tomorrow, for recurrence solving.
• So where we're right now is that we have a divide and conquer
• algorithm that has a merge step that is theta n.
• And so, if I just look at this structure that I have here,
• I can write a recurrence for merge sort
• that looks like this.
• So when I say complexity, I can say
• T of n, which is the work done for n items,
• is going to be some constant time in order
• to divide the array.
• So this could be the part corresponding
• to dividing the array.
• And there's going to be two problems of size n over 2.
• And so I have 2 T of n over 2.
• And this is the recursive part.
• And I'm going to have c times n, which is the merge part.
• And that's some constant times n, which is what we have,
• here, with respect to the theta n complexity.
• So you have a recurrence like this and I know some of you
• have seen recurrences in 6.042.
• And you know how to solve this.
• What I'd like to do is show you this recursion tree expansion
• that not only tells you how to solve this recurrence,
• but also gives you a means of solving recurrences where,
• instead of having cn, you have something else out here.
• You have f of n, which is a different function
• from the linear function.
• And this recursion tree is, in my mind,
• the simplest way of arguing the theta n log n
• complexity of merge sort.
• So what I want to do is expand this recurrence out.
• And let's do that over here.
• So I have cn on top.
• I'm going to ignore this constant factor because cn
• dominates.
• I want to break things up, as I do the recursion.
• So when I go cn, at the top level-- that's
• the work I have to do at the merge, at the top level.
• And then when I go down to two smaller problems, each of them
• is size n over 2.
• So I do c times n divided by 2 [INAUDIBLE].
• So this is just a constant c.
• I didn't want to write thetas up here.
• You could.
• And I'll say a little bit more about that later.
• But think of this cn as representing the theta n
• complexity.
• And c is this constant.
• So c times n, here. c times n over 2, here.
• And then when I keep going, I have c times n over 4,
• c times n over 4, et cetera, and so on, and so forth.
• And when I come down all the way here,
• n is eventually going to become 1-- or essentially a constant--
• and I'm going to have a bunch of c's here.
• So here's another question, that I'd like you to answer.
• Someone tell me what the number of levels in this tree are,
• precisely, and the number of leaves in this tree are,
• precisely.
• AUDIENCE: The number of levels is log n plus 1.
• PROFESSOR: Log n plus 1.
• Log n to the base 2, plus 1.
• And the number of leaves?
• You raised your hand back there, first.
• Number of leaves.
• AUDIENCE: I think n.
• PROFESSOR: Yeah, you're right.
• You think right.
• So 1 plus log n and n leaves.
• When n becomes 1, how many of them do you have?
• You're down to a single element, which is, by definition,
• sorted.
• And you have n leaves.
• So now let's add up the work.
• I really like this picture because it's just
• so intuitive in terms of getting us the result
• that we're looking for.
• So you add up the work in each of the levels of this tree.
• So the top level is cn.
• The second level is cn because I added cn/2 and cn/2; and it's cn, cn again at every level below.
• Wow.
• What symmetry.
• So you're doing the same amount of work modulo
• the constant factors, here, with what's
• going on with the c1, which we've ignored,
• but roughly the same amount of work in each of the levels.
• And now, you know how many levels there are.
• It's 1 plus log n.
• So if you want to write an equation for T of n,
• it's (1 + log n) times cn, which is theta n log n.
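
Written out as an equation, the level-by-level sum that the picture encodes is:

```latex
T(n) = \sum_{i=0}^{\log_2 n} 2^i \cdot c\,\frac{n}{2^i}
     = \sum_{i=0}^{\log_2 n} cn
     = cn\,(1 + \log_2 n)
     = \Theta(n \log n)
```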
• So I've mixed in constants c and thetas.
• For the purposes of this description,
• they're interchangeable.
• You will see recurrences that look like this, in class.
• And things like that.
• Don't get confused.
• It's just a constant multiplicative factor
• in front of the function that you have.
• And it's just a little easier, I think,
• to write down these constant factors
• and realize that the amount of work done
• is the same in each of the leaves.
• And once you know the dimensions of this tree,
• in terms of levels and in terms of the number of leaves,
• you can add up the work and get the overall complexity.
• So we've looked at two algorithms, so far.
• And insertion sort, if you talk about numbers,
• is theta n squared for swaps.
• Merge sort is theta n log n.
• Here's another interesting question.
• What is one advantage of insertion sort over merge sort?
• AUDIENCE: [INAUDIBLE]
• PROFESSOR: What does that mean?
• AUDIENCE: You don't have to move elements outside
• of [INAUDIBLE].
• PROFESSOR: That's exactly right.
• That's exactly right.
• So the two guys who answered the questions
• before with the levels, and you.
• Come to me after class.
• So that's a great answer.
• In-place sorting is something
• that has to do with auxiliary space.
• And so what you see, here-- and it was a bit hidden, here.
• But the fact of the matter is that you
• had L prime and R prime.
• And L prime and R prime are different from L and R, which
• were the initial halves of the inputs to the sorting
• algorithm.
• And what I said here is, we're going to dump this into A.
• That's what this picture shows.
• This says sorted array, A. And so you
• had to make a copy of the array-- the two halves L
• and R-- in order to do the recursion,
• and then to take the results and put them
• into the sorted array, A.
• So you needed-- in merge sort-- you
• needed theta n auxiliary space.
• So merge sort, you need theta n extra space.
• And the definition of in-place sorting
• implies that you have theta 1-- constant-- auxiliary space.
• The auxiliary space for insertion sort
• is simply that temporary variable
• that you need when you swap two elements.
• So when you want to swap a couple of registers,
• you gotta store one of the values in a temporary location,
• override the other, et cetera.
• And that's the theta 1 auxiliary space for insertion sort.
• So there is an advantage of the version of insertion sort
• we've talked about, today, over merge sort.
• And if you have a billion elements, that's potentially
• something you don't want to store in memory.
• If you want to do something really fast and do everything
• in cache or main memory, and you want
• to sort billions or maybe even trillions of items,
• this becomes an important consideration.
• I will say that you can reduce the constant factor
• of the theta n.
• So in the vanilla scheme, you could
• imagine that you have to have a copy of the array.
• So if you had n elements, you essentially
• have n extra items of storage.
• You can make that n over 2 with a simple coding trick
• by keeping 1/2 of A.
• You can throw away one of the L's or one of the R's.
• And you can get it down to n over 2.
• And that turns out-- it's a reasonable thing
• to do if you have a billion elements
• and you want to reduce your storage by a constant factor.
• So that's one coding trick.
• Now it turns out that you can actually go further.
• And there's a fairly sophisticated algorithm
• that's sort of beyond the scope of 6.006
• that's an in-place merge sort.
• And this in-place merge sort is kind of
• impractical in the sense that it doesn't do very well
• in terms of the constant factors.
• So while it's in-place and still theta n log n,
• the problem is that the running time of an in-place merge sort
• is much worse than the regular merge sort that
• uses theta n auxiliary space.
• So people don't really use in-place merge sort.
• It's a great paper.
• It's a great thing to read.
• Its analysis is a bit sophisticated for 6.006.
• So we won't go there.
• But it does exist.
• So you can take merge sort, and I just
• want to let you know that you can do things in-place.
• In terms of numbers, some experiments we ran a few years
• ago-- so these may not be completely valid
• because I'm going to actually give you numbers--
• but merge sort in Python, if you write a little curve fit
• program to do this, is 2.2n log n microseconds for a given n.
• So this is the merge sort routine.
• And if you look at insertion sort, in Python,
• that's something like 0.2 n square microseconds.
• So you see the constant factors here.
• If you do insertion sort in C, which is a compiled language,
• then, it's much faster.
• It's about 20 times faster.
• It's 0.01 n squared microseconds.
• So a little bit of practice on the side.
• We do ask you to write code.
• And this is important.
• The reason we're interested in algorithms
• is because people want to run them.
• And what you can see is that you can actually find an n-- so
• regardless of whether you're Python or C,
• this tells you that asymptotic complexity is pretty important
• because, once n gets beyond about 4,000,
• you're going to see that merge sort in Python
• beats insertion sort in C.
• So the constant factors get subsumed
• beyond certain values of n.
• So that's why asymptotic complexity is important.
• You do have a factor of 20, here,
• that keeps an n square algorithm competitive.
• It stays competitive for a little bit longer,
• but then falls behind.
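
With the fitted constants above, you can estimate the crossover numerically. A small Python sketch (the constants are the lecture's rough measurements, so the exact crossover is only a ballpark figure):

```python
import math

# Rough fitted running times from the lecture, in microseconds:
def merge_sort_python(n):
    return 2.2 * n * math.log2(n)    # merge sort in Python

def insertion_sort_c(n):
    return 0.01 * n * n              # insertion sort in C

# Find the smallest n where Python merge sort wins.
n = 2
while merge_sort_python(n) >= insertion_sort_c(n):
    n += 1
print(n)   # a few thousand -- the same ballpark as the lecture's ~4,000
```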
• That's what I wanted to cover for sorting.
• So hopefully, you have a sense of what
• happens with these two sorting algorithms.
• We'll look at a very different sorting algorithm next time,
• using heaps, which is a different data structure.
• The last thing I want to do in the couple minutes I have left
• is give you a little more intuition as to recurrence
• solving based on this diagram that I wrote up there.
• And so we're going to use exactly this structure.
• And we're going to look at a couple of different recurrences
• that I won't really motivate in terms
• of having a specific algorithm, but I'll just
• write out the recurrence.
• And we'll look at the recursion tree for that.
• And I'll try and tease out of you the complexity associated
• with these recurrences of the overall complexity.
• So let's take a look at T of n equals 2 T of n over 2
• plus c n squared.
• Let me just call that c-- no need for the brackets.
• So constant c times n squared.
• So if you had a crummy merge routine,
• and it was taking n squared because you coded it up wrong.
• It's not a great motivation for this recurrence,
• but it's a way this recurrence could have come up.
• So what does this recursive tree look like?
• Well it looks kind of the same, obviously.
• You have c n square; you have c n square divided by 4;
• c n square divided by 4; c n square divided
• by 16, four times.
• Looking a little bit different from the other one.
• The levels and the leaves are exactly the same.
• Eventually n is going to go down to 1.
• So you will see c all the way here.
• And you're going to have n leaves.
• And you will have, as before, 1 plus log n levels.
• Everything is the same.
• And this is why I like this recursive tree formulation so
• much because, now, all I have to do
• is add up the work associated with each of the levels
• to get the solution to the recurrence.
• Now, take a look at what happens, here.
• c n square; c n square divided by 2; c n square divided by 4.
• And this is n times c.
• So what does that add up to?
• AUDIENCE: [INAUDIBLE]
• PROFESSOR: Yeah, exactly.
• Exactly right.
• So if you look at what happens, here, this dominates.
• All of the other things are actually less than that.
• And you said bounded by two c n square
• because this part is bounded by c n square
• and I already have c n square up at the top.
• So this particular algorithm that corresponds to this crummy
• merge sort, or wherever this recurrence came from,
• is a theta n squared algorithm.
• And in this case, all of the work done
• is at the root-- at the top level of the recursion.
• Here, there was a roughly equal amount
• of work done in each of the different levels.
• Here, all of the work was done at the root.
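
In symbols, the tree gives a geometric series dominated by its first term:

```latex
T(n) = \sum_{i=0}^{\log_2 n} 2^i \cdot c\left(\frac{n}{2^i}\right)^2
     = cn^2 \sum_{i=0}^{\log_2 n} \frac{1}{2^i}
     < 2cn^2
     = \Theta(n^2)
```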
• And so to close up shop, here, let
• me just give you real quick a recurrence where
• all of the work is done at the leaves, just for closure.
• So if I had, magically, a merge routine that actually happened
• in constant time, either through buggy analysis,
• or because the code was buggy, then what
• does the tree look like for that?
• And I can think of this as being theta 1.
• Or I can think of this as being just a constant c.
• I'll stick with that.
• So I have c, c, c.
• Woah, I tried to move that up.
• That doesn't work.
• So I have n leaves, as before.
• And so if I look at what I have, here, I
• have c at the top level.
• I have 2c, and so on and so forth.
• 4c.
• And then I go all the way down to nc.
• And so what happens here is this dominates.
• And so, in this recurrence, the whole thing runs in theta n.
• So the solution to that is theta n.
• And what you have here is all of the work
• being done at the leaves.
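
And the corresponding sum, this time dominated by the bottom level:

```latex
T(n) = \sum_{i=0}^{\log_2 n} 2^i \cdot c
     = c\,(2^{\,\log_2 n + 1} - 1)
     = c\,(2n - 1)
     = \Theta(n)
```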
• We're not going to really cover the theorem-- the master theorem--
• that gives you a mechanical way of figuring this out, because we think
• the recursion tree is a better way of looking at it.
• But you can see that, depending on what that function is,
• in terms of the work being done in the merge routine,
• you'd have different versions of recurrences.
• Pick up your cushions.
• See you next time.

### Description

MIT 6.006 Introduction to Algorithms, Fall 2011
View the complete course: http://ocw.mit.edu/6-006F11

More courses at http://ocw.mit.edu
