P vs. NP and the Computational Complexity Zoo

2M+ views   |   50K+ likes   |   644 dislikes   |  
Aug 26, 2014


P vs. NP and the Computational Complexity Zoo
P vs. NP and the Computational Complexity Zoo thumb P vs. NP and the Computational Complexity Zoo thumb P vs. NP and the Computational Complexity Zoo thumb


  • Today I wanna talk about the deepest unanswered question in computer science,
  • and, maybe in all of math.
  • A problem called P vs. NP.
  • And I wanna talk about how ideas from computing show up in the everyday world around us,
  • and how problems as seemingly disparate as protein folding and making up crossword puzzles
  • share a common core difficulty that turns out to be a lot like Sudoku,
  • well,
  • okay basically it is Sudoku.
  • In 2000 the Clay institute offered 1 Million dollar each for the solutions to seven key problems in Math --
  • The Millennium Prize problems.
  • These are profound and difficult problems and for most of them it takes a lot of specialized knowledge
  • to even understand the question.
  • Of the seven problems P vs. NP was both the most recently conceived -- in 1971,
  • and by far the easiest one to understand and explain.
  • And in March 2010 the Clay Institute awarded the first of its seven prizes for the solution to,
  • yeah, not P vs. NP.
  • So, what is the P vs. NP question?
  • Well, back in the 1970s computer scientists were busily figuring out how to program their
  • retro-fabulous, cabinet-sized computers to solve all the world's problems.
  • Sometimes the first program anyone could think of for a particular problem would be unworkably slow
  • but then, over time people would come up with clever ways to make it faster,
  • or at least, that happened for some problems.
  • For others, nobody was coming up with faster programs.
  • To get a handle on the situation they started sorting the problems into classes based on
  • how fast the program can solve them.
  • For problems like multiplication they had really fast programs,
  • and for others, like playing absolutely perfect chess they figured out that there just was no fast program.
  • But for a bunch in-between they weren't sure whether there was a fast way to do it.
  • So, they kept trying.
  • This is where P and NP come in.
  • Skipping a ton of details for a second, P is a class that basically includes all the problems that can be solved by
  • a reasonably fast program, like multiplication or alphabetizing a list of names.
  • And then, around and including P we sort of discovered a class called NP
  • That's all the problems where, if you're given a correct solution you can at least check it in a reasonable amount of time.
  • NP was totally maddening, because it contained lots of important problems like
  • vehicle routing and scheduling and circuit design and databases.
  • And often, we get lucky and find that an NP problem was actually a part of P
  • and we'd have our fast program.
  • But, for a lot of them that didn't seem to be happening.
  • So people started to wonder whether everything in NP would turn out to be in P
  • or if there were some NP problems that were truly harder than the ones in P.
  • That's the P vs. NP question.
  • If all the NP problems are really in P, a lot of important puzzles we've been struggling with
  • are gonna turn out to be easy for computers to solve.
  • Puzzles connected to biology, and curing cancer;
  • Puzzles in business and economics.
  • We'd have a lot of miracle answers almost overnight.
  • And also the encryption we use for things like online banking would be easy to crack,
  • because it's based on NP problems.
  • Okay, examples:
  • I like to think of the problems in NP as being basically like "puzzles"
  • because I think what makes a puzzle a puzzle is that it's a problem where you can give away the answer,
  • and that's what NP means.
  • Like with Sudoku.
  • Sudoku puzzles can take a long time to solve but if I give you a solved Sudoku grid
  • checking it for mistakes is pretty quick.
  • Outside of NP there are problems where it's hard to even check an answer.
  • Like, what's the best move to make in this chess game?
  • I could tell you the answer but how would you know whether I'm right?
  • Well, you wouldn't, because finding out requires a calculation so enormous that there's a pretty good argument
  • we'll never be able to build a computer that could do it.
  • To me, that's not a very good puzzle. It's practically impossible to know whether you've solved it.
  • On the other side, are all the reasonable, solvable puzzles in P.
  • These are clearly also in NP because one way to check an answer is to go through the process of finding it yourself.
  • Like, if I were to tell you that the answer to 51 x 3 is 153,
  • how would you check whether I'm right?
  • You'd probably just multiply 51 by 3 yourself because it's fast to do it,
  • but Sudoku is different, or at least, we think it is.
  • It seems like solving a Sudoku grid is a lot harder than checking a solution,
  • but in fact nobody's been able to prove it yet.
  • As far as we know, there could be a clever way of playing Sudoku, much much faster.
  • So that's the question:
  • Does being able to quickly recognize correct answers mean there's also a quick way to find them?
  • Nobody knows for sure, but either way, figuring out exactly how this works would teach us
  • something important about the nature of computation.
  • It gets weirder from here but first: three important details.
  • 1. You might be thinking: hey, Sudoku is tough and all but it's not that hard. What's the big deal?
  • Well, we're really talking about how the difficulty scales up as you make the problem bigger and bigger.
  • Like, how much harder is a 100x100 Sudoku grid than a standard 9x9 grid?
  • We've been making computers exponentially faster as time goes on,
  • so, for problems that don't get exponentially harder as they get bigger all we have to do is wait for computers
  • to get more powerful and then, even huge versions of those problems will be easy to solve by a computer.
  • Like, multiplication problems are pretty easy for computers, even with enormous numbers.
  • As the numbers get bigger, multiplying them just doesn't get harder very fast.
  • These days, the phone is what would have been referred to in the 1970s as a "supercomputer".
  • And you'd have to make up truly huge multiplication problems to stand up to all the computational power we've got now.
  • Lots of familiar puzzles like mazes and Rubik's cubes are in the same camp.
  • Hard for people, but easy work for computers.
  • And then there's Sudoku. Computers can usually solve a 9x9 grid in a few milliseconds
  • even though humans find them challenging.
  • But as you make the grid bigger the problem just gets really hard,
  • rapidly getting out of reach for even the most powerful computers.
  • 2. P stands for "Polynomial time". In P the number of steps you have to do to solve a problem,
  • and therefore the amount of time that it takes, is some polynomial function of its size.
  • "Polynomial" is a mishmash of Greek and Latin meaning "many names", which is,
  • regrettably a pretty typical example of Math's flair for unhelpful terminology.
  • Anyway, polynomials are functions involving n or n² or n to other powers like these,
  • but importantly they're not exponential functions like 2ⁿ, which gets to be a ton of steps really fast as n goes up,
  • a lot quicker than n².
  • So that's P, it's problems like mazes and multiplication where the number of steps required isn't that bad
  • compared to the size of the problem.
  • NP is all about polynomial time *checking*.
  • NP stands for "Non deterministic polynomial time",
  • which, being math terminology is almost a mean spirited way of saying
  • that if you had a bajillion computers then you could check all possible answers at the same time.
  • You could find a correct solution in polynomial time.
  • 2.5. We're actually talking about the number of steps required to solve a problem in the worst case scenario.
  • Like, how many steps does it take to unscramble a Rubik's cube?
  • Well, when it's scrambled like this, it takes one step, but it can get a lot worse.
  • Similarly, some 9x9 Sudokus are harder than others.
  • People also look at things like the best case and the average case, but worst case analysis
  • is what we know the most about.
  • 3. Actually, pretty much everybody thinks it's obvious that NP contains more problems than P,
  • it's just that we haven't been able to prove it.
  • The bad news for fast solutions came in the early 1970s where complexity researches realized that
  • dozens of those NP problems they were struggling with were essentially all the same problem!
  • (with some easy polynomial time complications thrown in here and there)
  • These are call "NP-complete" problems, and since that first batch in the 70s we've added Sudoku and protein folding,
  • and problems underlying puzzles and games like Battleship, FreeCell, Master Mind, Tetris, Minesweeper, and making up crossword puzzles.
  • Even classic video games like Super Mario Bros. and Metroid turn out to be connected to NP complete-level traversal problems.
  • NP-complete is yet another Math phrase meaning that these problems include all the really hard parts of every NP problem.
  • A fast program for solving any NP-complete problem could be used to solve every problem in NP.
  • The whole class would instantly collapse.
  • So yeah, amazingly, Sudoku is hard because it involves, literally, the same NP-complete task that makes protein folding hard.
  • If you come up with a profoundly faster way to play Sudoku
  • let somebody know, okay? because fast protein folding would help us cure cancer.
  • But the fact that a bunch of smart people have all been unsuccessful coming up with fast programs
  • to solve what turned out to be, essentially, the same problem, looks like pretty good clue
  • that the fast programs just aren't out there.
  • So why has it been so hard to prove P vs. NP on way or the other?
  • Well, fun fact, proving things is an NP problem.
  • The P vs. NP question itself *is* one of these problems.
  • So yeah, this might be difficult, or not? we don't know.
  • As the field of computational complexity has developed, we've discovered a lot of complexity.
  • The P vs. NP question turns out to be just the main attraction in a huge and complicated "zoo" of complexity classes.
  • Beyond NP there are even harder classes of problems like "EXP" -- the class of problems including figuring out
  • the best move in Chess, that takes exponential time to even check.
  • This whole upper area of problems that are at least as hard as NP-complete is called is called "NP-hard".
  • There's also "Co-NP" -- the class of problems where instead of being easy to check right answers,
  • it's easy to exclude wrong answers, which may or may not be the same of NP.
  • And then there's "P-SPACE", the class of problems that can be solved given unlimited time,
  • but using only a polynomial amount of space for memory.
  • There are also problems that can be solved probabilistically in polynomial time.
  • That class is called "BPP", and it may or may not actually be the same as P.
  • And then there's a quantum computing analog of BPP called BQP.
  • All over the place in here there are complicated little classes that would take a lot of explaining,
  • and, actually some of these turn out to be infinite hierarchies of problems that are slightly more difficult
  • from the ones beneath them.
  • We know there's an exponential hierarchy and there's probably a polynomial hierarchy.
  • And out beyond all of this are problems that are, just not solvable by any computer in any amount of time or space.
  • To me, the amazing thing about this whole complexity zoo is that we're talking literally about
  • what can be computed in a given amount of space, and time.
  • We're not just looking at the nature of computation here,
  • we're looking at the nature of space and time themselves.
  • This mess of computational complexity classes, I think, ultimately has implications for
  • physics and biology and, for our basic understanding of everything.
  • As an example of those implications, here's how Scott Aaronson, a complexity researcher at MIT,
  • explains his intuition about P vs. NP:
  • "If P were equal to NP, then the world would be a profoundly different place than we usually assume it to be.
  • There would be no special value in "creative leaps",
  • no fundamental gap between solving a problem and recognizing the solution once it's found.
  • Everyone who could appreciate a symphony would be Mozart;
  • Everyone who could follow a set-by-step argument would be Gauss."
  • The world around us, the nature of living things and ideas, of art and genius,
  • is molded around the deep structure of computation.
  • In a very real way, something connected to P vs. NP shows up in the struggles of scientists and artists.
  • Chopin once said: "Simplicity is the final achievement.
  • After one has played a vast quantity of notes and more notes, it is simplicity that emerges
  • as the crowning reward of art."
  • And Jack Kerouac put it like this: "One day I will find the right words, and they will be simple".

Download subtitle


Hackerdashery #2

Inspired by the Complexity Zoo wiki: https://complexityzoo.uwaterloo.ca/Complexity_Zoo

For more advanced reading, I highly recommend Scott Aaronson's blog, Shtetl-Optimized: http://www.scottaaronson.com/blog/


Retro-fabulous, cabinet-sized computers:

System/360: http://en.wikipedia.org/wiki/IBM_System/360

photo: "360-91-panel". Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:360-91-panel.jpg#mediaviewer/File:360-91-panel.jpg

PDP-8: http://en.wikipedia.org/wiki/PDP-8

photo: "PDP-8". Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:PDP-8.jpg#mediaviewer/File:PDP-8.jpg


Protein folding illustration: "Protein folding schematic" by Tomixdf (talk) - Own work (Original text: “self-made”). Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Protein_folding_schematic.png#mediaviewer/File:Protein_folding_schematic.png

P vs. NP opinion poll: http://www.cs.umd.edu/~gasarch/papers/poll2012.pdf