Good question. I was afraid someone might ask "What does P=NP mean?"
Well, where do I start? I need to try to explain it in terms that everyone here can understand but at the same time be precise enough so that if more
fellow computer science and mathematics people are lurking, they won't send
fireballs my way.
First, you need to know what is meant by the run-time order of an algorithm.
Let's say you have a single large wide shelf which contains from left to
right, 1000 manuals for all kinds of projection equipment. Let's say they're
in no particular order. You need to see if there is a Ballantyne Pro-35
manual in the collection. Since the items are unordered, you must, in
some manner (probably by starting at one end and working toward the other)
look through the collection one by one. On the average, you'll have to
look through half the collection, and in worst case, will have to look through
the entire collection. If there are n manuals, the number of manuals you
must look through to find a particualr one is proportional to n, so we say
this search algorithm is O(n) (usually read "Order n"). Note that if the
manuals are sorted alphabetically, the order would be reduced to O(log n),
which would allow finding a particular manual by checking no more than 10
of the manuals, which is a huge improvement, but that's beside the point.
The important point about an O(n) algorithm is that the time it takes to
perform the algorithm grows proportionally to n. That is, if it takes 10
minutes to search a row of 1000 manuals on the average, then it will take
30 minutes on the average to search a row of 3000.
Sorting, if done using a naive algorithm, is O(n squared). (I don't think I
can do superscripts in here). Such an algorithm may be "Take the
alphabetically first manual and swap it with the one currently at the
beginning. Take the alphabetically first manual from books 2-1000 and
swap it with the book currently in position 2. Repeat until the set is
sorted." The number of times this loop is done is proportional to n, and
the finding of the alphabetically first book each time is porportional to n,
so we have an order n squared algorithm. This is not the best way to sort, but
that's beside the point. The point is if it takes 10 minutes to sort 1000
books using this particular O(n^2) algorithm, then it will take 40 minutes to sort 2000 books and 90 minutes to
sort 3000 books. Doubling the size of the problem multiplies the run time
by 4, not 2.
Likewise, there are O(n cubed) and O(n^4) and O(n^5) algorithms, and so on
(searching 3, 4, and 5 dimensional data structures would be examples). Now,
mathematical expressions that are sums of powers of finite powers of n (which
can be multiplied by constants) are called polynomials. An example of a
6 * x^7 + 5 * x^3 - 8 * x^2 - 9 * x + 135
(sorry I can't do superscripts)
Now, if the run time of an algorithm is proportional to a polynomial
expression in n, then the algorithm is said to be a polynomial-time algorithm.
So, sorting the shelf of manuals and searching them for a particular one, are
polynomial-time algorithms, as is performing a linear search of a
1987564-dimensional array that is 1374234 units in each direction. That
would be an O(x^1987564) algorithm, which has polynomial run-time.
P is the set of problems that can be solved in polynomial time on a
OK. Now you know what a polynomial-time algorithm is and what
the set P is. So, what is not a
polynomial run-time algorithm? What about this: You have n wires coming
out of a box and those n wires need to be hooked to n connectors on another
device, but you have no idea which wires go with which connectors. If the
wires are connected improperly, the device does nothing. If connected
properly, it works. How many combinations must be tried? If n is 1, then
there is only one possibility. Hook it up and you're done. If n is 2, then
there are 2 possibilities. (Not bad). If n is 3, then there are 6 ways to
hook up the wires. (3 ways to hook up the first, leaving 2 for the second,
and 1 for the last). If n is 4, then the first wire can go in 4 places, the
second can go 3, and so. Do you see a pattern here? For n wires, the number
of ways they can be hooked up is n times n-1 times n-2, and so on down to 1.
There is a mathematical notation for this. It is written n! and is pronounced
"n factorial". The number of ways to hook up n wires to n connectors is n!
which is non-polynomial. To give you an idea of the growth rate of this
beast, here are the factorials of the first few intergers: 1, 2, 6, 24, 120,
720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800, ... (Those
are the first 13, which are as far as my 10-digit calculator will go). Imagine
if you could try a particular combination of 10 wires in only a second. It
would take up to 42 days to find the right one. With 11 wires, that would be
a year and a half. With 12 wires, that would be 15 years, and with 13,
it would be 197 years. You get the idea. With these types of algorithms,
it doesn't matter how fast the computer is, because increasing the size of
the problem by just 1 increases the run time exponentially.
Another similar problem that has a run-time that isn't quite as bad, but
still horrible is "You have n switches, each can be on or off. The device
will only work if the switches are in the correct positions. Set the
switches properly" This is order 2^n. With 1 switch, there are 2
possibilities, on or off. With 2 switches, there are 2 possibilities: the
first switch can be on or off and the second switch can be on or off. Adding
another switch doubles the previous number of possibilities that must be
tried. The sequence goes 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, etc.
This is exponential growth. It doesn't take too many switches before finding
the correct switch combination takes centuries, even if many combinations
can be tested every second.
We need a mythical mechanism to help us out here. Let's introduce
nondeterminism. In the case of the wiring problem or the switch problem,
let's say we can guess the correct combination. If this idea of guessing
bothers you, let me explain it another way.
Let's say you could make a phone call to God and he could tell you the
correct combination. Now, God gives you a wiring combination or switch
combination. You need to verify that it works. In either case, all you have
to do it try it and see if the device starts working. This verification
stage is important (not because we don't trust God, but because it's
important to the theory here ). So in this case, by using God, we have
a polynomial-time algorithm becuase God gives us the answer in constant time
and we can verify it in linear time (by hooking up the n wires or setting
the n switches).
Look at it another way. Let's say the number of film-tech members is large
enough so that we can have a room full of them. In fact, we have so many
that we have one for each possible wiring combination or switch setting.
Give each film-tech member a bell. I'm running this operation and I give all
of the combinations out, one to each film-tech member. I say "go" and in one
time unit, one of the film-tech members will have a combination that works
and will ring his bell. Let's say John Pytlak rings his bell. I say,
"John, what combination do you have?" I have gotten an answer in constant
time but I must verify it. As with the answer from God, I verify it in the
same manner, in linear time. With this approach, I have a polynomial-time
algorithm since I have constant time answer plus linear time verifiability.
Point: if you have an O(n!) problem and have n! machines, or if you have an
O(2^n) problem and have 2^n machines, you're a good as God.
Now since this is the case, why not just say I can guess the correct
combination to start with, since I can get it from God or from my room full
of n! or 2^n film-techers in constant time. It's the ability to perform the verification of that answer in polynomial time that is important!
The set NP is the set of problems that can be solved in polynomial time on a
nondeterministic computer as I have described in the "God" and "room
full of film-techers" and "guessing" approaches.
It is clear that P is a subset of NP since if a problem can be solved in
polynomial time on a deterministic machine, then it can be solved in
polynomial time on a nondeterministic machine. The ability of a
nondeterministic algorithm to check an exponential number of combinations
in polynomial time leads us to believe that NP includes problems that are
not in P.
THIS HAS NEVER BEEN PROVED! IF SOMEONE PROVES IT, THEY'LL BE THE MOST
IMPORTANT COMPUTER SCIENTIST ALIVE!
P=NP would mean that deterministic polynomial-time algorithms exist for
all nondeterministic polynomial-time problems.
P<=NP is the widespread belief.
You may ask, what would be a problem that is not even in NP?
Here are a couple: Generate all the permutations of the numbers 1 through n".
Sure, God could give me a list of them but verifying would be O(n!) since
the length of the output itself would be O(n!). This is not in P nor NP. Another example is "Generate a list of all of the n-digit numbers." That would be O(10^n) since the output alone is O(10^n).
Here is a problem that is so horrible that there is NO ALGORITHM to solve it:
Given a computer program that produces a yes or no answer based on its logic
and internal state, and given a starting state for the program, will the
program halt in a finite amount of time with an answer?
This is a fascinating area, and it seems that not much progress has been
made since the 1970s. The "standard" book on this, and the theory of
Computers and Intractability: A Guide to the Theory of
NP-Completeness: Garey, Michael R and Johnson, David S.,
W. H. Freeman and Company, 1979
Evans A Criswell
Huntsville-Decatur Movie Theatre Info Site