Cobham's thesis
From Wikipedia, the free encyclopedia
Cobham's thesis asserts that computational problems can be feasibly computed on some computational device only if they can be computed in polynomial time; that is, if they lie in the complexity class P.[2]
Formally, to say that a problem can be solved in polynomial time is to say that there exists an algorithm that, given an n-bit instance of the problem as input, can produce a solution in time O(nc), where c is a constant that depends on the problem but not the particular instance of the problem.
Alan Cobham's 1965 paper entitled "The intrinsic computational difficulty of functions"[3] is one of the earliest mentions of the concept of the complexity class P, consisting of problems decidable in polynomial time. Cobham theorized that this complexity class was a good way to describe the set of feasibly computable problems. Any problem that cannot be contained in P is not feasible, but if a real-world problem can be solved by an algorithm existing in P, generally such an algorithm will eventually be discovered.
The class P is a useful object of study because it is not sensitive to the details of the model of computation: for example, a change from a single-tape Turing machine to a multi-tape machine can lead to a quadratic speedup, but any algorithm that runs in polynomial time under one model also does so on the other.
[edit] Reasoning
The thesis is widely considered to be a good rule of thumb for real-life problems. Typical input lengths that users and programmers are interested in are between 100 and 1,000,000, approximately. Consider an input length of n=100 and a polynomial algorithm whose running time is n2. This is a typical running time for a polynomial algorithm. (See the "Objections" section for a discussion of atypical running times.) The number of steps that it will require, for n=100, is 1002=10000. A typical CPU will be able to do approximately 109 operations per second (this is extremely simplified). So this algorithm will finish on the order of (10000 ÷109) = .00001 seconds. A running time of .00001 seconds is reasonable, and that's why this is called a practical algorithm. The same algorithm with an input length of 1,000,000 will take on the order of 17 minutes, which is also a reasonable time for most (non-real-time) applications.
Meanwhile, an algorithm that runs in exponential time might have a running time of 2n. The number of operations that it will require, for n=100, is 2100. It will take (2100 ÷ 109) ≈ 1.3×1021 seconds, which is (1.3×1021 ÷ 31556926) ≈ 4.1×1013 years. The largest problem this algorithm could solve in a day would have n=46, which seems very small.
[edit] Objections
| This section may contain original research or unverified claims. Please improve the article by adding references. See the talk page for details. (May 2008) |
There are many lines of objection to Cobham's thesis. The thesis essentially states that "P" means "easy, fast, and practical," while "not in P" means "hard, slow, and impractical." But this is not always true. To begin with, it abstracts away some important variables that influence the runtime in practice:
- It ignores constant factors and lower-order terms.
- It ignores the size of the exponent. The time hierarchy theorem proves the existence of problems in P requiring arbitrarily large exponents.
- It ignores the typical size of the input.
All three are related, and are general complaints about analysis of algorithms, but they particularly apply to Cobham's thesis since it makes an explicit claim about practicality. Under Cobham's thesis, we are to call a problem for which the best algorithm takes 10100n instructions feasible, and a problem with an algorithm that takes 20.00001 n infeasible—even though we could never solve an instance of size n=1 with the former algorithm, whereas we could solve an instance of the latter problem of size n=106 without difficulty. As we saw, it takes a day on a typical modern machine to process 2n operations when n=46; this may be the size of inputs we have, and the amount of time we have to solve a typical problem, making the 2n-time algorithm feasible in practice on the inputs we have. Conversely, in fields where practical problems have millions of variables (such as Operations Research or Electronic Design Automation), even O(n3) algorithms are often impractical.
The study of polynomial time approximation schemes considers families of polynomial-time algorithms where the exponent may depend on the desired error rate ε. For example, a PTAS algorithm may have runtime O(n(1/ε)!). This has been a practical problem in the design of approximation algorithms, and was mitigated by the introduction of efficient and fully polynomial-time approximation schemes; see polynomial time approximation scheme for details.
The typical response to this objection is that in general, algorithms for computational problems that naturally occur in computer science and in other fields have neither enormous nor minuscule constants and exponents. Even when a polynomial-time algorithm has a large exponent, historically, other algorithms are subsequently developed that improve the exponent. Also, typically, when computers are finally able to solve problems of size n fast enough, users of computers ask them to solve problems of size n+1. These are all subjective statements.
Another objection is that Cobham's thesis only considers the worst-case inputs. Some NP-hard problems can be solved quite effectively on many of the large inputs that appear in practice; in particular, the travelling salesman problem and the boolean satisfiability problem can often be solved exactly even with input sizes in the tens or hundreds of thousands. Even though the same algorithms can be made to fail (take too long, use too much memory, or trigger an error condition) on some inputs, users may be content with only sometimes getting solutions.
In some cases, in fact, an algorithm that might require exponential time is the technique of choice because it only does so on highly contrived examples. The simplex method is one such example, typically preferred to polynomial-time algorithms for the same problem because it runs in linear time on most practical input even though it occasionally requires exponential time. Another example is the typechecking algorithm for Standard ML. The graph at the top of this article shows the empirical complexity for solving a particular subset of the knapsack problem, where the problem size increases but the individual numbers are bounded in size. For problems that fit this description, modern exponential-time algorithms demonstrate a quadratic growth over approximately 4 orders of magnitude on typical instances, which may be sufficient for many applications.[4].
This is an example of an algorithm that runs in polynomial time if the size of their inputs is bounded. These problems are called fixed-parameter tractable and are said to be solvable in pseudo-polynomial time; they are studied by the subfield of parameterized complexity. For instance, the subset sum problem, the 0-1 knapsack problem and the multiple-choice subset-sum problem are all solvable in linear time, provided the weights and profits are bounded by a constant[5].
Cobham's thesis also ignores other models of computation. A problem that requires taking exponential time to find the exact solution might allow for a fast approximation algorithm that returns a solution that is almost correct. Allowing the algorithm to make random choices, or to sometimes make mistakes, might allow an algorithm to run in polynomial time rather than exponential time. Though this is currently believed to be unlikely (see RP, BPP), in practice, randomized algorithms are often the fastest algorithms available for a problem (quicksort, for example, or the Miller–Rabin primality test). Finally, quantum computers are able to solve in polynomial time some problems that have no known polynomial time algorithm on current computers, such as Shor's algorithm for integer factorization, but this is not currently a practical concern since large-scale quantum computers are not yet available.
[edit] References
- ^ Pisinger, D. 2003. "Where are the hard knapsack problems?" Technical Report 2003/08, Department of Computer Science, University of Copenhagen, Copenhagen, Denmark
- ^ Steven Homer and Alan L. Selman (1992). "Complexity Theory". in Alan Kent and James G. Williams. Encyclopedia of Computer Science and Technology. 26. CRC Press.
- ^ Alan Cobham (1965). "The intrinsic computational difficulty of functions". Proc. Logic, Methodology, and Philosophy of Science II. North Holland.
- ^ Goulimis C. N. (2007). "Appeal to NP-Completeness Considered Harmful". Interfaces, Vol. 37, No. 6, November–December 2007, pp. 584–586
- ^ Pisinger D (1999). "Linear Time Algorithms for Knapsack Problems with Bounded Weights". Journal of Algorithms, Volume 33, Number 1, October 1999, pp. 1-14
- Lloyd, Seth. "Computational capacity of the universe". Accessed March 31, 2008.

