
Knapsack problem
Encyclopedia

Combinatorial optimization
In applied mathematics and theoretical computer science, combinatorial optimization is a topic that consists of finding an optimal object from a finite set of objects. In many such problems, exhaustive search is not feasible...
: Given a set of items, each with a weight and a value, determine the count of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most useful items.
The problem often arises in resource allocation
Resource allocation
Resource allocation is used to assign the available resources in an economic way. It is part of resource management. In project management, resource allocation is the scheduling of activities and the resources required by those activities while taking into consideration both the resource...
with financial constraints. A similar problem also appears in combinatorics
Combinatorics
Combinatorics is a branch of mathematics concerning the study of finite or countable discrete structures. Aspects of combinatorics include counting the structures of a given kind and size , deciding when certain criteria can be met, and constructing and analyzing objects meeting the criteria ,...
, complexity theory
Computational complexity theory
Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty, and relating those classes to each other...
, cryptography
Cryptography
Cryptography is the practice and study of techniques for secure communication in the presence of third parties...
and applied mathematics
Applied mathematics
Applied mathematics is a branch of mathematics that concerns itself with mathematical methods that are typically used in science, engineering, business, and industry. Thus, "applied mathematics" is a mathematical science with specialized knowledge...
.
The decision problem
Decision problem
In computability theory and computational complexity theory, a decision problem is a question in some formal system with a yes-or-no answer, depending on the values of some input parameters. For example, the problem "given two numbers x and y, does x evenly divide y?" is a decision problem...
form of the knapsack problem is the question "can a value of at least V be achieved without exceeding the weight W?"
Definition
In the following, we have n kinds of items, 1 through n.Each kind of item i has a value vi and a weight wi.
We usually assume that all values and weights are nonnegative. To simplify the representation, we can also assume that the items are listed in increasing order of weight.
The maximum weight that we can carry in the bag is W.
The most common formulation of the problem is the 0-1 knapsack problem, which restricts the number xi of copies of each kind of item
to zero or one.
Mathematically the 0-1-knapsack problem can be formulated as:
- maximize
- subject to
The bounded knapsack problem restricts the number

each kind of item to a maximum integer value

Mathematically the bounded knapsack problem can be formulated as:
- maximize
- subject to
The unbounded knapsack problem (UKP) places no upper bound on the number of copies of each kind of item.
Of particular interest is the special case of the problem with these properties:
- it is a decision problem,
- it is a 0-1 problem,
- for each kind of item, the weight equals the value:
.
Notice that in this special case, the problem is equivalent to this: given a set of nonnegative integers, does any subset of it add up to exactly W? Or, if negative weights are allowed and W is chosen to be zero, the problem is: given a set of integers, does any nonempty subset add up to exactly 0? This special case is called the subset sum problem. In the field of cryptography
Cryptography
Cryptography is the practice and study of techniques for secure communication in the presence of third parties...
, the term knapsack problem is often used to refer specifically to the subset sum problem.
If multiple knapsacks are allowed, the problem is better thought of as the bin packing problem.
Computational complexity
The knapsack problem is interesting from the perspective of computer science because- there is a pseudo-polynomial timePseudo-polynomial timeIn computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input ....
algorithm using dynamic programmingDynamic programmingIn mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems which are only slightly smaller and optimal substructure... - there is a fully polynomial-time approximation scheme, which uses the pseudo-polynomial time algorithm as a subroutine
- the problem is NP-completeNP-completeIn computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...
to solve exactly, thus it is expected that no algorithm can be both correct and fast (polynomial-time) on all cases - many cases that arise in practice, and "random instances" from some distributions, can nonetheless be solved exactly.
The subset sum version of the knapsack problem is commonly known as one of Karp's 21 NP-complete problems
Karp's 21 NP-complete problems
One of the most important results in computational complexity theory was Stephen Cook's 1971 demonstration of the first NP-complete problem, the boolean satisfiability problem...
.
There have been attempts to use subset sum as the basis for public key cryptography systems, such as the Merkle-Hellman knapsack cryptosystem. These attempts typically used some group
Group (mathematics)
In mathematics, a group is an algebraic structure consisting of a set together with an operation that combines any two of its elements to form a third element. To qualify as a group, the set and the operation must satisfy a few conditions called group axioms, namely closure, associativity, identity...
other than the integers. Merkle-Hellman and several similar algorithms were later broken, because the particular subset sum problems they produced were in fact solvable by polynomial-time algorithms.
One theme in research literature is to identify what the "hard" instances of the knapsack problem look like, or viewed another way, to identify what properties of instances in practice might make them more amenable than their worst-case NP-complete behaviour suggests. name ="poirriez et all 2009">Vincent Poirriez, Nicola Yanev, Rumen Andonov (2009) A Hybrid Algorithm for the Unbounded Knapsack Problem Discrete Optimization http://dx.doi.org/10.1016/j.disopt.2008.09.004
Several algorithms are freely available to solve knapsack problems, based on dynamic programming approach, branch and bound approach or hybridizations of both approaches. name="martellopisingertoth99a">S. Martello, D. Pisinger, P. Toth, Dynamic programming and strong bounds for the 0-1
knapsack problem , Manag. Sci., 45:414–424, 1999. name="plateau85">G. Plateau, M. Elkihel, A hybrid algorithm for the 0-1 knapsack problem, Methods of
Oper. Res., 49:277–293, 1985.
Unbounded knapsack problem
If all weights (
nonnegative integers, the knapsack problem can be solved in pseudo-polynomial time
Pseudo-polynomial time
In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input ....
using dynamic programming
Dynamic programming
In mathematics and computer science, dynamic programming is a method for solving complex problems by breaking them down into simpler subproblems. It is applicable to problems exhibiting the properties of overlapping subproblems which are only slightly smaller and optimal substructure...
. The following describes a dynamic programming solution for the unbounded knapsack problem.
To simplify things, assume all weights are strictly positive (wi > 0). We wish to maximize total value subject to the constraint that total weight is less than or equal to W. Then for each w ≤ W, define m[w] to be the maximum value that can be attained with total weight less than or equal to w. m[W] then is the solution to the problem.
Observe that m[w] has the following properties:
-
(the sum of zero items, i.e., the summation of the empty set)
-
where

Here the maximum of the empty set is taken to be zero. Tabulating the results from







Big O notation
In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...
. Dividing

Greatest common divisor
In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...
is an obvious way to improve the running time.
The

NP-complete
In computational complexity theory, the complexity class NP-complete is a class of decision problems. A decision problem L is NP-complete if it is in the set of NP problems so that any given solution to the decision problem can be verified in polynomial time, and also in the set of NP-hard...
, since






0-1 knapsack problem
A similar dynamic programming solution for the 0-1 knapsack problem also runs in pseudo-polynomial timePseudo-polynomial time
In computational complexity theory, a numeric algorithm runs in pseudo-polynomial time if its running time is polynomial in the numeric value of the input ....
. As above, assume




We can define

-
-
-
if
(the new item is more than the current weight limit)
-
if
.
The solution can then be found by calculating








Another algorithm for 0-1 knapsack, discovered in 1974 and sometimes called "meet-in-the-middle" due to parallels to a similarly-named algorithm in cryptography
Meet-in-the-middle attack
The meet-in-the-middle attack is a cryptographic attack which, like the birthday attack, makes use of a space-time tradeoff. While the birthday attack attempts to find two values in the domain of a function that map to the same value in its range, the meet-in-the-middle attack attempts to find a...
, is exponential in the number of different items but may be preferable to the DP algorithm when


Fixed-point arithmetic
In computing, a fixed-point number representation is a real data type for a number that has a fixed number of digits after the radix point...
), but if the problem requires





The "meet-in-the-middle" algorithm is as follows:
- Partition the set {1...n} into two sets A and B of approximately equal size
- Compute the weights and values of all subsets of each set.
- For each subset of A, find the "best matching" subset of B, i.e. the subset of B of greatest value such that the combined weight is less than W. Keep track of the greatest combined value seen so far.
The algorithm takes


Meet-in-the-middle attack
The meet-in-the-middle attack is a cryptographic attack which, like the birthday attack, makes use of a space-time tradeoff. While the birthday attack attempts to find two values in the domain of a function that map to the same value in its range, the meet-in-the-middle attack attempts to find a...
in cryptography, this improves on the

Greedy approximation algorithm
George DantzigGeorge Dantzig
George Bernard Dantzig was an American mathematical scientist who made important contributions to operations research, computer science, economics, and statistics....
proposed a greedy
Greedy algorithm
A greedy algorithm is any algorithm that follows the problem solving heuristic of making the locally optimal choice at each stagewith the hope of finding the global optimum....
approximation algorithm
Approximation algorithm
In computer science and operations research, approximation algorithms are algorithms used to find approximate solutions to optimization problems. Approximation algorithms are often associated with NP-hard problems; since it is unlikely that there can ever be efficient polynomial time exact...
to solve the unbounded knapsack problem. His version sorts the items in decreasing order of value per unit of weight,



Dominance relations in the UKP
Solving the unbounded knapsack problem can be made easier by throwing away items which will never be needed. For a given item i, suppose we could find a set of items J such that their total weight is less than the weight of i, and their total value is greater than the value of i. Then i cannot appear in the optimal solution, because we could always improve any potential solution containing i by replacing i with the set J. Therefore we can disregard the i-th item altogether. In such cases, J is said to dominate i. (Note that this does not apply to bounded knapsack problems, since we may have already used up the items in J.)Finding dominance relations allows us to significantly reduce the size of the search space. There are several different types of dominance relations, name ="poirriez et all 2009">Vincent Poirriez, Nicola Yanev, Rumen Andonov (2009) A Hybrid Algorithm for the Unbounded Knapsack Problem, section 2) Discrete Optimization http://dx.doi.org/10.1016/j.disopt.2008.09.004 which all satisfy an inequality of the form:



where



Collective dominance
The i-th item is collectively dominated by J, written as




Threshold dominance
The i-th item is threshold dominated by J, written as





defines the threshold of the item i, written


Multiple dominance
The i-th item is multiply dominated by a single item j, written as



i.e.

This dominance could be efficiently used during preprocessing because it can be detected relatively easily.
Modular dominance
Let b be the best item, i.e.
The i-th item is modularly dominated by a single item j, written as




Applications
Knapsack problems can be applied to real-world decision-making processes in a wide variety of fields, such as the finding the least wasteful cutting of raw materials, selection of capital investments and financial portfolios, selection of assets for asset-backed securitization, and generating keys for the Merkle–Hellman knapsack cryptosystem.One early application of knapsack algorithms was in the construction and scoring of tests in which the test-takers have a choice as to which questions they answer. On tests with a homogeneous
Homogeneity and heterogeneity
Homogeneity and heterogeneity are concepts relating to the uniformity or lack thereof in a substance. A material that is homogeneous is uniform in composition or character; one that is heterogeneous lacks uniformity in one of these qualities....
distribution of point values for each question, it is a fairly simple process to provide the test-takers with such a choice. For example, if an exam contains 12 questions each worth 10 points, the test-taker need only answer 10 questions to achieve a maximum possible score of 100 points. However, on tests with a heterogeneous distribution of point values—that is, when different questions or sections are worth different amounts of points— it is more difficult to provide choices. Feuerman and Weiss proposed a system in which students are given a heterogeneous test with a total of 125 possible points. The students are asked to answer all of the questions to the best of their abilities. Of the possible subsets of problems whose total point values add up to 100, a knapsack algorithm would determine which subset gives each student the highest possible score.
History
The knapsack problem has been studied for more than a century, with early works dating as far back as 1897. It is not known how the name "knapsack problem" originated, though the problem was referred to as such in the early works of mathematician Tobias DantzigTobias Dantzig
Tobias Dantzig was a Baltic German Russian American mathematician, the father of George Dantzig, and the author of NUMBER: The Language of Science and Aspects of Science .Born in Latvia, Dantzig studied mathematics with Henri Poincaré in Paris...
(1884–1956) , suggesting that the name could have existed in folklore before a mathematical problem had been fully defined.
The quadratic knapsack problem was first introduced by Gallo, Hammer, and Simeone in 1960.
A 1998 study of the Stony Brook University algorithms repository showed that, out of 75 algorithmic problems, the knapsack problem was the 18th most popular and the 4th most needed after kd-tree
Kd-tree
In computer science, a k-d tree is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a useful data structure for several applications, such as searches involving a multidimensional search key...
s, suffix tree
Suffix tree
In computer science, a suffix tree is a data structure that presents the suffixes of a given string in a way that allows for a particularly fast implementation of many important string operations.The suffix tree for a string S is a tree whose edges are labeled with strings, such that each suffix...
s, and the bin packing problem
Bin packing problem
In computational complexity theory, the bin packing problem is a combinatorial NP-hard problem. In it, objects of different volumes must be packed into a finite number of bins of capacity V in a way that minimizes the number of bins used....
.
See also
- List of knapsack problems
- Packing problemPacking problemPacking problems are a class of optimization problems in mathematics which involve attempting to pack objects together , as densely as possible. Many of these problems can be related to real life packaging, storage and transportation issues...
- Cutting stock problemCutting stock problemThe cutting-stock problem is an optimization problem, or more specifically, an integer linear programming problem. It arises from many applications in industry. Imagine that you work in a paper mill and you have a number of rolls of paper of fixed width waiting to be cut, yet different customers...
- Continuous knapsack problemContinuous knapsack problemThe continuous knapsack problem, also known as the fractional knapsack problem, is similar to the classic knapsack problem but in this problem fractions of an item can be put into the knapsack.The problem is as following:...
External links
- Lecture slides on the knapsack problem
- PYAsUKP: Yet Another solver for the Unbounded Knapsack Problem, with code taking advantage of the dominance relations in an hybride algorithm, benchmarks and downloadable copies of some papers.
- Home page of David Pisinger with downloadable copies of some papers on the publication list (including "Where are the hard knapsack problems?")
- Knapsack Problem solutions in many languages at Rosetta CodeRosetta CodeRosetta Code is a wiki-based programming chrestomathy website with solutions to various programming problems in many different programming languages. It was created in 2007 by Mike Mol. Rosetta Code includes 450 programming tasks, and covers 351 programming languages...
- Dynamic Programming algorithm to 0/1 Knapsack problem
- 0-1 Knapsack Problem in Python
- Interactive JavaScript branch-and-bound solver
- Solving 0-1-KNAPSACK with Genetic Algorithms in Ruby