Set cover problem

The set covering problem (SCP) is a classical question in combinatorics, computer science and complexity theory.

It is a problem "whose study has led to the development of fundamental techniques for the entire field" of approximation algorithms.[1] It is also one of Karp's 21 problems shown to be NP-complete in 1972.

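Given a set of elements $\mathcal{U}$ (called the universe) and a collection $\mathcal{S}$ of sets whose union equals the universe, the set cover problem is to identify the smallest sub-collection of $\mathcal{S}$ whose union equals the universe. For example, consider the universe $\mathcal{U} = \{1, 2, 3, 4, 5\}$ and the collection of sets $\mathcal{S} = \{\{1, 2, 3\}, \{2, 4\}, \{3, 4\}, \{4, 5\}\}$. Clearly the union of $\mathcal{S}$ is $\mathcal{U}$. However, we can cover all of the elements with the following, smaller number of sets: $\{\{1, 2, 3\}, \{4, 5\}\}$.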

More formally, given a universe $\mathcal{U}$ and a family $\mathcal{S}$ of subsets of $\mathcal{U}$, a cover is a subfamily $\mathcal{C} \subseteq \mathcal{S}$ of sets whose union is $\mathcal{U}$. In the set covering decision problem, the input is a pair $(\mathcal{U}, \mathcal{S})$ and an integer $k$; the question is whether there is a set covering of size $k$ or less. In the set covering optimization problem, the input is a pair $(\mathcal{U}, \mathcal{S})$, and the task is to find a set covering that uses the fewest sets.
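
The two problem versions can be illustrated with an exhaustive search. The following is a minimal sketch, not an efficient method: the function names and the small instance are chosen here for illustration only, and the running time is exponential in the number of sets.

```python
from itertools import combinations


def is_cover(universe, subfamily):
    """Return True if the union of the sets in `subfamily` equals `universe`."""
    covered = set()
    for s in subfamily:
        covered |= s
    return covered == universe


def has_cover_of_size(universe, family, k):
    """Decision version: is there a cover that uses at most k sets of `family`?"""
    for size in range(0, min(k, len(family)) + 1):
        for subfamily in combinations(family, size):
            if is_cover(universe, subfamily):
                return True
    return False


def minimum_cover(universe, family):
    """Optimization version: try sub-collections in order of increasing size."""
    for size in range(0, len(family) + 1):
        for subfamily in combinations(family, size):
            if is_cover(universe, subfamily):
                return list(subfamily)
    return None  # the family does not cover the universe


# Illustrative instance (an assumption made for this sketch).
universe = {1, 2, 3, 4, 5}
family = [frozenset({1, 2, 3}), frozenset({2, 4}),
          frozenset({3, 4}), frozenset({4, 5})]

print(has_cover_of_size(universe, family, 1))
print(has_cover_of_size(universe, family, 2))
print(minimum_cover(universe, family))
```

On this instance no single set covers the universe, while two sets suffice, so the decision answer changes between k = 1 and k = 2.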

The complexity of Set Cover Problem is where is the size of the universe and is the number of sets in the collection $\mathcal{S}$. The decision version of set covering is NP-complete, and the optimization version of set cover is NP-hard.

If each set is assigned a cost, it becomes a weighted set cover problem.

Integer linear program formulation

The minimum set cover problem can be formulated as the following integer linear program (ILP).[2]

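minimize $\sum_{S \in \mathcal{S}} x_S$ (minimize the number of sets)
subject to $\sum_{S \colon e \in S} x_S \geq 1$ for all $e \in \mathcal{U}$ (cover every element of the universe)
$x_S \in \{0, 1\}$ for all $S \in \mathcal{S}$. (every set is either in the set cover or not)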

This ILP belongs to the more general class of ILPs for covering problems. The integrality gap of this ILP is at most $\log n$, so its relaxation gives a factor-$\log n$ approximation algorithm for the minimum set cover problem (where $n$ is the size of the universe).[3]
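
As a worked illustration, the program can be written out for a small instance. The instance here is an assumption made for this sketch: universe $\{1, \dots, 5\}$ with sets $S_1 = \{1,2,3\}$, $S_2 = \{2,4\}$, $S_3 = \{3,4\}$, $S_4 = \{4,5\}$, and one 0/1 variable $x_i$ per set $S_i$.

```latex
\begin{align*}
\text{minimize}   \quad & x_1 + x_2 + x_3 + x_4 \\
\text{subject to} \quad & x_1 \geq 1             && \text{(element 1)} \\
                        & x_1 + x_2 \geq 1       && \text{(element 2)} \\
                        & x_1 + x_3 \geq 1       && \text{(element 3)} \\
                        & x_2 + x_3 + x_4 \geq 1 && \text{(element 4)} \\
                        & x_4 \geq 1             && \text{(element 5)} \\
                        & x_1, x_2, x_3, x_4 \in \{0, 1\}
\end{align*}
```

Elements 1 and 5 each lie in a single set, which forces $x_1 = x_4 = 1$; these two sets already satisfy every constraint, so the optimum is $x = (1, 0, 0, 1)$ with value 2. In this particular instance the relaxation $0 \leq x_i \leq 1$ has the same optimal value, which is not the case in general.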

Hitting set formulation

Set covering is equivalent to the hitting set problem. To see this, observe that an instance of set covering can be viewed as an arbitrary bipartite graph, with sets represented by vertices on the left, elements of the universe represented by vertices on the right, and edges representing the inclusion of elements in sets. The task is then to find a minimum-cardinality subset of left-vertices that covers all of the right-vertices. In the hitting set problem, the objective is to cover the left-vertices using a minimum subset of the right-vertices. Converting from one problem to the other is therefore achieved by interchanging the two sets of vertices.
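
The interchange of the two vertex sets can be sketched in code. This is a minimal illustration under assumptions: the set names "A" to "D", the helper names and the instance are hypothetical. Each element of the universe is mapped to the names of the sets that contain it; a choice of set names is then a set cover of the original instance exactly when it is a hitting set of the transposed one.

```python
def transpose(family):
    """Swap the roles of sets and elements.

    `family` maps a set name to its elements. The result maps each
    element to the names of the sets that contain it.
    """
    dual = {}
    for name, elements in family.items():
        for e in elements:
            dual.setdefault(e, set()).add(name)
    return dual


def is_set_cover(family, chosen_names):
    """True if the chosen sets together contain every element of the family."""
    universe = set().union(*family.values())
    covered = set()
    for name in chosen_names:
        covered |= family[name]
    return covered == universe


def is_hitting_set(collection, chosen):
    """True if `chosen` intersects every set in `collection`."""
    return all(chosen & members for members in collection.values())


# Hypothetical instance with named sets.
family = {"A": {1, 2, 3}, "B": {2, 4}, "C": {3, 4}, "D": {4, 5}}
dual = transpose(family)

for names in ({"A", "D"}, {"B", "C"}):
    print(sorted(names), is_set_cover(family, names), is_hitting_set(dual, names))
```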

Greedy algorithm

The greedy algorithm for set covering chooses sets according to one rule: at each stage, choose the set that contains the largest number of uncovered elements. It can be shown[4] that this algorithm achieves an approximation ratio of $H(n)$, where $n$ is the size of the set to be covered and $H(n)$ is the $n$-th harmonic number:

$H(n) = \sum_{k=1}^{n} \frac{1}{k} \leq \ln n + 1$

In fact, the greedy algorithm achieves an approximation ratio of $H(s')$, where $s'$ is the cardinality of the largest set in $\mathcal{S}$. For δ-dense instances, however, there exists a $c \ln n$-approximation algorithm for every $c > 0$.[5]
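
The greedy rule can be sketched as follows. This is an illustrative implementation rather than an optimized one: the function names and the instance are assumptions, ties are broken by position in the input list, and each round rescans every set.

```python
def greedy_set_cover(universe, family):
    """Repeatedly pick the set that contains the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(family, key=lambda s: len(uncovered & s), default=None)
        if best is None or not (uncovered & best):
            raise ValueError("the family does not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen


def harmonic(n):
    """n-th harmonic number H(n) = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


# Illustrative instance (an assumption made for this sketch).
universe = {1, 2, 3, 4, 5}
family = [frozenset({1, 2, 3}), frozenset({2, 4}),
          frozenset({3, 4}), frozenset({4, 5})]

cover = greedy_set_cover(universe, family)
print(cover)
print(harmonic(len(universe)))
```

Here the first round picks {1, 2, 3} (three uncovered elements) and the second picks {4, 5} (two uncovered elements), so on this instance the greedy cover happens to be optimal.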

Figure: Tight example for the greedy algorithm with $k = 3$.

There is a standard example on which the greedy algorithm achieves an approximation ratio of $\log_2(n)/2$. The universe consists of $n = 2^{k+1} - 2$ elements. The set system consists of $k$ pairwise disjoint sets $S_1, \ldots, S_k$ with sizes $2, 4, 8, \ldots, 2^k$ respectively, as well as two additional disjoint sets $T_0, T_1$, each of which contains half of the elements from each $S_i$. On this input, the greedy algorithm takes the sets $S_k, \ldots, S_1$, in that order, while the optimal solution consists only of $T_0$ and $T_1$. An example of such an input for $k = 3$ is shown in the figure.
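
The construction can be checked numerically. The sketch below builds the instance for a given k and runs a plain greedy routine on it. The function names and the numbering of elements are assumptions made for illustration; each block of consecutive integers plays the role of one of the k disjoint sets, and its two halves go to the two additional sets.

```python
def tight_instance(k):
    """Build the k disjoint sets of sizes 2, 4, ..., 2**k and the two half-sets."""
    s_sets = []
    t0, t1 = set(), set()
    next_element = 0
    for i in range(1, k + 1):
        block = list(range(next_element, next_element + 2 ** i))
        next_element += 2 ** i
        half = len(block) // 2
        s_sets.append(frozenset(block))
        t0.update(block[:half])   # first half of the block
        t1.update(block[half:])   # second half of the block
    universe = set(range(next_element))
    return universe, s_sets, frozenset(t0), frozenset(t1)


def greedy_set_cover(universe, family):
    """Repeatedly pick the set that contains the most uncovered elements."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        best = max(family, key=lambda s: len(uncovered & s))
        if not (uncovered & best):
            raise ValueError("the family does not cover the universe")
        chosen.append(best)
        uncovered -= best
    return chosen


universe, s_sets, t0, t1 = tight_instance(3)
chosen = greedy_set_cover(universe, s_sets + [t0, t1])
print(len(universe), [len(s) for s in chosen])
```

In every round the largest remaining disjoint set holds exactly one more uncovered element than either half-set, so greedy never selects a half-set and ends up with k sets where two would do.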

Inapproximability results show that the greedy algorithm is essentially the best-possible polynomial time approximation algorithm for set cover (see Inapproximability results below), under plausible complexity assumptions.

Low-frequency systems

If each element occurs in at most f sets, then a solution can be found in polynomial time that approximates the optimum to within a factor of f using LP relaxation.[6]
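
One standard way to obtain this guarantee is threshold rounding of the LP relaxation. The derivation below is a sketch under that assumption; the symbols $x^*$, $\mathcal{C}$, $\mathrm{LP}$ and $\mathrm{OPT}$ are introduced here for illustration. Let $x^*$ be an optimal solution of the relaxation in which each $x_S$ may take any value in $[0, 1]$, and select every set whose variable is at least $1/f$.

```latex
\begin{align*}
\mathcal{C} &= \{\, S \in \mathcal{S} : x^*_S \geq 1/f \,\} \\[4pt]
\text{Feasibility:}\quad
  & \sum_{S \colon e \in S} x^*_S \geq 1 \text{ with at most } f \text{ terms}
    \;\Longrightarrow\; \max_{S \colon e \in S} x^*_S \geq \frac{1}{f}
    \quad \text{for every } e \in \mathcal{U} \\[4pt]
\text{Cost:}\quad
  & |\mathcal{C}| \;\leq\; \sum_{S \in \mathcal{S}} f \, x^*_S
    \;=\; f \cdot \mathrm{LP} \;\leq\; f \cdot \mathrm{OPT}
\end{align*}
```

The feasibility line shows that every element lies in at least one selected set, and the cost line uses the fact that each selected set contributes at least $1/f$ to the LP objective.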

Inapproximability results

When $n$ refers to the size of the universe, Lund & Yannakakis (1994) showed that set covering cannot be approximated in polynomial time to within a factor of $\tfrac{1}{2}\log_2 n \approx 0.72 \ln n$, unless NP has quasi-polynomial time algorithms. Feige (1998) improved this lower bound to $(1 - o(1)) \cdot \ln n$ under the same assumptions, which essentially matches the approximation ratio achieved by the greedy algorithm. Raz & Safra (1997) established a lower bound of $c \cdot \ln n$, where $c$ is a constant, under the weaker assumption that P $\neq$ NP. A similar result with a higher value of $c$ was later proved by Alon, Moshkovitz & Safra (2006).

Notes

  1. Vazirani, V. V. (2001). Approximation Algorithms. Springer-Verlag.
  2. Vazirani (2001).
  3. Vazirani (2001).
  4. Chvátal, V. (1979). "A Greedy Heuristic for the Set-Covering Problem". Mathematics of Operations Research 4 (3): 233-235.
  5. Karpinski, M.; Zelikovsky, A. (1998). "Approximating dense cases of covering problems".
  6. Vazirani (2001).
