David Shmoys
Encyclopedia
David Shmoys is currently a Professor in both the School of Operations Research and Information Engineering
http://www.orie.cornell.edu/ and the Department of Computer Science
http://www.cs.cornell.edu/ at Cornell University
. He obtained his Ph.D. from the University of California, Berkeley
in 1984. His major focus has been in the design and analysis of algorithms
for discrete optimization problems.
In particular, his work has highlighted the role of linear programming
in the design of approximation algorithms for NP-hard
problems. He is known for his pioneering research on providing first constant factor performance guarantee for several scheduling and clustering problems including the k-center and k-median problems and the generalized assignment problem. Polynomial-time approximation scheme
s that he developed for scheduling
problems have found applications in many subsequent works. His current research includes stochastic optimization
, computational sustainability and optimization techniques in computational biology. Shmoys is married to Éva Tardos
, who is a Jacob Gould Schurman Professor of Computer Science at Cornell University.
(1) Constant factor approximation algorithm for the Generalized Assignment Problem
and Unrelated Parallel Machine Scheduling.
(2) Constant factor approximation algorithm for k-Medians and Facility location problem.
Both these contributions are described briefly below:
is a joint work by David Shmoys and Eva Tardos.
The generalized assignment problem can be viewed as the following problem of scheduling unrelated parallel machine with costs.
Each of independent jobs (denoted ) have to be processed by exactly one of unrelated parallel machines (denoted ). Unrelated implies same job might take different amount of processing time on different machines. Job takes time units when processed by machine and incurs a cost . Given and , we wish to decide if there exists a schedule with total cost at most such that for each machine its load, the total processing time required for the jobs assigned to it is at most . By scaling the processing times, we can assume, with out loss of generality, that the machine load bounds satisfy . ``In other words, generalized assignment problem is to find a schedule of minimum cost subject to the constraint that the makespan, that the maximum machine load is at most ".
The work of Shmoys with Lenstra
and Tardos cited here
gives a 2 approximation algorithm for the unit cost case. The algorithm is based on a clever design of linear program using parametric pruning and then rounding an extreme point solution of the linear program deterministically. Algorithm for the
generalized assignment problem is based on a similar LP through parametric pruning and then using a new rounding technique on a carefully designed bipartite graph. We now state the LP formulation and briefly describe the rounding technique.
We guess the optimum value of makespan and write the following LP. This technique is known as parametric pruning.
;
The obtained LP solution is then rounded to an integral solution as follows. A weighted bipartite graph
is constructed. One side of the bipartite graph contains the job nodes, , and the other side contains several copies of each machine node,
, where . To construct the edges to machine nodes corresponding to say machine , first jobs are arranged in decreasing order of processing time . For simplicity, suppose, . Now find the minimum index , such that
. Include in all the edges with nonzero and set their weights to . Create the edge and set its weight to . This ensures that the total weight of the edges incident on the vertex is at most 1. If , then create an edge with weight . Continue with assigning edges to in a similar way.
In the bipartite graph thus created, each job node in has a total edge weight of 1 incident on it, and each machine node in has edges with total weight at most 1 incident on it. Thus the vector is an instance of a fractional matching on and thus it can be rounded to obtain an integral matching of same cost.
Now considering the ordering of processing times of the jobs on the machines nodes during construction of and using an easy charging argument, the following theorem can be proved:
Theorem: If has a feasible solution then a schedule can be constructed with makespan
and cost .
Since , a 2 approximation is obtained.
is a joint work by Moses Charikar, Sudipto Guha, Eva Tardos
and David Shmoys. They obtain a approximation to the metric k medians problem. This was the first paper to break the previously best known approximation.
Shmoys has also worked extensively on the facility location
problem. His recent results include obtaining a approximation algorithm for the capacitated facility location problem. The joint work with Fabian Chudak,
has resulted in improving the previous known approximation for the same problem. Their algorithm is based on a variant of randomized rounding
called the randomized rounding with a backup, since a backup solution is incorporated to correct for the fact that the ordinary randomized rounding rarely generates a feasible solution to the associated set covering problem.
For the uncapacitated version of the facility location problem, again in a joint work with Chudak
he obtained a -approximation algorithm which was a significant improvement on the previously known approximation guarantees.
The improved algorithm works by rounding an optimal fractional solution of a linear programming relaxation and using the properties of the optimal solutions of the linear program and a generalization of a decomposition technique.
Cornell University College of Engineering
The College of Engineering is a division of Cornell University that was founded in 1870 as the Sibley College of Mechanical Engineering and Mechanic Arts...
http://www.orie.cornell.edu/ and the Department of Computer Science
Cornell University College of Arts and Sciences
The College of Arts and Sciences is a division of Cornell University. It has been part of the university since its founding, although its name has changed over time. It grants bachelors degrees, and masters and doctorates through affiliation with the Cornell University Graduate School...
http://www.cs.cornell.edu/ at Cornell University
Cornell University
Cornell University is an Ivy League university located in Ithaca, New York, United States. It is a private land-grant university, receiving annual funding from the State of New York for certain educational missions...
. He obtained his Ph.D. from the University of California, Berkeley
University of California, Berkeley
The University of California, Berkeley , is a teaching and research university established in 1868 and located in Berkeley, California, USA...
in 1984. His major focus has been in the design and analysis of algorithms
Analysis of algorithms
To analyze an algorithm is to determine the amount of resources necessary to execute it. Most algorithms are designed to work with inputs of arbitrary length...
for discrete optimization problems.
In particular, his work has highlighted the role of linear programming
Linear programming
Linear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
in the design of approximation algorithms for NP-hard
NP-hard
NP-hard , in computational complexity theory, is a class of problems that are, informally, "at least as hard as the hardest problems in NP". A problem H is NP-hard if and only if there is an NP-complete problem L that is polynomial time Turing-reducible to H...
problems. He is known for his pioneering research on providing first constant factor performance guarantee for several scheduling and clustering problems including the k-center and k-median problems and the generalized assignment problem. Polynomial-time approximation scheme
Polynomial-time approximation scheme
In computer science, a polynomial-time approximation scheme is a type of approximation algorithm for optimization problems ....
s that he developed for scheduling
Scheduling (computing)
In computer science, a scheduling is the method by which threads, processes or data flows are given access to system resources . This is usually done to load balance a system effectively or achieve a target quality of service...
problems have found applications in many subsequent works. His current research includes stochastic optimization
Optimization (mathematics)
In mathematics, computational science, or management science, mathematical optimization refers to the selection of a best element from some set of available alternatives....
, computational sustainability and optimization techniques in computational biology. Shmoys is married to Éva Tardos
Éva Tardos
Éva Tardos is a Hungarian mathematician, winner of the Fulkerson Prize , and professor of Computer Science at Cornell University.Research Interests:...
, who is a Jacob Gould Schurman Professor of Computer Science at Cornell University.
Key Contributions
Two of his key contributions are(1) Constant factor approximation algorithm for the Generalized Assignment Problem
Generalized assignment problem
In applied mathematics, the maximum generalized assignment problem is a problem in combinatorial optimization. This problem is a generalization of the assignment problem in which both tasks and agents have a size...
and Unrelated Parallel Machine Scheduling.
(2) Constant factor approximation algorithm for k-Medians and Facility location problem.
Both these contributions are described briefly below:
Generalized Assignment Problem & Unrelated Parallel Machine Scheduling
The paperis a joint work by David Shmoys and Eva Tardos.
The generalized assignment problem can be viewed as the following problem of scheduling unrelated parallel machine with costs.
Each of independent jobs (denoted ) have to be processed by exactly one of unrelated parallel machines (denoted ). Unrelated implies same job might take different amount of processing time on different machines. Job takes time units when processed by machine and incurs a cost . Given and , we wish to decide if there exists a schedule with total cost at most such that for each machine its load, the total processing time required for the jobs assigned to it is at most . By scaling the processing times, we can assume, with out loss of generality, that the machine load bounds satisfy . ``In other words, generalized assignment problem is to find a schedule of minimum cost subject to the constraint that the makespan, that the maximum machine load is at most ".
The work of Shmoys with Lenstra
Jan Karel Lenstra
Jan Karel Lenstra is a Dutch mathematician and operations researcher, known for his work on scheduling algorithms, local search, and the travelling salesman problem....
and Tardos cited here
gives a 2 approximation algorithm for the unit cost case. The algorithm is based on a clever design of linear program using parametric pruning and then rounding an extreme point solution of the linear program deterministically. Algorithm for the
generalized assignment problem is based on a similar LP through parametric pruning and then using a new rounding technique on a carefully designed bipartite graph. We now state the LP formulation and briefly describe the rounding technique.
We guess the optimum value of makespan and write the following LP. This technique is known as parametric pruning.
;
-
- ;
- ;
- ;
- ;
The obtained LP solution is then rounded to an integral solution as follows. A weighted bipartite graph
is constructed. One side of the bipartite graph contains the job nodes, , and the other side contains several copies of each machine node,
, where . To construct the edges to machine nodes corresponding to say machine , first jobs are arranged in decreasing order of processing time . For simplicity, suppose, . Now find the minimum index , such that
. Include in all the edges with nonzero and set their weights to . Create the edge and set its weight to . This ensures that the total weight of the edges incident on the vertex is at most 1. If , then create an edge with weight . Continue with assigning edges to in a similar way.
In the bipartite graph thus created, each job node in has a total edge weight of 1 incident on it, and each machine node in has edges with total weight at most 1 incident on it. Thus the vector is an instance of a fractional matching on and thus it can be rounded to obtain an integral matching of same cost.
Now considering the ordering of processing times of the jobs on the machines nodes during construction of and using an easy charging argument, the following theorem can be proved:
Theorem: If has a feasible solution then a schedule can be constructed with makespan
and cost .
Since , a 2 approximation is obtained.
K-medians and Facility Location Problem
The paperis a joint work by Moses Charikar, Sudipto Guha, Eva Tardos
Éva Tardos
Éva Tardos is a Hungarian mathematician, winner of the Fulkerson Prize , and professor of Computer Science at Cornell University.Research Interests:...
and David Shmoys. They obtain a approximation to the metric k medians problem. This was the first paper to break the previously best known approximation.
Shmoys has also worked extensively on the facility location
Facility location
Facility location, also known as location analysis, is a branch of operations research and computational geometry concerning itself with mathematical modeling and solution of problems concerning optimal placement of facilities in order to minimize transportation costs, avoid placing hazardous...
problem. His recent results include obtaining a approximation algorithm for the capacitated facility location problem. The joint work with Fabian Chudak,
has resulted in improving the previous known approximation for the same problem. Their algorithm is based on a variant of randomized rounding
Randomized rounding
Within computer science and operations research,many combinatorial optimization problems are computationally intractable to solve exactly ....
called the randomized rounding with a backup, since a backup solution is incorporated to correct for the fact that the ordinary randomized rounding rarely generates a feasible solution to the associated set covering problem.
For the uncapacitated version of the facility location problem, again in a joint work with Chudak
he obtained a -approximation algorithm which was a significant improvement on the previously known approximation guarantees.
The improved algorithm works by rounding an optimal fractional solution of a linear programming relaxation and using the properties of the optimal solutions of the linear program and a generalization of a decomposition technique.