MPS (format)
Encyclopedia
MPS is a file format for presenting and archiving linear programming
(LP) and mixed integer programming problems.
LP product and has emerged as a de facto standard ASCII
medium among most of the commercial LP solvers. Essentially all commercial LP solvers accept this format, and it is also accepted by the open-source COIN-OR
system. Other public domain software may require a customized reader routine in order to read MPS files. However with the acceptance of algebraic modeling language
s MPS usage has declined. For example, according to the NEOS server statistics in January 2011 less than 1% of submissions were in MPS form compared to 59.4% of AMPL
and 29.7% of GAMS
submissions.
MPS is column-oriented (as opposed to entering the model as equations), and all model components (variables, rows, etc.) receive names. MPS is an old format, so it is set up for punch cards: Fields start in column 2, 5, 15, 25, 40 and 50. Sections of an MPS file are marked by so-called header cards, which are distinguished by their starting in column 1. Although it is typical to use upper-case throughout the file for historical reasons, many MPS-readers will accept mixed-case for anything except the header cards, and some allow mixed-case anywhere. The names that you choose for the individual entities (constraints or variables) are not important to the solver; one should pick meaningful names, or easy names for a post-processing code to read.
NAME TESTPROB
ROWS
N COST
L LIM1
G LIM2
E MYEQN
COLUMNS
XONE COST 1 LIM1 1
XONE LIM2 1
YTWO COST 4 LIM1 1
YTWO MYEQN -1
ZTHREE COST 9 LIM2 1
ZTHREE MYEQN 1
RHS
RHS1 LIM1 5 LIM2 10
RHS1 MYEQN 7
BOUNDS
UP BND1 XONE 4
LO BND1 YTWO -1
UP BND1 YTWO 1
ENDATA
For comparison, here is the same model written out in an equation-oriented format:
Optimize
COST: XONE + 4*YTWO + 9*ZTHREE
Subject To
LIM1: XONE + YTWO <= 5
LIM2: XONE + ZTHREE >= 10
MYEQN: - YTWO + ZTHREE = 7
Bounds
XONE <= 4
-1 <= YTWO <= 1
End
Strangely, nothing in MPS format specifies the direction of optimization, and there is no standard "default" direction; some LP solvers will maximize if not instructed otherwise, others will minimize, and still others put safety first and have no default and require a selection somewhere in a control program or by a calling parameter. If the model is formulated for minimization and the solver requires maximization (or vice versa), it is easy to convert between the two by negating all coefficients. The optimal value of the objective function will then be the negative of the original optimal value, but the values of the variables themselves will be correct.
The COLUMNS section contains the entries of the A-matrix. All entries for a given column must be placed consecutively, although within a column the order of the entries (rows) is irrelevant. Rows not mentioned for a column are implied to have a coefficient of zero.
The RHS section allows one or more right-hand-side vectors to be defined; there is seldom more than one. In the above example, the name of the RHS vector is RHS1, and has non-zero values in all 3 of the constraint rows of the problem. Rows not mentioned in an RHS vector would be assumed to have a right-hand-side of zero.
The optional BOUNDS section specifies lower and upper bounds on individual variables, if they are not given by rows in the matrix. All the bounds that have a given name in column 5 are taken together as a set. Variables not mentioned in a given BOUNDS set are taken to be non-negative (lower bound zero, no upper bound). A bound of type UP means an upper bound is applied to the variable. A bound of type LO means a lower bound is applied. A bound type of FX ("fixed") means that the variable has upper and lower bounds equal to a single value. A bound type of FR ("free") means the variable has neither lower nor upper bounds and so can take on negative values. A variation on that is MI for free negative, giving an upper bound of 0 but no lower bound. Bound type PL is for a free positive for zero to plus infinity, but as this is the normal default, it is seldom used. There are also bound types for use in MIP models - BV for binary, being 0 or 1. UI for upper integer and LI for lower integer. SC stands for semi-continuous and indicates that the variable may be zero, but if not must be equal to at least the value given.
Another optional section called RANGES specifies double-inequalities, in a somewhat counterintuitive way not described here. Ways to mark integer variables are also beyond the scope of this article. The final card must be ENDATA (notice the odd spelling).
A few special cases of the MPS standard are not consistently handled by implementations. In the BOUNDS section, if a variable is given a nonpositive upper bound but no lower bound, its lower bound may default to zero or to minus inifinity. If an integer variable has no upper bound specified, its upper bound may default to one rather than to plus infinity.
.
Although some extensions are not standardized, the format is still in general use.
Linear programming
Linear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
(LP) and mixed integer programming problems.
Overview
The format was named after an early IBMIBM
International Business Machines Corporation or IBM is an American multinational technology and consulting corporation headquartered in Armonk, New York, United States. IBM manufactures and sells computer hardware and software, and it offers infrastructure, hosting and consulting services in areas...
LP product and has emerged as a de facto standard ASCII
ASCII
The American Standard Code for Information Interchange is a character-encoding scheme based on the ordering of the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that use text...
medium among most of the commercial LP solvers. Essentially all commercial LP solvers accept this format, and it is also accepted by the open-source COIN-OR
COIN-OR
COIN-OR, which stands for Computational Infrastructure for Operations Research, is a project that aims to "create for mathematical software what the open literature is for mathematical theory." The open literature provides the OR community with a peer-review process and an archive...
system. Other public domain software may require a customized reader routine in order to read MPS files. However with the acceptance of algebraic modeling language
Algebraic modeling language
Algebraic Modeling Languages are high-level computer programming languages for describing and solving high complexity problems for large scale mathematical computation...
s MPS usage has declined. For example, according to the NEOS server statistics in January 2011 less than 1% of submissions were in MPS form compared to 59.4% of AMPL
AMPL
AMPL, an acronym for "A Mathematical Programming Language", is an algebraic modeling language for describing and solving high-complexity problems for large-scale mathematical computation AMPL, an acronym for "A Mathematical Programming Language", is an algebraic modeling language for describing and...
and 29.7% of GAMS
General Algebraic Modeling System
The General Algebraic Modeling System is a high-level modeling system for mathematical optimization. GAMS is designed for modeling and solving linear, nonlinear, and mixed-integer optimization problems. The system is tailored for complex, large-scale modeling applications and allows the user to...
submissions.
MPS is column-oriented (as opposed to entering the model as equations), and all model components (variables, rows, etc.) receive names. MPS is an old format, so it is set up for punch cards: Fields start in column 2, 5, 15, 25, 40 and 50. Sections of an MPS file are marked by so-called header cards, which are distinguished by their starting in column 1. Although it is typical to use upper-case throughout the file for historical reasons, many MPS-readers will accept mixed-case for anything except the header cards, and some allow mixed-case anywhere. The names that you choose for the individual entities (constraints or variables) are not important to the solver; one should pick meaningful names, or easy names for a post-processing code to read.
MPS format
Here is a little sample model written in MPS format (explained in more detail below):NAME TESTPROB
ROWS
N COST
L LIM1
G LIM2
E MYEQN
COLUMNS
XONE COST 1 LIM1 1
XONE LIM2 1
YTWO COST 4 LIM1 1
YTWO MYEQN -1
ZTHREE COST 9 LIM2 1
ZTHREE MYEQN 1
RHS
RHS1 LIM1 5 LIM2 10
RHS1 MYEQN 7
BOUNDS
UP BND1 XONE 4
LO BND1 YTWO -1
UP BND1 YTWO 1
ENDATA
For comparison, here is the same model written out in an equation-oriented format:
Optimize
COST: XONE + 4*YTWO + 9*ZTHREE
Subject To
LIM1: XONE + YTWO <= 5
LIM2: XONE + ZTHREE >= 10
MYEQN: - YTWO + ZTHREE = 7
Bounds
XONE <= 4
-1 <= YTWO <= 1
End
Strangely, nothing in MPS format specifies the direction of optimization, and there is no standard "default" direction; some LP solvers will maximize if not instructed otherwise, others will minimize, and still others put safety first and have no default and require a selection somewhere in a control program or by a calling parameter. If the model is formulated for minimization and the solver requires maximization (or vice versa), it is easy to convert between the two by negating all coefficients. The optimal value of the objective function will then be the negative of the original optimal value, but the values of the variables themselves will be correct.
Variables
The NAME record can have any value, starting in column 15. The ROWS section defines the names of all the constraints; entries in column 2 or 3 are E for equality rows, L for less-than ( <= ) rows, G for greater-than ( >= ) rows, and N for non-constraining rows (the first of which would be interpreted as the objective function). The order of the rows named in this section is unimportant.The COLUMNS section contains the entries of the A-matrix. All entries for a given column must be placed consecutively, although within a column the order of the entries (rows) is irrelevant. Rows not mentioned for a column are implied to have a coefficient of zero.
The RHS section allows one or more right-hand-side vectors to be defined; there is seldom more than one. In the above example, the name of the RHS vector is RHS1, and has non-zero values in all 3 of the constraint rows of the problem. Rows not mentioned in an RHS vector would be assumed to have a right-hand-side of zero.
The optional BOUNDS section specifies lower and upper bounds on individual variables, if they are not given by rows in the matrix. All the bounds that have a given name in column 5 are taken together as a set. Variables not mentioned in a given BOUNDS set are taken to be non-negative (lower bound zero, no upper bound). A bound of type UP means an upper bound is applied to the variable. A bound of type LO means a lower bound is applied. A bound type of FX ("fixed") means that the variable has upper and lower bounds equal to a single value. A bound type of FR ("free") means the variable has neither lower nor upper bounds and so can take on negative values. A variation on that is MI for free negative, giving an upper bound of 0 but no lower bound. Bound type PL is for a free positive for zero to plus infinity, but as this is the normal default, it is seldom used. There are also bound types for use in MIP models - BV for binary, being 0 or 1. UI for upper integer and LI for lower integer. SC stands for semi-continuous and indicates that the variable may be zero, but if not must be equal to at least the value given.
Another optional section called RANGES specifies double-inequalities, in a somewhat counterintuitive way not described here. Ways to mark integer variables are also beyond the scope of this article. The final card must be ENDATA (notice the odd spelling).
A few special cases of the MPS standard are not consistently handled by implementations. In the BOUNDS section, if a variable is given a nonpositive upper bound but no lower bound, its lower bound may default to zero or to minus inifinity. If an integer variable has no upper bound specified, its upper bound may default to one rather than to plus infinity.
Limitations
MPS has many limitations. It does not specify the direction of optimization which is handled differently by solvers. Numeric fields have 12 characters width therefore limiting the precision. The representation is neither easy for human interpretation nor compact (although reserves column / row order information, which is often beneficial for LP solver behaviour reproducibility). One of the alternatives to MPS that does not have its limitations and is supported by most solvers is the nl file formatNl (format)
nl is a file format for presenting and archiving mathematical programming problems. It supports linear and nonlinear optimization problems as well as complementarity problems , in discrete or continuous variables...
.
Extensions
Many LP products include extensions to the MPS format. The free format MPS allows for long names and more accurate data by allowing fields to exceed the columns defined by the original standard, and apply whitespaces as separators instead of fixed column positions (note that this makes some MPS files that included whitespaces as part of names to be no longer valid). Some extensions include adding new kind of data to the MPS file (e.g. sections to include objective sense, integrality requirements, quadratic data or advanced MIP modelling constructs).Although some extensions are not standardized, the format is still in general use.
See also
- Linear programmingLinear programmingLinear programming is a mathematical method for determining a way to achieve the best outcome in a given mathematical model for some list of requirements represented as linear relationships...
- MPS file format - a description of the format by the authors of lp_solve
- nl (format)Nl (format)nl is a file format for presenting and archiving mathematical programming problems. It supports linear and nonlinear optimization problems as well as complementarity problems , in discrete or continuous variables...
- modern alternative to MPS - AIMMSAIMMSAIMMS is a software system designed for modeling and solving large-scale optimization and scheduling-type problems....
- AMPLAMPLAMPL, an acronym for "A Mathematical Programming Language", is an algebraic modeling language for describing and solving high-complexity problems for large-scale mathematical computation AMPL, an acronym for "A Mathematical Programming Language", is an algebraic modeling language for describing and...
- GAMSGeneral Algebraic Modeling SystemThe General Algebraic Modeling System is a high-level modeling system for mathematical optimization. GAMS is designed for modeling and solving linear, nonlinear, and mixed-integer optimization problems. The system is tailored for complex, large-scale modeling applications and allows the user to...
- OptimJOptimJOptimJ is an extension of the Java with language support for writing optimization models and abstractions for bulk data processing. OptimJ aims at providing a clear and concise algebraic notation for optimization modeling, removing compatibility barriers between optimization modeling and...
- a Java-based modeling language