OpenFormula
Encyclopedia
OpenFormula is a draft open standard
for exchanging recalculated formulae in spreadsheet
s, as well as the name of the project to refine this specification. OpenFormula is a draft addition to the OpenDocument
standard (ISO/IEC 26300). OpenFormula was proposed and initially drafted by David A. Wheeler
.
s that are displayed on the screen (through its reuse of the MathML
standard). It is also fully capable of exchanging spreadsheet
data, formats, pivot tables, and other information typically included in a spreadsheet. OpenDocument can exchange spreadsheet formulae (formulae that are recalculated in the spreadsheet); formulae are exchanged as values of the attribute table:formula.
However, many believed that the syntax
and semantics
of table:formula was not defined in sufficient detail. Version 1.0 of the specification defines spreadsheet formulae using a set of simple examples which show, for example, how to specify ranges and the SUM function. Some critics argue that a more detailed, precise specification for spreadsheet functions, including syntax and semantics, should be created to augment these examples.
The OpenDocument committee argued that this was outside their scope at that time. They declared, "A comment was submitted concerning the (inclusion) of a grammar for spreadsheet formulae which conforming implementations should support. While we think that having interoperability on that level would be of great benefit to users, we do not believe that this is in the scope of the current specification. Especially since it is not specifically related to the actual XML format the specification describes. The TC will work on a solution concerning the documentation of interoperability standards that go beyond what is defined in the specification".
Others have argued that, while the specification is less specific than one might like, the intent is fairly clear (especially since formulae tend to follow decades-long traditions), and also because the vast majority of spreadsheets only use a small set of functions (such as SUM) which are universally supported by all spreadsheet implementations anyway. In practice, many developers look to OpenOffice.org as a "canonical implementation"; since its code is public for anyone to review, and its XML output can be inspected, this can resolve many questions.
, began drafting a specification for formulae; his first draft was released in February 2005. This began a process of discussion with various spreadsheet implementors and developers.
In October 2005, Wheeler publicly began an informal project, backed by the OpenDocument Fellowship, to create a draft formula specification, based on the initial draft and on discussions since that time with various implementors. By January 2006, the group had developed a lengthy specification, and implementors had begun changing their implementations to meet the draft specification.
formally created the formula subcommittee, naming Wheeler as the subcommittee chair. After discussion, the subcommittee agreed to use the OpenFormula project's document as their base document. Thus, by February 2006, OASIS had a draft formula specification with a detailed framework and over 100 functions defined.
's Brian Jones noted that OpenDocument did not define spreadsheet formulae in detail. However, at the time Microsoft's competing proprietary XML
format also did not include this kind of detailed specification for formulae.
Microsoft continued to protest that OpenDocument could not be used because it did not define a format for spreadsheet formulae, while its own specification continued to omit any specification about formulae until April 2006. In May 2006, Microsoft also began defining formulae in its XML format, 15 months after the first version of OpenFormula and three months after OASIS posted its first official draft of its specification.
The Office Open XML spreadsheet formula language is now part of the international standard
ISO/IEC IS 29500:2008.
Microsoft Office 2007
SP2 uses the formula language defined in ISO/IEC IS 29500:2008 for OpenDocument spreadsheets. Microsoft stated that they consider adding support for an official ODF formula language (OpenFormula), once a future version of the OpenDocument (ISO/IEC 26300) standard specification includes one.
It is expected that users will often request implementations that meet a particular group, based on their needs.
The document was issued as Part 2 of Version 1.2 of the Open Document Format (http://docs.oasis-open.org/office/v1.2/OpenDocument-v1.2-part2.html).
Note that many implementors are implementing the specification while it is being written, modifying their applications where necessary to comply with the draft standard. As most of the draft has remained unchanged for some time, many implementations already implement a majority of the draft standard.
Open standard
An open standard is a standard that is publicly available and has various rights to use associated with it, and may also have various properties of how it was designed . There is no single definition and interpretations vary with usage....
for exchanging recalculated formulae in spreadsheet
Spreadsheet
A spreadsheet is a computer application that simulates a paper accounting worksheet. It displays multiple cells usually in a two-dimensional matrix or grid consisting of rows and columns. Each cell contains alphanumeric text, numeric values or formulas...
s, as well as the name of the project to refine this specification. OpenFormula is a draft addition to the OpenDocument
OpenDocument
The Open Document Format for Office Applications is an XML-based file format for representing electronic documents such as spreadsheets, charts, presentations and word processing documents....
standard (ISO/IEC 26300). OpenFormula was proposed and initially drafted by David A. Wheeler
David A. Wheeler
David A. Wheeler is a computer scientist. He is best known for his work on Open source software/Free-libre software and Computer security.-Open Source Software:...
.
Discussion of need
OpenDocument 1.0 is a specification for the exchange of office documents, and is fully capable of describing mathematical formulaFormula
In mathematics, a formula is an entity constructed using the symbols and formation rules of a given logical language....
s that are displayed on the screen (through its reuse of the MathML
MathML
Mathematical Markup Language is an application of XML for describing mathematical notations and capturing both its structure and content. It aims at integrating mathematical formulae into World Wide Web pages and other documents...
standard). It is also fully capable of exchanging spreadsheet
Spreadsheet
A spreadsheet is a computer application that simulates a paper accounting worksheet. It displays multiple cells usually in a two-dimensional matrix or grid consisting of rows and columns. Each cell contains alphanumeric text, numeric values or formulas...
data, formats, pivot tables, and other information typically included in a spreadsheet. OpenDocument can exchange spreadsheet formulae (formulae that are recalculated in the spreadsheet); formulae are exchanged as values of the attribute table:formula.
However, many believed that the syntax
Syntax
In linguistics, syntax is the study of the principles and rules for constructing phrases and sentences in natural languages....
and semantics
Semantics
Semantics is the study of meaning. It focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata....
of table:formula was not defined in sufficient detail. Version 1.0 of the specification defines spreadsheet formulae using a set of simple examples which show, for example, how to specify ranges and the SUM function. Some critics argue that a more detailed, precise specification for spreadsheet functions, including syntax and semantics, should be created to augment these examples.
The OpenDocument committee argued that this was outside their scope at that time. They declared, "A comment was submitted concerning the (inclusion) of a grammar for spreadsheet formulae which conforming implementations should support. While we think that having interoperability on that level would be of great benefit to users, we do not believe that this is in the scope of the current specification. Especially since it is not specifically related to the actual XML format the specification describes. The TC will work on a solution concerning the documentation of interoperability standards that go beyond what is defined in the specification".
Others have argued that, while the specification is less specific than one might like, the intent is fairly clear (especially since formulae tend to follow decades-long traditions), and also because the vast majority of spreadsheets only use a small set of functions (such as SUM) which are universally supported by all spreadsheet implementations anyway. In practice, many developers look to OpenOffice.org as a "canonical implementation"; since its code is public for anyone to review, and its XML output can be inspected, this can resolve many questions.
Project
One of the external commentors on OpenDocument, David A. WheelerDavid A. Wheeler
David A. Wheeler is a computer scientist. He is best known for his work on Open source software/Free-libre software and Computer security.-Open Source Software:...
, began drafting a specification for formulae; his first draft was released in February 2005. This began a process of discussion with various spreadsheet implementors and developers.
In October 2005, Wheeler publicly began an informal project, backed by the OpenDocument Fellowship, to create a draft formula specification, based on the initial draft and on discussions since that time with various implementors. By January 2006, the group had developed a lengthy specification, and implementors had begun changing their implementations to meet the draft specification.
OASIS Formula subcommittee
In February 2006, OASISOasis
In geography, an oasis or cienega is an isolated area of vegetation in a desert, typically surrounding a spring or similar water source...
formally created the formula subcommittee, naming Wheeler as the subcommittee chair. After discussion, the subcommittee agreed to use the OpenFormula project's document as their base document. Thus, by February 2006, OASIS had a draft formula specification with a detailed framework and over 100 functions defined.
Microsoft response
In 2005, MicrosoftMicrosoft
Microsoft Corporation is an American public multinational corporation headquartered in Redmond, Washington, USA that develops, manufactures, licenses, and supports a wide range of products and services predominantly related to computing through its various product divisions...
's Brian Jones noted that OpenDocument did not define spreadsheet formulae in detail. However, at the time Microsoft's competing proprietary XML
XML
Extensible Markup Language is a set of rules for encoding documents in machine-readable form. It is defined in the XML 1.0 Specification produced by the W3C, and several other related specifications, all gratis open standards....
format also did not include this kind of detailed specification for formulae.
Microsoft continued to protest that OpenDocument could not be used because it did not define a format for spreadsheet formulae, while its own specification continued to omit any specification about formulae until April 2006. In May 2006, Microsoft also began defining formulae in its XML format, 15 months after the first version of OpenFormula and three months after OASIS posted its first official draft of its specification.
The Office Open XML spreadsheet formula language is now part of the international standard
International standard
International standards are standards developed by international standards organizations. International standards are available for consideration and use, worldwide...
ISO/IEC IS 29500:2008.
Microsoft Office 2007
Microsoft Office 2007
Microsoft Office 2007 is a Windows version of the Microsoft Office System, Microsoft's productivity suite. Formerly known as Office 12 in the initial stages of its beta cycle, it was released to volume license customers on November 30, 2006 and made available to retail customers on January 30, 2007...
SP2 uses the formula language defined in ISO/IEC IS 29500:2008 for OpenDocument spreadsheets. Microsoft stated that they consider adding support for an official ODF formula language (OpenFormula), once a future version of the OpenDocument (ISO/IEC 26300) standard specification includes one.
OpenFormula attributes
Key attributes of the OpenFormula specification and development process, many of which are unique to OpenFormula as a recalculated formula format, are:- Developed by many different implementors. OpenFormula is being developed by representatives from many different implementors, working together, including OpenOffice.orgOpenOffice.orgOpenOffice.org, commonly known as OOo or OpenOffice, is an open-source application suite whose main components are for word processing, spreadsheets, presentations, graphics, and databases. OpenOffice is available for a number of different computer operating systems, is distributed as free software...
and Sun StarOfficeStarOfficeStarOffice, known briefly as Oracle Open Office before its discontinuation in 2010, is a proprietary office suite. It was originally developed by StarDivision which was acquired by Sun Microsystems in 1999...
(Eike Rathke), KDE KOfficeKOfficeCalligra Suite is a graphic art and office suite by KDE created from KOffice in 2010. It is available for desktop PCs, tablet computers, and smartphones...
(David Faure and Tomas Mecir), GnumericGnumericGnumeric is a spreadsheet program that is part of the GNOME Free Software Desktop Project. Gnumeric version 1.0 was released December 31, 2001. Gnumeric is distributed as free software under the GNU GPL license; it is intended to replace proprietary and other spreadsheet programs such as Microsoft...
(Dr. Andreas J. Guelzow and Jody Goldberg), IBM/Lotus 1-2-3Lotus 1-2-3Lotus 1-2-3 is a spreadsheet program from Lotus Software . It was the IBM PC's first "killer application"; its huge popularity in the mid-1980s contributed significantly to the success of the IBM PC in the corporate environment.-Beginnings:...
(Rob Weir), and wikiCalcWikiCalcwikiCalc is a web application, created by Dan Bricklin, that allows for the creation and editing of spreadsheets through a wiki-style user-editable interface...
(Dan Bricklin, co-creator of the spreadsheet). - Developed with experienced users. Many experienced users (such as Tom Metcalf, a scientist specializing in the astrophysicsAstrophysicsAstrophysics is the branch of astronomy that deals with the physics of the universe, including the physical properties of celestial objects, as well as their interactions and behavior...
of the SunSunThe Sun is the star at the center of the Solar System. It is almost perfectly spherical and consists of hot plasma interwoven with magnetic fields...
) take part. The group includes several mathematicians, both users and developers. - Open development. The discussions of the group, and weekly drafts, are available to the public.
- Fully open standard The specification meets all widely-accepted definitions of being an "open standard", including those by Bruce PerensBruce PerensBruce Perens is a computer programmer and advocate in the open source community. He created the Open Source Definition and published the first formal announcement and manifesto of open source. He co-founded the Open Source Initiative with Eric S...
and the European UnionEuropean UnionThe European Union is an economic and political union of 27 independent member states which are located primarily in Europe. The EU traces its origins from the European Coal and Steel Community and the European Economic Community , formed by six countries in 1958...
. For example, (1) both open source software and proprietary software can implement it, and (2) the work is based on consensus, not domination by any single supplier. - Implementors are already implementing it. Implementors have already made changes to their applications due to the work of this body, such as changing how they handle signed values in MOD, the association of exponentiation, and even implementing new functions to conform to the draft standard.
- Focused development. The subcommittee is a large group focused specifically on spreadsheet formulae, and nothing else.
- Not rushed. OpenFormula is based on specification work that was first released on 2005-02-26, as well as a large body of research into different applications.
- Future-proofed format The syntax has been carefully designed to work indefinitely into the future. For example, it allows an arbitrary number of columns, while also allowing arbitrary names of values.
- Embedded test cases. OpenFormula includes a large number of test cases, ones that test and demonstrate the specification including "edge cases" that people often forget. More importantly, they are specially formatted so they can be automatically extracted and placed in a test spreadsheet to test applications. Rob Weir reports that, "This gives us a self-testing specification, a great labor savings, as well as a demonstration of the innovative things you can do with ODF (OpenDocument format)."
- Rigorous definitions The test cases (noted previously) help it be far more rigorous. In addition, OpenFormula defines the types for each function (as prototypes of each function). Function definitions are examined deeply, e.g., YEARFRAC has subtle behavior in the leap years, which were carefully examined and defined.
- Doesn't mandate mistakes. The specification is carefully written to not require certain bugs, just because someone has a bug. For example, Excel incorrectly believes that 1900 was a leap year, and at least draft version 1.3 of the Excel specification claims that compatible applications must make the same mistake, and requires that applications cannot be more capable than Excel by supporting dates before 1900. By comparing many different independent implementations, the OpenFormula group can often detect when an application makes a mistake, and ensure that applications are not overly restricted.
- Innovations from many sources. OpenFormula covers the functions of Excel and OpenOffice.org, plus important functions not found in either one but instead found in other spreadsheet applications, such as Gnumeric and KSpread. For example, the specification includes the functions DECIMAL and BASE, which are much better ways to handle different bases than the old BIN2DEC (etc.) functions. It also includes bit operations like BITAND. These sources include Excel, OpenOffice.org Calc, Sun StarOffice Calc, KDE KOffice Kspread, GNOME Gnumeric, IBM/Lotus 1-2-3, Corel Word Perfect Suite Quattro Pro, wikiCalc, and DocumentToGo's SheetToGo. The subcommittee argues that by including the innovations from around the world of many different independent applications, they produce a better result that is far more inclusive.
- Room for innovation by anyone. Application-specific "namespaces" are defined for functions. This allows spreadsheet applications to add new functions, without interfering with current standard functions, future standard functions, or functions defined by other applications. As a result, different applications can add new functions without interfering with others; once a consensus arises about the new function, it can be standardized. The namespace is based on the Internet's naming service (reversed domain names), so ORG.OPENOFFICE.STYLE would be an OpenOffice.org-unique function.
- Internationalization. The specification does not assume that everyone uses "." as the decimal point, and indeed does not constrain user interfaces at all. Named expressions can have names in local character sets.
- Subset support. Applications can implement a subset or superset. To prevent user confusion, various "groups" are defined so that users can request specific sets of capabilities.
OpenFormula groups
One important aspect of OpenFormula is that it provides a predefined set of "groups"; the most important of these groups are small, medium, and large:- The small group includes a little over 100 functions, including functions for trigonometry, database, finance, and statistics. The vast majority of spreadsheet documents are ably handled by applications that implement the "small" group. At least one PDA application (SheetToGo) has this level of capability, and wikiCalc added the functions in the small group specifically to meet the set defined by OpenFormula.
- The medium group includes all the capabilities of the small group, and adds about 100 more functions.
- The large group includes all the capabilities of the medium group, adding around 130 more functions, as well as capabilities such as complex numbers.
It is expected that users will often request implementations that meet a particular group, based on their needs.
Expected completion time
OpenFormula has exceeded several time frames from 2006, however in June 2007 it was announced that four tasks remain before submission to the quality assurance review.The document was issued as Part 2 of Version 1.2 of the Open Document Format (http://docs.oasis-open.org/office/v1.2/OpenDocument-v1.2-part2.html).
Note that many implementors are implementing the specification while it is being written, modifying their applications where necessary to comply with the draft standard. As most of the draft has remained unchanged for some time, many implementations already implement a majority of the draft standard.
External links
- About OpenFormula, a summary on the OASIS Wiki site
- OASIS OpenDocument Formula subcommitee, website of the subcommittee developing the specification
- OpenDocument - Formula Public Documents