World Wide Molecular Matrix
The World Wide Molecular Matrix (WWMM) is an electronic repository for unpublished chemical data. First proposed in 2002 by Peter Murray-Rust and his colleagues in the chemistry department at the University of Cambridge in the United Kingdom, the WWMM provides a free, easily searchable database for information about thousands of complicated molecules, data that would otherwise remain inaccessible to scientists.
Murray-Rust, a chemical informatics specialist, has estimated that 80% of the results produced by chemists around the world are never published in scientific journals. Most of this data is not ground-breaking, yet it could conceivably be of use to scientists doing related projects, if only they could access it. The WWMM was proposed as a solution to this problem. It would house the results of experiments on over 100,000 molecules in physical chemistry, organic chemistry, biochemistry and medicinal chemistry.
In other scientific fields, the need for a similar depository to house inaccessible information could be more acute. In a presentation at the "CERN Workshop on Innovations in Scholarly Communications (OAI4)", Murray-Rust said that chemistry actually leads other fields in published data. He estimated that as much as 99% of the data in some scientific fields never reaches publication.
Although scientific in nature, the WWMM is part of the broader open archives and open source movements, which push to make more and more information freely available to any user via the Internet or World Wide Web. In his CERN presentation, Murray-Rust stated that the WWMM was a "response to the expense of [scientific] journals," and he asked the rhetorical question, "Can we win the war to make data open, or will it be absorbed into the publishing and pseudo-publishing world?" Murray-Rust and his colleagues are also responsible for the development of the Chemical Markup Language (CML), an XML-based markup language intended for chemists.
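To give a sense of what XML-based chemical markup looks like in practice, the following is a minimal sketch in Python. The water-molecule document is an illustrative fragment written in the style of CML (elements such as `molecule`, `atomArray` and `bondArray`); it is not taken from the WWMM itself, and the helper name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace commonly used for CML documents (assumed here for illustration).
CML_NS = "http://www.xml-cml.org/schema"

# Illustrative CML-style description of water; not data from the WWMM.
WATER_CML = """\
<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O"/>
    <atom id="a2" elementType="H"/>
    <atom id="a3" elementType="H"/>
  </atomArray>
  <bondArray>
    <bond atomRefs2="a1 a2" order="1"/>
    <bond atomRefs2="a1 a3" order="1"/>
  </bondArray>
</molecule>
"""


def summarize(cml_text):
    """Return the molecule id, its element symbols, and its bonded atom pairs."""
    root = ET.fromstring(cml_text)
    ns = {"cml": CML_NS}
    # Collect the element symbol of every atom, in document order.
    elements = [atom.get("elementType")
                for atom in root.findall("cml:atomArray/cml:atom", ns)]
    # Each bond references the ids of the two atoms it connects.
    bonds = [tuple(bond.get("atomRefs2").split())
             for bond in root.findall("cml:bondArray/cml:bond", ns)]
    return {"id": root.get("id"), "elements": elements, "bonds": bonds}


if __name__ == "__main__":
    print(summarize(WATER_CML))
```

Because the markup is ordinary XML, a general-purpose XML parser is enough to read it, which is part of what makes such data easy to share and search.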
See also
- The Open Archives Initiative (OAI)
- The science of informatics
- Chemical Markup Language (CML)