Molecular descriptor
Molecular descriptors play a fundamental role in chemistry, pharmaceutical sciences, environmental protection policy, and health research, as well as in quality control. They are the means by which molecules, thought of as real bodies, are transformed into numbers, allowing mathematical treatment of the chemical information contained in the molecule. Todeschini and Consonni define the molecular descriptor as follows:
"The molecular descriptor is the final result of a logic and mathematical procedure which transforms chemical information encoded within a symbolic representation of a molecule into a useful number or the result of some standardized experiment."
By this definition, molecular descriptors are divided into two main categories: experimental measurements, such as log P, molar refractivity, dipole moment, polarizability and, in general, physico-chemical properties; and theoretical molecular descriptors, which are derived from a symbolic representation of the molecule and can be further classified according to the type of molecular representation.
The main classes of theoretical molecular descriptors are: 0D-descriptors (i.e. constitutional descriptors, count descriptors), 1D-descriptors (i.e. lists of structural fragments, fingerprints), 2D-descriptors (i.e. graph invariants), 3D-descriptors (such as 3D-MoRSE descriptors, WHIM descriptors, GETAWAY descriptors, quantum-chemical descriptors, and size, steric, surface and volume descriptors), and 4D-descriptors (such as those derived from GRID or CoMFA methods, or VolSurf).
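As an illustration of the simplest class, the following minimal Python sketch computes a few 0D (constitutional) descriptors directly from a molecular formula. The function names, the restriction to flat formulas without brackets, and the small table of approximate atomic weights are assumptions made for this example only; they are not taken from any particular descriptor package.

```python
import re

# Approximate standard atomic weights for a small, illustrative subset of elements.
ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999}

def parse_formula(formula):
    """Return {element: count} for a flat formula such as 'C2H6O' (no brackets)."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts

def constitutional_descriptors(formula):
    """Compute simple 0D descriptors that need only the atom composition."""
    counts = parse_formula(formula)
    return {
        "atom_count": sum(counts.values()),
        "heavy_atom_count": sum(n for el, n in counts.items() if el != "H"),
        "molecular_weight": sum(ATOMIC_WEIGHT[el] * n for el, n in counts.items()),
    }

ethanol = constitutional_descriptors("C2H6O")
print(ethanol)
```

Because these values depend only on composition, they carry no information about connectivity or geometry; dimethyl ether, which shares the formula C2H6O, would receive exactly the same values.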
Invariance properties of molecular descriptors
The invariance properties of molecular descriptors can be defined as the ability of the algorithm for their calculation to give a descriptor value that is independent of the particular characteristics of the molecular representation, such as atom numbering or labeling, spatial reference frame, molecular conformations, etc. Invariance to molecular numbering or labeling is assumed as a minimal basic requirement for any descriptor.
Two other important invariance properties, translational invariance and rotational invariance, are the invariance of a descriptor value to any translation or rotation of the molecules in the chosen reference frame. These last invariance properties are required for the 3D-descriptors.
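These invariance properties can be checked numerically. The sketch below is only an illustration: it uses the Wiener index (the sum of shortest-path lengths between all atom pairs) as an example of a graph invariant, and an unweighted radius of gyration as an example of a simple 3D size descriptor. The helper names and the atomic coordinates are hypothetical and chosen for this example.

```python
import math
from collections import deque

def wiener_index(adjacency):
    """Sum of shortest-path lengths over all atom pairs of a connected graph."""
    total = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from each atom
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted twice

def relabel(adjacency, mapping):
    """Renumber the atoms of a molecular graph according to `mapping`."""
    return {mapping[u]: [mapping[v] for v in nbrs] for u, nbrs in adjacency.items()}

def radius_of_gyration(coords):
    """Unweighted root-mean-square distance of the atoms from their centroid."""
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)

def rotate_z_and_translate(coords, angle, shift):
    """Rotate about the z axis by `angle` (radians), then translate by `shift`."""
    ca, sa = math.cos(angle), math.sin(angle)
    return [(ca * x - sa * y + shift[0], sa * x + ca * y + shift[1], z + shift[2])
            for x, y, z in coords]

# Carbon skeleton of n-butane, before and after renumbering the atoms.
butane = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
renumbered = relabel(butane, {0: 2, 1: 0, 2: 3, 3: 1})
print(wiener_index(butane), wiener_index(renumbered))

# Hypothetical coordinates, before and after a rigid motion.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (3.5, 1.4, 0.3)]
moved = rotate_z_and_translate(coords, 0.7, (5.0, -2.0, 1.0))
print(radius_of_gyration(coords), radius_of_gyration(moved))
```

Both pairs of values agree (the second pair up to floating-point rounding). A descriptor built directly from raw x, y, z coordinates, by contrast, would change under the same rigid motion.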
Degeneracy of molecular descriptors
This property refers to the ability of a descriptor to avoid equal values for different molecules. In this sense, descriptors can show no degeneracy at all, or low, intermediate, or high degeneracy.
For example, the number of atoms in a molecule and the molecular weight are high-degeneracy descriptors, while 3D-descriptors usually show low degeneracy or none at all.
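Degeneracy can be made concrete on a small set of isomers. The following sketch compares the carbon count with the first Zagreb index (the sum of squared vertex degrees of the molecular graph) for the carbon skeletons of the three C5H12 isomers. The `degenerate_fraction` measure is an assumption introduced here for illustration, not a standard definition.

```python
from collections import Counter

def first_zagreb_index(adjacency):
    """Sum of squared vertex degrees of the molecular graph."""
    return sum(len(neighbours) ** 2 for neighbours in adjacency.values())

def degenerate_fraction(values):
    """Fraction of molecules whose descriptor value is shared with another molecule."""
    counts = Counter(values)
    return sum(n for n in counts.values() if n > 1) / len(values)

# Hydrogen-suppressed graphs (carbon skeletons) of the C5H12 isomers.
isomers = {
    "pentane":    {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]},
    "isopentane": {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]},
    "neopentane": {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]},
}

carbon_counts = [len(graph) for graph in isomers.values()]
zagreb_values = [first_zagreb_index(graph) for graph in isomers.values()]

print(carbon_counts, degenerate_fraction(carbon_counts))
print(zagreb_values, degenerate_fraction(zagreb_values))
```

On this set the carbon count is fully degenerate, while the graph-based index separates all three isomers. This holds only for this small example; over larger sets of molecules simple topological indices also show degeneracy.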
Basic requirements for optimal descriptors
1 Should have structural interpretation
2 Should have good correlation with at least one property
3 Should preferably discriminate among isomers
4 Should be possible to apply to local structure
5 Should be possible to generalize to "higher" descriptors
6 Should be simple
7 Should not be based on experimental properties
8 Should not be trivially related to other descriptors
9 Should be possible to construct efficiently
10 Should use familiar structural concepts
11 Should change gradually with gradual change in structures
12 Should have the correct size dependence, if related to the molecule size
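Requirement 8 can be screened for with a simple correlation check. The sketch below is one possible approach, not a prescribed test: it computes the Pearson correlation between two descriptors over the n-alkanes CnH(2n+2), for which both the atom count and the molecular weight follow directly from the formula. The atomic weights are approximate, and any cut-off used to call two descriptors "trivially related" would be a choice of the analyst.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# n-Alkanes from methane (n = 1) to octane (n = 8): formula CnH(2n+2).
carbons = range(1, 9)
atom_count = [3 * n + 2 for n in carbons]                                # n C + (2n+2) H
molecular_weight = [12.011 * n + 1.008 * (2 * n + 2) for n in carbons]  # approximate weights

r = pearson(atom_count, molecular_weight)
print(r)
```

Within this series both descriptors are linear functions of the number of carbon atoms, so the correlation is essentially perfect and one of them adds no information beyond the other.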
See also
- Mathematical chemistry
- QSAR
- Topological index
- Chemical database
- Docking (molecular)
- Cahn-Ingold-Prelog priority rule