Data mapping
Data mapping is the process of creating data element mappings between two distinct data models. Data mapping is used as a first step for a wide variety of data integration tasks including:
- Data transformation or data mediation between a data source and a destination
- Identification of data relationships as part of data lineage analysis
- Discovery of hidden sensitive data, such as the last four digits of a social security number hidden in another user ID, as part of a data masking or de-identification project
- Consolidation of multiple databases into a single database and identifying redundant columns of data for consolidation or elimination
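The sensitive-data item in the list above can be illustrated with a small sketch. It assumes two parallel columns of strings and checks whether the last four digits of each social security number appear inside the corresponding user ID. The function name `find_embedded_last4` and the sample values are hypothetical and invented for illustration; a real project would also need to rule out coincidental matches.

```python
def find_embedded_last4(ssns, user_ids):
    """Return the row indexes whose user ID contains the last four SSN digits."""
    hits = []
    for i, (ssn, user_id) in enumerate(zip(ssns, user_ids)):
        # Keep only the digits so that "123-45-6789" and "123456789" behave alike.
        digits = "".join(ch for ch in ssn if ch.isdigit())
        if len(digits) >= 4 and digits[-4:] in user_id:
            hits.append(i)
    return hits


# Hypothetical sample data, invented for this sketch.
ssns = ["123-45-6789", "987-65-4321", "555-12-3456"]
user_ids = ["jsmith6789", "mjones01", "akhan3456"]

hits = find_embedded_last4(ssns, user_ids)
match_rate = len(hits) / len(ssns)
```

A high match rate across many rows would suggest that the user ID column is derived from the sensitive column and should be covered by the masking project.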
For example, a company that would like to transmit and receive purchases and invoices with other companies might use data mapping to create data maps from a company's data to standardized ANSI ASC X12 messages for items such as purchase orders and invoices.
Standards
X12 standards are generic Electronic Data Interchange (EDI) standards designed to allow a company to exchange data with any other company, regardless of industry. The standards are maintained by the Accredited Standards Committee X12 (ASC X12), which is accredited by the American National Standards Institute (ANSI) to set standards for EDI. The X12 standards are often called ANSI ASC X12 standards.
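As a rough sketch of the purchase-order example, the following maps a company's internal order record to simplified, X12-style segments. Everything specific here is an assumption made for illustration: the internal field names (`order_no`, `order_date`, `lines`), the function name, and the element order and code values in the `BEG` and `PO1` segments. It is not a complete or validated X12 message; a real map would be built from the published X12 implementation guide.

```python
def purchase_order_to_segments(po):
    """Map an internal purchase-order dict to simplified X12-style segments."""
    # Header segment: element positions and codes are illustrative assumptions.
    header = ["BEG", "00", "SA", po["order_no"], "", po["order_date"].replace("-", "")]
    segments = ["*".join(header)]

    # One line-item segment per order line, numbered from 1.
    for n, line in enumerate(po["lines"], start=1):
        item = [
            "PO1",
            str(n),
            str(line["qty"]),
            line["unit"],
            f"{line['price']:.2f}",
            "",
            "VP",
            line["sku"],
        ]
        segments.append("*".join(item))

    # "*" separates elements and "~" terminates each segment in this sketch.
    return [segment + "~" for segment in segments]


# Hypothetical internal record, invented for this sketch.
order = {
    "order_no": "PO-1001",
    "order_date": "2024-01-15",
    "lines": [{"sku": "WIDGET-7", "qty": 10, "unit": "EA", "price": 2.5}],
}
segments = purchase_order_to_segments(order)
```

The data map is the correspondence between the internal fields and the positions in each segment; the function only applies that map.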
In the future, tools based on semantic web languages such as Resource Description Framework (RDF), the Web Ontology Language (OWL) and standardized metadata registries will make data mapping a more automatic process. This process will be accelerated if each application performs metadata publishing. Fully automated data mapping is a very difficult problem (see Semantic translation).
Hand-coded, graphical manual
Data mappings can be done in a variety of ways: using procedural code, creating XSLT transforms, or using graphical mapping tools that automatically generate executable transformation programs. These graphical tools allow a user to "draw" lines from fields in one set of data to fields in another. Some graphical data mapping tools allow users to "auto-connect" a source and a destination. This feature depends on the source and destination data element names being the same. Transformation programs are automatically created in SQL, XSLT, Java or C++. These kinds of graphical tools are found in most ETL (Extract, Transform, Load) tools as the primary means of entering data maps to support data movement.
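The auto-connect and code-generation steps described above can be sketched in a few lines. This is a minimal illustration rather than how any particular ETL tool works: `auto_connect` pairs fields whose names are identical, one extra mapping is added by hand (standing in for a line "drawn" by the user), and `generate_sql` emits a transformation statement from the finished map. The function names, table names and field names are hypothetical.

```python
def auto_connect(source_fields, dest_fields):
    """Pair fields whose names are identical, as an auto-connect feature would."""
    dest = set(dest_fields)
    return {name: name for name in source_fields if name in dest}


def generate_sql(mapping, source_table, dest_table):
    """Emit an INSERT ... SELECT statement from a {source: destination} map."""
    dest_cols = ", ".join(mapping.values())
    src_cols = ", ".join(mapping.keys())
    return f"INSERT INTO {dest_table} ({dest_cols}) SELECT {src_cols} FROM {source_table};"


# Only identically named fields are connected automatically.
mapping = auto_connect(["id", "FirstName", "email"], ["id", "PersonGivenName", "email"])

# A mapping added manually, as if the user had drawn the line in a graphical tool.
mapping["FirstName"] = "PersonGivenName"

sql = generate_sql(mapping, "crm_contacts", "warehouse_person")
```

Keeping the map as plain data and generating the program from it mirrors the separation that graphical tools rely on: the same map could be rendered as XSLT or Java instead of SQL.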
Data-driven mapping
This is a more recent approach to data mapping. It involves simultaneously evaluating actual data values in two data sources, using heuristics and statistics to automatically discover complex mappings between the two data sets. This approach is used to find transformations between two data sets and will discover substrings, concatenations, arithmetic and case statements, as well as other kinds of transformation logic. It also discovers data exceptions that do not follow the discovered transformation logic.
Semantic mapping
Semantic mapping is similar to the auto-connect feature of data mappers, with the exception that a metadata registry can be consulted to look up data element synonyms. For example, if the source system lists FirstName but the destination lists PersonGivenName, the mappings will still be made if these data elements are listed as synonyms in the metadata registry. Semantic mapping is only able to discover exact matches between columns of data and will not discover any transformation logic or exceptions between columns.
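The synonym lookup can be sketched as follows, with the metadata registry modelled as a plain list of synonym groups. This is an assumption made for illustration; a real registry would be a managed service or database. The function name `semantic_map` and all field names other than FirstName and PersonGivenName are hypothetical.

```python
def semantic_map(source_fields, dest_fields, synonym_groups):
    """Map source fields to destination fields by name or registered synonym."""
    # Give every name in a synonym group the same canonical key.
    canonical = {}
    for group in synonym_groups:
        for name in group:
            canonical[name] = group[0]

    def key(name):
        # Names absent from the registry only match themselves.
        return canonical.get(name, name)

    dest_by_key = {key(d): d for d in dest_fields}
    return {s: dest_by_key[key(s)] for s in source_fields if key(s) in dest_by_key}


# Hypothetical registry content, invented for this sketch.
registry = [
    ["FirstName", "PersonGivenName"],
    ["LastName", "PersonSurName"],
]

result = semantic_map(
    ["FirstName", "LastName", "Phone", "Notes"],
    ["PersonGivenName", "PersonSurName", "Phone"],
    registry,
)
```

Fields with no identical name and no registered synonym, such as Notes here, are left unmapped, which reflects the limitation noted above: only exact or synonym matches are found, not transformation logic.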
See also
- ISO/IEC 11179: the ISO/IEC metadata registry standard
- Metadata
- Metadata publishing
- Schema matching
- Semantic mapper
- Semantic translation
- Semantic web
- Semantics
- XSLT: XML transformation language
- Data integration
- Identity transform
- Bots: open source software for data mapping