Metadata registry
A metadata registry is a central location in an organization where metadata
definitions are stored and maintained in a controlled manner.
Use of Metadata Registries
Metadata registries are used whenever data must be used consistently within an organization or group of organizations. Examples of these situations include:
- Organizations that transmit data using structures such as XML, Web Services or EDI
- Organizations that need consistent definitions of data across time, between databases, between organizations or between processes, for example when an organization builds a data warehouse
- Organizations that are attempting to break down "silos" of information captured within applications or proprietary file formats
Central to the charter of any metadata management programme is the process of creating trusting relationships with stakeholders and ensuring that definitions and structures have been reviewed and approved by appropriate parties.
Common characteristics of a metadata registry
A metadata registry typically has the following characteristics:
- Protected environment where only authorized individuals may make changes
- Stores data elements that include both semantics and representations
- Semantic areas of a metadata registry contain the meaning of a data element with precise definitions
- Representational areas of a metadata registry define how the data is represented in a specific format, such as in a database or a structured file format (e.g., XML)
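The characteristics listed above can be illustrated with a minimal sketch. It is not taken from any particular registry product or standard: the class names, the user name `steward_a` and the example data element are all hypothetical, chosen only to show a semantic area, a representational area and a protected environment side by side.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Representation:
    """Representational area: how a value is encoded in one specific format."""
    format_name: str   # e.g. "XML" or "SQL" (hypothetical labels)
    datatype: str      # datatype name used in that format


@dataclass
class DataElement:
    """A data element with a semantic area and a representational area."""
    name: str
    definition: str                                      # semantic area: the precise meaning
    representations: list = field(default_factory=list)  # representational area


class MetadataRegistry:
    """Protected store: only authorized users may make changes."""

    def __init__(self, authorized_users):
        self._authorized = set(authorized_users)
        self._elements = {}

    def register(self, user, element):
        if user not in self._authorized:
            raise PermissionError(f"{user} may not change the registry")
        self._elements[element.name] = element

    def lookup(self, name):
        # Reading is open to everyone in this sketch; only changes are restricted.
        return self._elements[name]


registry = MetadataRegistry(authorized_users={"steward_a"})
birth_date = DataElement(
    name="PersonBirthDate",
    definition="The date on which a person was born.",
    representations=[Representation("XML", "xs:date"), Representation("SQL", "DATE")],
)
registry.register("steward_a", birth_date)
```

In this sketch an attempt to call `register` with a user outside the authorized set raises `PermissionError`, while `lookup` remains available to any caller.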
Clear separation of semantics and system-specific constraints
Because metadata registries are used to store both semantics (the meaning of a data element) and system-specific constraints (for example the maximum length of a string), it is important to identify which systems impose these constraints and to document them. For example, the maximum length of a string should not change the meaning of a data element.
The International Organization for Standardization (ISO) has published standards for a metadata registry, ISO/IEC 11179, and also ISO 15000-3 and ISO 15000-4 for the ebXML registry and repository (ebXML RegRep).
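The separation between semantics and system-specific constraints described in this section might be modelled as in the following sketch. It is an illustration under assumptions, not a structure defined by either standard: the system names `billing_db` and `legacy_export`, the length limits and the class names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SystemConstraint:
    system: str       # which system imposes the constraint (hypothetical names)
    max_length: int   # e.g. the maximum length of a string in that system


@dataclass
class DataElement:
    name: str
    definition: str                                  # semantics: shared by all systems
    constraints: dict = field(default_factory=dict)  # system name -> SystemConstraint

    def add_constraint(self, constraint):
        # Recording a constraint never touches the definition.
        self.constraints[constraint.system] = constraint

    def fits(self, system, value):
        """Check a value against the limit imposed by one named system."""
        return len(value) <= self.constraints[system].max_length


family_name = DataElement("PersonFamilyName", "The surname of a person.")
family_name.add_constraint(SystemConstraint("billing_db", max_length=30))
family_name.add_constraint(SystemConstraint("legacy_export", max_length=12))
```

Each constraint is documented against the system that imposes it, so the same value can fit one system and not another while the definition of the data element stays unchanged.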
ISO standards
There are two ISO standards which are commonly referred to as metadata standards: ISO 11179 and ISO 15000-3. Some believe that they are interchangeable or at least in some way similar, e.g. "Of interest is that the ISO 11179 model was one of the inputs to the ebXML RIM (registry information model) and so has much functional equivalence to the "registry" region of the ISO 11179 conceptual model." http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.113.6331&rep=rep1&type=pdf
This is, however, incorrect. Although the specification ebRIM v2.0 (5 December 2001) says at the beginning, in its Design Objectives, "Leverage as much as possible the work done in the OASIS [OAS] and the ISO 11179 [ISO] Registry models" http://xml.coverpages.org/OASIS-ebXML-RIM-v20.pdf
by the time of ebRIM v3.0 (2 May 2005) all reference to ISO/IEC 11179 had been reduced to a mention under informative references on page 76 of 78. http://docs.oasis-open.org/regrep/v3.0/specs/regrep-rim-3.0-os.pdf
Some team members recognised that the ebXML RIM data model had no place to store "fine grained artifacts" http://www.stylusstudio.com/xmldev/200503/post70270.html i.e. the data elements which are at the heart of ISO/IEC 11179, but no explicit and definitive statement from the team can be found until 2009. http://sourceforge.net/mailarchive/forum.php?thread_name=4ACBC943.8090609%40wellfleetsoftware.com&forum_name=ebxmlrr-tech
ISO/IEC 11179
ISO/IEC 11179 says that it is concerned with "traditional" metadata: "We limit the scope of the term as it is used here in ISO/IEC 11179 to descriptions of data - the more traditional use of the term." Originally the standard named itself a "data element" registry. It describes data elements: "data elements are the fundamental units of data" and "data elements themselves contain various kinds of data that include characters, images, sound, etc."
It also describes a registry with an analogy: "This is analogous to the registries maintained by governments to keep track of motor vehicles. A description of each motor vehicle is entered in the registry, but not the vehicle itself."
ebXML
The ebXML RIM says about its Repository and Registry that it is
- "... capable of storing any type of electronic content such as XML documents, text documents, images, sound and video … RepositorytItems (sic) are stored in a content repository".
It also says that it is
- "... capable of storing standardized metadata that MAY be used to further describe RepositoryItems" which metadata "… are stored in the registry".
It also describes itself with "...this familiar metaphor. An ebXML Registry is like your local library. The repository is like the bookshelves in the library. The repository items in the repository are like book (sic) on the bookshelves." It goes on to say "The registry is like the card catalog … A RegistryObject is like a card in the card catalog."
What should be immediately apparent is that something which holds catalogue cards is not "like" a catalogue, it IS a catalogue.
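The library metaphor quoted above can be made concrete with a small sketch. It is only an illustration of the metaphor, not the ebXML RegRep API: `ContentRepository`, `CardCatalog` and `CatalogCard` are hypothetical names, with `CatalogCard` standing in for a RegistryObject and the stored bytes standing in for a repository item.

```python
from dataclasses import dataclass


@dataclass
class CatalogCard:
    """Stands in for a RegistryObject: a description of one stored item."""
    item_id: str
    description: str


class ContentRepository:
    """The 'bookshelves': holds the content itself."""

    def __init__(self):
        self._items = {}

    def store(self, item_id, content: bytes):
        self._items[item_id] = content

    def fetch(self, item_id) -> bytes:
        return self._items[item_id]


class CardCatalog:
    """The 'card catalog': holds only descriptions that point at stored items."""

    def __init__(self):
        self._cards = {}

    def add(self, card):
        self._cards[card.item_id] = card

    def find(self, text):
        # Case-insensitive substring match on the card descriptions.
        return [c for c in self._cards.values() if text.lower() in c.description.lower()]


shelves = ContentRepository()
catalog = CardCatalog()
shelves.store("doc-1", b"<order>...</order>")
catalog.add(CatalogCard("doc-1", "Sample purchase order document"))
```

A search such as `catalog.find("purchase")` returns cards, and the content itself is then fetched from the repository by the card's `item_id`. Nothing in this structure holds a data element definition of the kind ISO/IEC 11179 describes.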
Unfortunately for a number of organisations that have implemented ebXML RIM to satisfy a requirement for an ISO/IEC 11179 registry, ebXML RIM
- is neither a registry
- nor does it store metadata.
It is
- a "content repository"
- and a "metacontent catalogue".
Metadata registry roles
A metadata registry is frequently set up and administered by an organization's data architect or data modeling team.
Data elements are frequently assigned to data stewards or data stewardship teams that are responsible for the maintenance of individual data elements.
Metadata element workflow
Metadata registries frequently have a formal data element submission, approval and publishing process. Each data element should be reviewed and accepted by a data stewardship team before it is published. After publication, change control processes should be used.
Metadata navigation, search and publishing
Metadata registries are frequently large and complex structures and require navigation, visualization and searching tools. Hierarchical viewing tools are frequently an essential part of a metadata registry system. Metadata publishing consists of making data element definitions and structures available to both people and other systems.
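The submission, approval and publishing steps described under "Metadata element workflow", followed by change control after publication, can be sketched as a small state machine. The state names, the allowed transitions and the rule that a revised element returns to review are assumptions made for illustration; an actual registry may define its workflow differently.

```python
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    APPROVED = "approved"
    PUBLISHED = "published"


# Allowed transitions; the exact set of states is an assumption of this sketch.
TRANSITIONS = {
    Status.SUBMITTED: {Status.APPROVED},
    Status.APPROVED: {Status.PUBLISHED},
    Status.PUBLISHED: set(),
}


class ElementRecord:
    def __init__(self, name, definition):
        self.name = name
        self.definition = definition
        self.status = Status.SUBMITTED
        self.history = []   # change-control log of (field, old value, new value)

    def advance(self, new_status):
        """Move to the next workflow state, rejecting skipped or repeated steps."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(f"cannot go from {self.status.value} to {new_status.value}")
        self.history.append(("status", self.status.value, new_status.value))
        self.status = new_status

    def revise(self, new_definition):
        # Change control (assumed rule): log the change and send the element back for review.
        self.history.append(("definition", self.definition, new_definition))
        self.definition = new_definition
        self.status = Status.SUBMITTED


record = ElementRecord("PersonBirthDate", "Date a person was born.")
record.advance(Status.APPROVED)    # accepted by the stewardship team
record.advance(Status.PUBLISHED)   # made available to people and systems
```

Keeping the history on the record means every published definition can be traced back through the changes that led to it.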
Examples of public metadata registries
- Agency for Healthcare Research and Quality - United States Health Information Knowledgebase (USHIK) http://ushik.ahrq.gov/
- Apelon Medical Registry http://sage.wherever.org/deployment/deployment.html
- Australian Institute of Health and Welfare http://meteor.aihw.gov.au/content/index.phtml/itemId/181162
- Dublin Core Metadata Registry http://dublincore.org/dcregistry/
- Knowledge Network for Biocomplexity http://knb.ecoinformatics.org
- Cancer Data Standards Repository http://ncicb.nci.nih.gov/NCICB/infrastructure/cacore_overview/cadsr
- Global Justice XML Data Model (GJXDM) http://www.it.ojp.gov/jxdm/
- Minnesota Department of Education Metadata Registry (K-12 Data) http://education.state.mn.us/mde-dd
- Minnesota Department of Revenue Property Taxation (Real Estate Transactions) http://proptax.mdor.state.mn.us/mdr
- National Information Exchange Model http://www.niem.gov
- National Science Digital Library (NSDL) Metadata Registry http://metadataregistry.org
- NIST ebXML Registry for HL7 / HIMSS / IHE http://hcxw2k1.nist.gov:9080/
- US Department of Defense Metadata Registry (requires sponsored registration) http://metadata.dod.mil
- US Environmental Protection Agency - Environmental Data Registry http://www.epa.gov/edr
Metadata registry vendors / solutions
In alphabetical order:
- a.k.a. software by Synercon
- Data Advantage Group MetaCenter
- Data Foundations Metadata Registry
- InfoLibrarian Metadata Integration Framework
- Jumper 2.0 open-source Enterprise 2.0 metadata registry
- Masai Technologies M:GRID
- Octagon Research Solutions ViewPoint MDR
- SAS Metadata Repository
- The Society of Motion Picture and Television Engineers Metadata Dictionary; Registry of Metadata Element Descriptions
- freebXML Registry, A royalty-free open source project implementing OASIS ebXML RegRep standard
- Wellfleet Software's WellGEO RegREP product provides an integrated Registry and Repository specialized for Geographical Information (GI) management
- Data Consulting Group http://www.dcgroupinc.com/
See also
In alphabetical order:
- Controlled vocabulary
- Data dictionary
- Data element
- Domain Specific Language (DSL)
- Domain-Specific Modeling (DSM)
- ebXML RegRep (ebXML Registry and Repository)
- Global Justice XML Data Model (GJXDM or Global JXDM)
- ISO/IEC 11179
- ISO 15000
- Knowledge tagging
- Meta-Object Facility (MOF)
- Metadata
- Metadata Online Registry (METeOR)
- Metadata publishing
- Metamodeling
- Model Transformation Language (MTL)
- Model-based testing (MBT)
- Model-driven engineering
- National Information Exchange Model (NIEM)
- Object Constraint Language (OCL)
- Ontology (computer science)
- Queries/Views/Transformation (QVT)
- Simple Metadata Registry
- VIsual Automated model TRAnsformation (VIATRA)
- XML Metadata Interchange (XMI)