Metadata publishing
Metadata publishing is the process of making metadata data elements available to external users, both people and machines, using a formal review process and a commitment to change control processes.

Metadata publishing is the foundation upon which advanced distributed computing functions are being built. As with the foundations of a building, care must be taken in metadata publishing systems to ensure the structural integrity of the systems built on top of them.

Definition of metadata publishing

Published metadata has the following characteristics:
  1. Metadata structures are available to the general public on a public web site or as a download
  2. There is a documented review and approval process for adding data elements to the system or updating them
  3. New releases are made available without disturbing prior versions
  4. The publishing organization makes a commitment to a change control process
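
The third characteristic, releasing new versions without disturbing prior ones, can be illustrated with a minimal sketch. The class names, the version strings and the PersonBirthDate element are hypothetical and chosen only for illustration; a real metadata registry would store far more than a name and a definition.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataElement:
    """Hypothetical, minimal data element: a name plus a definition."""
    name: str
    definition: str


@dataclass
class PublishedRegistry:
    """Hypothetical registry in which each release is an immutable snapshot."""
    releases: dict[str, tuple[DataElement, ...]] = field(default_factory=dict)

    def publish(self, version: str, elements: list[DataElement]) -> None:
        # A published release is never overwritten; changes go into a new release.
        if version in self.releases:
            raise ValueError(f"release {version} is already published")
        self.releases[version] = tuple(elements)

    def lookup(self, version: str, name: str) -> DataElement:
        # Consumers pin a version, so later releases do not change what they see.
        for element in self.releases[version]:
            if element.name == name:
                return element
        raise KeyError(name)


registry = PublishedRegistry()
registry.publish("1.0", [DataElement("PersonBirthDate", "Date a person was born.")])
registry.publish(
    "1.1",
    [DataElement("PersonBirthDate", "Calendar date on which a person was born.")],
)
```

A consumer that pinned release 1.0 still receives the original definition after release 1.1 is published, and an attempt to publish 1.0 again is rejected.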

Benefits of metadata publishing

The benefits of metadata publishing are usually classified into two groups according to who receives them. External parties are usually consumers of information who are not part of the publishing organization. Internal parties are usually the various business units or departments within the organization.

Benefits to external parties

  1. Allows external systems (both people and agents) to have a clear understanding of the semantics of data elements in a system
  2. Allows third parties to build semantic maps between data models and to import and export data between systems
  3. Promotes service-oriented architectures and allows horizontal sharing of information between traditional information silos
  4. Allows systems to participate in accurately indexed and federated search processes

Benefits to internal parties

  1. Allows parties from diverse business units to agree on shared data definitions and to separate out department-specific or function-specific definitions
  2. Makes extract, transform, load (ETL) operations more precise for data warehousing
  3. Allows user interface designers to access a common pool of screen and report header labels
  4. Promotes model-driven architecture

Objections to metadata publishing

  • Organizations that publish their metadata could make it easier for unauthorized people to find sensitive data if they breach the organization's firewall
  • Vendors that publish their metadata risk enabling customers to build tools that export their data from the vendor's systems, thereby making it easier to migrate off the vendor's system

Core processes in metadata publishing

The following are some of the core processes in metadata publishing:
  1. Gathering of metadata requirements
  2. Selection of metadata registry and metadata publishing tools
  3. Training of project participants in metadata concepts
  4. Stakeholder group formation
  5. Metadata harvesting
  6. Glossary consolidation
  7. Initial upper ontology construction (abstract data elements)
  8. Draft data element loading
  9. Data element review process
  10. Publishing approved metadata elements in a variety of output formats (see below)
  11. Creation and maintenance of versions and deprecation of unused or redundant data elements
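
The review, publishing and retirement steps at the end of this list can be pictured as a small state machine. The sketch below is an assumption made for illustration: the status names, the allowed transitions and the element names are hypothetical and are not taken from any particular registry product or standard.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    APPROVED = "approved"
    DEPRECATED = "deprecated"


# Allowed transitions (hypothetical): a reviewed element is either approved
# or sent back to draft, and only approved elements can be retired.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.DRAFT, Status.APPROVED},
    Status.APPROVED: {Status.DEPRECATED},
    Status.DEPRECATED: set(),
}


def advance(current: Status, target: Status) -> Status:
    """Return the new status, refusing transitions that skip the review step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def publishable(statuses: dict[str, Status]) -> list[str]:
    """Only approved elements are included in a published release."""
    return sorted(name for name, s in statuses.items() if s is Status.APPROVED)


elements = {"PersonBirthDate": Status.DRAFT, "PersonGivenName": Status.DRAFT}
elements["PersonBirthDate"] = advance(elements["PersonBirthDate"], Status.IN_REVIEW)
elements["PersonBirthDate"] = advance(elements["PersonBirthDate"], Status.APPROVED)
print(publishable(elements))  # prints ['PersonBirthDate']
```

Because a draft cannot move directly to approved, every published element in this sketch has passed through the review step.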

File format metadata publishing

Organizations that create applications that store data in file systems can also publish metadata definitions. One common way to do this is to store application data in a compressed XML file format. The XML files can be uncompressed and validated against an external XML Schema. The open source FreeMind tool is an example of this approach.
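
The uncompress-then-validate step can be sketched as follows. The sketch makes several assumptions: gzip is used as the compression format, the outline and item element names are hypothetical, and a small table of allowed child elements stands in for the published schema. The Python standard library does not include an XML Schema validator, so full validation against a real XML Schema would require a schema-aware tool.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical application document, as it would look before compression.
DOCUMENT = b'<outline version="1.0"><item text="Root"><item text="Child"/></item></outline>'

# Stand-in for a published schema: element name -> allowed child element names.
PUBLISHED_STRUCTURE = {
    "outline": {"item"},
    "item": {"item"},
}


def violations(root: ET.Element) -> list[str]:
    """List every place where the document departs from the published structure."""
    problems = []
    for parent in root.iter():  # document order, starting with the root itself
        allowed = PUBLISHED_STRUCTURE.get(parent.tag)
        if allowed is None:
            problems.append(f"unknown element <{parent.tag}>")
            continue
        for child in parent:
            if child.tag not in allowed:
                problems.append(f"<{child.tag}> not allowed inside <{parent.tag}>")
    return problems


compressed = gzip.compress(DOCUMENT)               # what the application stores
root = ET.fromstring(gzip.decompress(compressed))  # what a third party reads back
print(violations(root))  # prints []
```

Because the structure is published, a third party can run the same check without access to the application's source code.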

Metadata publishing formats

  1. HTML - used for browsing a web site and for indexing by text-based search engines
  2. Web Ontology Language (OWL) - used by metadata search engines such as Swoogle
  3. XML Metadata Interchange (XMI) - OMG standard for exchanging metadata
  4. Common Warehouse Metamodel (CWM) - OMG standard for data warehouse metadata
  5. Topic maps - an ISO standard for the representation and interchange of knowledge, with an emphasis on the findability of information
  6. KM3 (Kernel Meta Meta Model) - as used in the Metamodel Zoos. The AtlanticZoo is an open source library of more than 100 metamodels under the EPL license. KM3 is a simple domain-specific language for specifying metamodels. A number of transformations are available to translate from KM3 to other notations such as XMI.
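
As a rough illustration of publishing one approved data element in more than one of these formats, the sketch below renders a hypothetical element as an HTML definition-list entry and as a small RDF/XML document that uses OWL vocabulary. The element, its example.org URI and the choice to model it as an owl:DatatypeProperty are assumptions made for this example; a real registry would apply its own mapping rules.

```python
import html
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical approved data element.
element = {
    "name": "PersonBirthDate",
    "definition": "Date on which a person was born.",
    "uri": "http://example.org/elements#PersonBirthDate",
}


def to_html(e: dict) -> str:
    """Render the element for people and text-based search engines."""
    name = html.escape(e["name"])
    return f'<dt id="{name}">{name}</dt><dd>{html.escape(e["definition"])}</dd>'


def to_owl(e: dict) -> str:
    """Render the element as RDF/XML (modeling it as a datatype property is an assumption)."""
    for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
        ET.register_namespace(prefix, uri)
    root = ET.Element(f"{{{RDF}}}RDF")
    prop = ET.SubElement(
        root, f"{{{OWL}}}DatatypeProperty", {f"{{{RDF}}}about": e["uri"]}
    )
    ET.SubElement(prop, f"{{{RDFS}}}label").text = e["name"]
    ET.SubElement(prop, f"{{{RDFS}}}comment").text = e["definition"]
    return ET.tostring(root, encoding="unicode")


print(to_html(element))
print(to_owl(element))
```

Both outputs are generated from the same approved record, so the human-readable and machine-readable publications cannot drift apart.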

See also

  • Data governance
  • Metadata
  • Semantic Web
  • Semantic technology
  • Metadata registry
  • ISO/IEC 11179
  • Topic Maps

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 