XML data binding
XML data binding refers to a means of representing information in an XML document as an object in computer memory. This allows applications to access the data in the XML from the object rather than using the DOM or SAX to retrieve the data from a direct representation of the XML itself.
An XML data binder accomplishes this by automatically creating a mapping between elements of the XML schema of the document to be bound and members of a class to be represented in memory.
When this process is applied to convert an XML document to an object, it is called unmarshalling. The reverse process, serializing an object as XML, is called marshalling.
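The two directions can be illustrated with a minimal sketch in Python. The Person class, the element names, and the unmarshal and marshal helpers are hypothetical and written by hand here; a real data binder would generate the equivalent mapping from the schema.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Person:
    # Hypothetical bound class: each field corresponds to one child element.
    name: str
    age: int


def unmarshal(xml_text: str) -> Person:
    """Convert a <person> document into a Person object."""
    root = ET.fromstring(xml_text)
    return Person(name=root.findtext("name"), age=int(root.findtext("age")))


def marshal(person: Person) -> str:
    """Serialize a Person object back to a <person> document."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person.name
    ET.SubElement(root, "age").text = str(person.age)
    return ET.tostring(root, encoding="unicode")


doc = "<person><name>Ada</name><age>36</age></person>"
person = unmarshal(doc)          # XML -> object
round_tripped = marshal(person)  # object -> XML
```

After unmarshalling, the application reads person.name and person.age directly instead of walking the element tree.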
Since XML is inherently sequential and objects are (usually) not, XML data binding mappings often have difficulty preserving all the information in an XML document. Specifically, information like comments, XML entity references, and sibling order may fail to be preserved in the object representation created by the binding application. This is not always the case; sufficiently complex data binders are capable of preserving all of the information in an XML document.
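A small sketch of this kind of loss, assuming a deliberately naive, hypothetical binding that stores one list per element name. The document and element names are invented for illustration; Python's standard parser is used, which discards comments by default.

```python
import xml.etree.ElementTree as ET

doc = (
    "<order>"
    "<!-- rush delivery -->"
    "<item>pen</item><note>fragile</note><item>ink</item>"
    "</order>"
)

root = ET.fromstring(doc)  # the default parser drops the comment

# Hypothetical binding: one list per element name, so the interleaving
# of <item> and <note> siblings is not recorded anywhere.
bound = {
    "items": [e.text for e in root.findall("item")],
    "notes": [e.text for e in root.findall("note")],
}

# Marshalling from the bound object writes all items, then all notes.
out = ET.Element("order")
for text in bound["items"]:
    ET.SubElement(out, "item").text = text
for text in bound["notes"]:
    ET.SubElement(out, "note").text = text
result = ET.tostring(out, encoding="unicode")
```

The round-tripped document contains the same data values, but the comment is gone and the note no longer sits between the two items.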
Similarly, since objects in computer memory are not inherently sequential, and may include links to other objects (including self-referential links), XML data binding mappings often have difficulty preserving all the information about an object when it is marshalled to XML.
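One common workaround, sketched here under assumptions, is to write each object once and express links as identifier references instead of nesting (XML Schema offers the ID and IDREF types for this purpose). The Node class, the peer attribute, and the marshal_graph helper are hypothetical, and this is not claimed to be how any particular binder handles cycles.

```python
import xml.etree.ElementTree as ET


class Node:
    def __init__(self, name):
        self.name = name
        self.peer = None  # link to another Node, possibly forming a cycle


def marshal_graph(nodes):
    """Write each node once and express links as id references.

    Every linked node must itself appear in `nodes`.
    """
    ids = {id(n): f"n{i}" for i, n in enumerate(nodes)}
    root = ET.Element("graph")
    for n in nodes:
        el = ET.SubElement(root, "node", id=ids[id(n)], name=n.name)
        if n.peer is not None:
            el.set("peer", ids[id(n.peer)])
    return ET.tostring(root, encoding="unicode")


a, b = Node("a"), Node("b")
a.peer, b.peer = b, a  # a cycle that cannot be nested as a tree
xml_text = marshal_graph([a, b])
```

Nesting b inside a and a inside b would never terminate; the flat form with references stays finite, at the cost of an extra resolution step when unmarshalling.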
An alternative approach to automatic data binding relies instead on hand-crafted XPath expressions that extract the data from XML. This approach has a number of benefits. First, the data binding code needs only approximate knowledge (e.g., topology and tag names) of the XML tree structure, which developers can determine by looking at the XML data; XML schemas are no longer mandatory. Furthermore, XPath allows the application to bind only the relevant data items and filter out everything else, avoiding the unnecessary processing that would be required to unmarshal the entire XML document. The drawback of this approach is the lack of automation in implementing the object model and XPath expressions: the application developers have to create these artifacts manually.
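The XPath-based approach might look like the following sketch. It uses the limited XPath subset supported by Python's xml.etree.ElementTree; the log document, the Reading class, and the path expression are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Reading:
    # Hand-written object model: only the fields the application needs.
    sensor: str
    value: float


doc = """
<log>
  <meta><site>lab</site></meta>
  <reading sensor="t1"><value>20.5</value><unit>C</unit></reading>
  <reading sensor="t2"><value>19.0</value><unit>C</unit></reading>
  <reading sensor="t1"><value>21.0</value><unit>C</unit></reading>
</log>
"""

root = ET.fromstring(doc)

# Hand-written path: bind only the readings from sensor t1 and ignore
# <meta>, <unit>, and every other sensor.
readings = [
    Reading(sensor=el.get("sensor"), value=float(el.findtext("value")))
    for el in root.findall(".//reading[@sensor='t1']")
]
```

No schema is involved: the path and the class were written after inspecting the data, which is exactly the manual work the approach trades for flexibility.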
Data binding in general
One of XML data binding's strengths is the ability to serialize and deserialize objects across programs, languages, and platforms. For example, a time series of structured objects can be dumped from a datalogger written in C on an embedded processor, transferred across the network for processing in Perl, and finally visualized in Mathematica. The structure and the data remain consistent and coherent throughout the journey, and no custom formats or parsing are required. This is not unique to XML. YAML, for example, is emerging as a powerful data binding alternative to XML. JSON (which can be regarded as a subset of YAML) is often suitable for lightweight or restricted applications.
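As a rough illustration of the same idea in two formats, the sketch below round-trips a short time series through JSON and through XML using only the Python standard library. The Sample class and the attribute-based XML layout are assumptions chosen for brevity, not a prescribed mapping.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict


@dataclass
class Sample:
    t: int
    value: float


samples = [Sample(0, 1.5), Sample(1, 2.5)]

# JSON: each dataclass instance maps directly onto a JSON object.
json_text = json.dumps([asdict(s) for s in samples])
from_json = [Sample(**d) for d in json.loads(json_text)]

# XML: the same data, with fields stored as attributes (one possible mapping).
root = ET.Element("series")
for s in samples:
    ET.SubElement(root, "sample", t=str(s.t), value=str(s.value))
xml_text = ET.tostring(root, encoding="unicode")
from_xml = [
    Sample(int(e.get("t")), float(e.get("value")))
    for e in ET.fromstring(xml_text).findall("sample")
]
```

Both round trips recover the original list. The JSON path keeps numeric types on its own, while the XML path needs explicit conversions because attribute values are always strings.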
External links
- XML Data Binding Resources, by Ronald Bourret
- XML Schema Patterns for Databinding Working Group
See also
- Bound control
- Data structure
- JAXB
- JiBX
- JSON
- Serialization
- YAML