Round-trip format conversion
The term round-trip is commonly used in document conversion, particularly involving markup languages such as XML and SGML. A successful round-trip consists of converting a document in format A (docA) to one in format B (docB) and then back again to format A (docA′). If docA and docA′ are identical then there has been no information loss and the round-trip has been successful. More generally it means converting from any data representation and back again, including from one data structure to another.
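
As a minimal sketch of the data-structure case, the following Python snippet treats an in-memory value as format A and JSON text as format B. JSON and the helper names `round_trip` and `survives_round_trip` are illustrative choices, not something the term itself prescribes.

```python
import json


def round_trip(value):
    """Convert a Python value to JSON text (format B) and back (format A')."""
    return json.loads(json.dumps(value))


def survives_round_trip(value):
    """True when the value that comes back equals the one that went in."""
    return round_trip(value) == value


flat = {"title": "Report", "pages": [1, 2, 3]}
keyed_by_int = {1: "one", 2: "two"}  # JSON object keys are always strings
with_tuple = {"point": (3, 4)}       # JSON has arrays but no tuple type

print(survives_round_trip(flat))          # True
print(survives_round_trip(keyed_by_int))  # False: keys come back as "1", "2"
print(survives_round_trip(with_tuple))    # False: the tuple comes back as a list
```

The last two values fail because format B cannot represent a distinction that format A makes, which is the usual cause of an unsuccessful round-trip.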

Information loss

When a document in one format is converted to another there is likely to be information loss. For example, suppose an HTML document is saved as plain text (*.txt). Then all the markup (structure, formatting, superscripts, …) will be lost. Compound documents will frequently lose information on images and other embedded objects. If the text file is converted back to the original format, information will necessarily be missing.
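
The HTML example can be sketched with Python's standard `html.parser` module. The class `TextOnly` and the function `html_to_text` are hypothetical names for a deliberately naive converter that keeps character data and discards every tag.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and discard every tag and attribute."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(html):
    """Naive HTML-to-plain-text conversion used only for illustration."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


doc_a = "<p>E = mc<sup>2</sup> is <b>famous</b>.</p>"
other = "<p>E = mc2 is famous.</p>"

print(html_to_text(doc_a))                         # E = mc2 is famous.
print(html_to_text(doc_a) == html_to_text(other))  # True
```

Two different HTML documents produce the same text, so no reverse conversion can tell from the text alone which one it started from. The superscript and the bold markup are gone for good.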

A similar effect happens with image formats. Some formats, such as JPEG, achieve compression through a small amount of information loss. If a file in a lossless format, such as BMP or PNG, is converted to JPEG and back again, then the result will be different from the original (although it may be visually very similar).

Just because the initial and final documents are not bitwise identical does not mean there is information loss. Some formats have undefined fields, or fields where the contents have no impact on the result.
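
A small illustration of this point, using JSON text as a stand-in format (an assumption made for the example, since the article does not name one): whitespace and the order of object keys do not affect the parsed result, so two byte-wise different documents can carry the same information. The helper `same_information` is a hypothetical name.

```python
import json


def same_information(text_a, text_b):
    """Compare two JSON documents by parsed content rather than by bytes."""
    return json.loads(text_a) == json.loads(text_b)


doc_a = '{"title": "Report", "pages": 3}'
doc_a_prime = '{\n  "pages": 3,\n  "title": "Report"\n}'

print(doc_a == doc_a_prime)                 # False: the text differs
print(same_information(doc_a, doc_a_prime)) # True: the content is the same
```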

Markup languages

Markup languages such as XML can, in principle, hold any information, and so the process docA → docX → docA′ could be designed to avoid information loss. It is now common to convert legacy formats to XML formats because they have greater interoperability and a wider set of available tools. Thus it is possible to convert Word documents to an XML format and reimport them.

The XML document should contain identical information to the legacy format. An important condition is that the round-trip (legacy → XML → legacy′) should result in effectively identical documents. Because some document structures allow some flexibility in content order, whitespace, case-sensitivity, etc., it is useful to have a means of canonicalizing the legacy format. The full round-trip may then be:
legacy → canonicalLegacy → XML → legacy′ → canonicalLegacy′

If canonicalLegacy = canonicalLegacy′ then the round-trip has been successful.
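
The full pipeline can be sketched in Python under some loud assumptions: the "legacy" format here is a made-up `key = value` text format in which key case, surrounding whitespace, blank lines and entry order are insignificant, and the XML vocabulary (`settings`, `entry`, `name`) is invented for the example. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def canonicalize(legacy):
    """Canonical form of the hypothetical legacy format: lower-case keys,
    trimmed whitespace, blank lines dropped, entries sorted by key."""
    entries = []
    for line in legacy.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        entries.append((key.strip().lower(), value.strip()))
    return "\n".join(f"{k}={v}" for k, v in sorted(entries))


def legacy_to_xml(canonical):
    """Convert canonical legacy text to an XML string (invented vocabulary)."""
    root = ET.Element("settings")
    for line in canonical.splitlines():
        key, _, value = line.partition("=")
        entry = ET.SubElement(root, "entry", name=key)
        entry.text = value
    return ET.tostring(root, encoding="unicode")


def xml_to_legacy(xml_text):
    """Convert the XML string back to legacy text."""
    root = ET.fromstring(xml_text)
    return "\n".join(f"{e.get('name')}={e.text or ''}" for e in root)


def round_trip_ok(legacy):
    """legacy -> canonicalLegacy -> XML -> legacy' -> canonicalLegacy'."""
    canonical = canonicalize(legacy)
    legacy_prime = xml_to_legacy(legacy_to_xml(canonical))
    return canonicalize(legacy_prime) == canonical


legacy = "Title = Report\n\nAUTHOR=  A. Writer\n"
print(canonicalize(legacy))   # author=A. Writer, then title=Report
print(round_trip_ok(legacy))  # True
```

Comparing canonical forms rather than raw text keeps insignificant differences, such as the blank line or the upper-case key above, from being reported as information loss.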

Limitation

An application can claim to round-trip while doing so dishonestly. For example, it may save the original data from docA as a field in docX, so that the reverse transformation to docA′ simply extracts that field. While this may be needed in some cases, the idea of a round-trip conversion is to go through another format representation or data structure and back again.
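
A toy sketch of such a converter, with every name invented for illustration: docX is a JSON object whose `body` field holds a lossy conversion (upper-casing stands in for any lossy step) and whose `original` field stashes the source unchanged.

```python
import json


def to_doc_x(doc_a):
    """Lossy 'conversion' that also stashes the untouched source."""
    return json.dumps({"body": doc_a.upper(), "original": doc_a})


def to_doc_a(doc_x):
    """Reverse step that ignores the converted body and returns the stash."""
    return json.loads(doc_x)["original"]


doc_a = "Mixed Case Text"
doc_x = to_doc_x(doc_a)

print(to_doc_a(doc_x) == doc_a)             # True: the round-trip "succeeds"
print(json.loads(doc_x)["body"] == doc_a)   # False: the body alone lost the case
```

The equality check passes only because the original travelled along unchanged. The converted representation on its own could not reproduce docA.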

Usage

The term appears to be common but is not recorded in dictionaries. A typical usage occurs in http://mailman.ic.ac.uk/pipermail/xml-dev/1999-March/010781.html, but the term is likely to have been used before this.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 