Internationalization Tag Set
The Internationalization Tag Set (ITS) is a set of attributes and elements designed to provide internationalization and localization support in XML documents.
The ITS specification identifies concepts (called "ITS data categories") which are important for internationalization and localization. It also defines implementations of these concepts through a set of elements and attributes grouped in the ITS namespace. XML developers can use this namespace to integrate internationalization features directly into their own XML schemas and documents.
Overview
ITS v1.0 includes seven data categories:
- Translate: Defines which parts of a document are translatable and which are not.
- Localization Note: Provides alerts, hints, instructions, and other information to help localizers and translators.
- Terminology: Indicates which parts of the document are terms and, optionally, provides pointers to information about these terms.
- Directionality: Indicates what type of display directionality should be applied to parts of the document.
- Ruby: Indicates which parts of the document should be displayed as ruby text. (Ruby is a short run of text alongside a base text, typically used in East Asian documents to indicate pronunciation or to provide a brief annotation.)
- Language Information: Identifies the language of the different parts of the document.
- Elements Within Text: Indicates how elements should be treated with regard to linguistic segmentation.
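As a rough illustration of how some of these data categories can appear as local markup, the sketch below uses ITS 1.0 attributes in a small document. The document vocabulary (doc, p, code, span) and all text content are assumptions made for this example; only the its: attributes and xml:lang come from the ITS and XML specifications.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical vocabulary: doc, p, code and span are assumed element names. -->
<doc xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0" xml:lang="en">
  <!-- Translate: exclude the command name from translation. -->
  <p>Run <code its:translate="no">zebulon</code> to start the build.</p>
  <!-- Localization Note: a hint addressed to the translator. -->
  <p its:locNote="Keep this sentence short; it is shown in a tooltip.">Save your changes.</p>
  <!-- Terminology: flag a span of text as a term. -->
  <p>A <span its:term="yes">data category</span> is an abstract concept.</p>
  <!-- Directionality and language: a right-to-left Hebrew word inside English text. -->
  <p>The greeting is <span its:dir="rtl" xml:lang="he">שלום</span>.</p>
</doc>
```

Ruby and Elements Within Text are left out of this sketch to keep it short.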
The vocabulary is designed to work on two fronts: first, by providing markup usable directly in XML documents; second, by offering a way to indicate that parts of an existing markup correspond to some of the ITS data categories and should be treated as such by ITS processors.
ITS applies to new document types as well as existing ones. It also applies both to markup without any internationalization features and to document types that already support some internationalization or localization-related functions.
ITS can be specified using global rules and local rules.
- The global rules are expressed anywhere in the document (embedded global rules), or even outside the document (external global rules), using the its:rules element.
- The local rules are expressed by specialized attributes (and sometimes elements) specified inside the document instance, at the location where they apply.
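The two mechanisms above can be sketched together in one small document. This is a minimal illustration rather than an excerpt from the specification: the element names text, head, title, body, p and span, and all text content, are assumptions. The global rule marks everything under head as non-translatable, and two local its:translate attributes override the value that would otherwise apply.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical document type: text, head, title, body, p and span are assumed names. -->
<text xmlns:its="http://www.w3.org/2005/11/its">
  <head>
    <!-- Local rule: overrides the global rule below for this element only. -->
    <title its:translate="yes">The Origins of the Modern Novel</title>
    <!-- Embedded global rules: apply to every node matched by the selector. -->
    <its:rules version="1.0">
      <its:translateRule selector="/text/head" translate="no"/>
    </its:rules>
  </head>
  <body>
    <!-- Local rule: keep the phrase "faux pas" untranslated inside translatable text. -->
    <p>It would be quite a <span its:translate="no">faux pas</span> to skip the introduction.</p>
  </body>
</text>
```

Keeping the rule in an its:rules element means an existing document type can be annotated without touching every instance of head, while the local attributes handle the exceptions.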
Translate data category
In the following ITS markup example, the elements and attributes with the its prefix are part of the ITS namespace. The its:rules element lists the different rules to apply to this file. There is one its:translateRule rule, which indicates that any content inside the head element should not be translated.
The its:translate attributes used in some elements override the global rule: here, to make the content of title translatable and to make the text "faux pas" non-translatable.
Localization Note data category
In the following ITS markup example, the its:locNoteRule element specifies that any node corresponding to the XPath expression "//msg/data" has an associated note. The location of that note is expressed by the locNotePointer attribute, which holds a relative XPath expression pointing to the node where the note is, here "../notes".
Note also the use of the its:translate attribute to mark the notes elements as non-translatable.
ITS limitations
ITS does not have a solution to all XML internationalization and localization issues.
One reason is that version 1.0 does not have data categories for everything. For example, there is currently no way to indicate a source/target relation in bilingual files, where some parts of a document store the source text and other parts the corresponding translation.
The other reason is that many aspects of internationalization cannot be resolved with markup. They have to do with the design of the DTD or the schema itself. Best practices and design and authoring guidelines must be followed to make sure documents are correctly internationalized and easy to localize. For example, using attributes to store translatable text is a bad idea for many different reasons, but ITS cannot prevent an XML developer from making such a choice.
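The attribute problem can be made concrete with a short sketch. The resource format below (ui, button, label, span) is hypothetical and serves only to contrast the two designs; the reasons given in the comments are two commonly cited ones, not an exhaustive list.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical UI resource file: ui, button, label and span are assumed names. -->
<ui xmlns:its="http://www.w3.org/2005/11/its" its:version="1.0">
  <!-- Problematic: the translatable text is an attribute value. It cannot
       contain inline elements, so the product name cannot be protected with
       its:translate, and it cannot carry an xml:lang of its own. -->
  <button id="b1" label="Delete the Zebulon file"/>

  <!-- Preferable: the translatable text is element content, so local ITS
       markup can be applied inside it. -->
  <button id="b2">
    <label>Delete the <span its:translate="no">Zebulon</span> file</label>
  </button>
</ui>
```

Both forms are well-formed XML, and ITS places no restriction on the first one, so the choice rests with the schema designer.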