Okapi Framework
The Okapi Framework is a cross-platform, open-source set of components and applications that offers extensive support for localizing and translating documentation and software.
Architecture
The Okapi Framework is organized around the following parts:
- Interface Specifications — The framework's components and applications communicate through several common API sets: the interfaces. A few of them are defined as high-level specifications. Implementing these interfaces lets you plug new components seamlessly into the overall framework. For example, all filters share the same API for parsing input files, so you can write utilities that use any of the available filters (see the sketch after this list).
- Format Specifications — Storing and exchanging data is an important part of the localization process. Using open standards for as many formats as possible increases interoperability. Whenever possible, the Okapi Framework makes use of existing standards such as XLIFF, SRX, TMX, etc.
- Components — The Okapi Framework also includes a growing set of components that implement the different interface specifications. Some are basic, low-level parts that can be re-used when programming higher-level components, while others are plug-ins that can be used directly in scripts or applications.
- Applications — Lastly, the framework provides end-user applications that can be used out of the box. These tools make use of the Okapi components and provide ready-made platforms for plugging in your own components.
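The shared filter interface mentioned above can be illustrated with a short sketch. The class and method names below (IFilter, RawDocument, Event, HtmlFilter, PropertiesFilter) follow the pattern used by the framework's Java libraries, but exact package names and signatures vary between releases, so treat this as an assumption-based illustration rather than a verbatim API reference.

import net.sf.okapi.common.Event;
import net.sf.okapi.common.EventType;
import net.sf.okapi.common.LocaleId;
import net.sf.okapi.common.filters.IFilter;
import net.sf.okapi.common.resource.ITextUnit;
import net.sf.okapi.common.resource.RawDocument;
import net.sf.okapi.filters.html.HtmlFilter;
import net.sf.okapi.filters.properties.PropertiesFilter;

public class FilterDemo {

    // Because every filter implements the same IFilter interface,
    // this one method works for HTML, properties, XLIFF, and so on.
    static void listTextUnits(IFilter filter, RawDocument doc) {
        try {
            filter.open(doc);
            while (filter.hasNext()) {
                Event event = filter.next();
                if (event.getEventType() == EventType.TEXT_UNIT) {
                    ITextUnit tu = event.getTextUnit();
                    System.out.println(tu.getSource().toString());
                }
            }
        } finally {
            filter.close();
        }
    }

    public static void main(String[] args) {
        // Same utility code, two different filters:
        listTextUnits(new HtmlFilter(),
                new RawDocument("<p>Hello <b>world</b></p>", LocaleId.ENGLISH));
        listTextUnits(new PropertiesFilter(),
                new RawDocument("greeting=Hello world", LocaleId.ENGLISH));
    }
}

Because listTextUnits depends only on the shared interface, the same routine works unchanged with any other filter, which is the interoperability the interface specifications aim for.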
Components
There are two main types of components:
- Filters — Several filter components are implemented, including ones for HTML, OpenOffice.org, Microsoft Office files, Java properties files, .NET ResX files, table-type files (e.g. CSV), Gettext PO files, XLIFF, TMX, Qt TS files, regular-expression-based formats, XML (including support for the Internationalization Tag Set), etc.
- Utilities — Several utility components are implemented, including: text extraction and merging, RTF-to-text conversion, encoding conversion, line-break conversion, term extraction, translation comparison, quality checking, pseudo-translation, text rewriting, etc.
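To make the utilities category more concrete, here is a minimal, self-contained sketch of one of the simpler operations listed above, pseudo-translation. It does not use the Okapi components themselves; it only shows the general idea of rewriting extracted text so that hard-coded or length-constrained strings become visible during testing.

import java.util.Map;

public class PseudoTranslationDemo {

    // Replace plain ASCII letters with accented look-alikes, so text that
    // was never extracted for translation stands out in the running UI.
    private static final Map<Character, Character> ACCENTED = Map.of(
            'a', 'á', 'e', 'é', 'i', 'í', 'o', 'ó', 'u', 'ú',
            'A', 'Å', 'E', 'É', 'O', 'Ø');

    static String pseudoTranslate(String source) {
        StringBuilder sb = new StringBuilder("[");
        for (char c : source.toCharArray()) {
            sb.append(ACCENTED.getOrDefault(c, c));
        }
        // Padding simulates the expansion that real translations often cause.
        int padding = Math.max(2, source.length() / 3);
        sb.append("~".repeat(padding)).append("]");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(pseudoTranslate("Open file"));  // prints [Ópén fílé~~~]
        System.out.println(pseudoTranslate("Save As..."));
    }
}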
Applications
Some of the applications built on the framework are:
- Rainbow — an application that provides a simple user interface for launching any of the Okapi utility components.
- Tikal — a command-line tool that lets you run any of the Okapi utilities from a command prompt or a batch file.
- Ratel — an application to create and modify segmentation rules in SRX format.
- CheckMate — an application to perform quality checks on translated documents.
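As an illustration of what such a quality check can involve, the following toy sketch applies two elementary rules (empty targets, and targets left identical to their source) to a few segment pairs. It is only a conceptual example, not CheckMate's implementation; the actual checks are far more extensive and configurable.

import java.util.LinkedHashMap;
import java.util.Map;

public class QualityCheckDemo {

    // Flag two of the most basic translation problems: an empty target,
    // and a target that was left identical to its source.
    static String check(String source, String target) {
        if (target == null || target.isBlank()) {
            return "WARNING: empty target";
        }
        if (target.equals(source)) {
            return "WARNING: target identical to source";
        }
        return "ok";
    }

    public static void main(String[] args) {
        Map<String, String> segments = new LinkedHashMap<>();
        segments.put("Open file", "Ouvrir le fichier");
        segments.put("Cancel", "Cancel");   // left untranslated
        segments.put("Save As...", "");     // missing translation

        segments.forEach((src, trg) ->
                System.out.printf("%-12s -> %-20s %s%n", src, trg, check(src, trg)));
    }
}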