TeX4ht
TeX4ht is a configurable converter that translates TeX and LaTeX documents to HTML and certain XML formats. Most notably, it is used to convert (La)TeX documents to formats used by word processors. It was developed by Eitan M. Gurari.

The program is published under the LaTeX Project Public License (LPPL).

History

TeX4ht was developed in the 1990s to convert (La)TeX to HTML, helping to publish scientific documents written in (La)TeX on the World Wide Web for display in a web browser. In particular, hypertext features were supported, so it became possible to include hyperlinks in the web version of documents.

Support for more XML-based formats was added gradually. As of 2010, XHTML, MathML, OpenDocument, DocBook, and TEI are supported. JavaHelp can also be generated.

TeX4ht is now included preconfigured with the major TeX distributions, such as TeX Live and MiKTeX.

Since Eitan M. Gurari's death, the program has been maintained by Radhakrishnan CV and Karl Berry.

Function

TeX4ht does not directly transform TeX or LaTeX markup into the output markup language (HTML etc.). Instead, an ordinary (La)TeX run first compiles the source into a DVI file, which TeX4ht then processes. Other converters, most notably LaTeX2HTML and TtH, operate in a single pass.
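
The two-stage process is usually driven by a wrapper script rather than by hand. The sketch below assumes the `htlatex` wrapper that ships with TeX4ht is on the search path. The helper names (`build_htlatex_command`, `expected_outputs`, `convert`), the file name `paper.tex`, and the `xhtml` option are illustrative choices and not part of TeX4ht itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_htlatex_command(source, options=""):
    """Build the argument list for the htlatex wrapper script."""
    command = ["htlatex", str(source)]
    if options:
        # Output options are passed as one comma-separated string.
        command.append(options)
    return command


def expected_outputs(source):
    """Name the intermediate DVI file and the final HTML file."""
    stem = Path(source).stem
    return {"dvi": f"{stem}.dvi", "html": f"{stem}.html"}


def convert(source, options=""):
    """Run htlatex if it is installed; return True on success."""
    if shutil.which("htlatex") is None:
        return False  # TeX4ht is not available on this machine
    result = subprocess.run(build_htlatex_command(source, options), check=False)
    return result.returncode == 0
```

Calling `convert("paper.tex", "xhtml")` would run the (La)TeX stage and the DVI post-processing stage in one go, leaving `paper.dvi` as the intermediate file and `paper.html` as the result.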

TeX4ht can deal with essentially any (La)TeX source document that compiles successfully. It can also support publicly available macro packages and user-made (perhaps document-specific) commands for features that go beyond the standard TeX formats, such as bibliography management with BibTeX, because these extensions do not need corresponding implementations in the converter.

Mathematical formulae and other characters or symbols that cannot be displayed as text are converted into graphics.

TeX4ht can convert LaTeX documents into Microsoft Word's doc format via the OpenDocument text format (ODT).
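
One possible way to script this route is sketched below. It rests on two assumptions that go beyond the article: that the `make4ht` build tool for TeX4ht is installed and used for the ODT step, and that LibreOffice (`soffice`) is used to re-save the ODT file as doc. The function name and the file name `report.tex` are hypothetical.

```python
from pathlib import Path


def odt_then_doc_commands(source):
    """Return the two command lines: LaTeX -> ODT, then ODT -> DOC."""
    odt_file = Path(source).with_suffix(".odt").name
    return [
        # Step 1: TeX4ht (driven by make4ht) writes an OpenDocument text file.
        ["make4ht", "-f", "odt", str(source)],
        # Step 2 (assumption): LibreOffice re-saves the ODT file as .doc.
        ["soffice", "--headless", "--convert-to", "doc", odt_file],
    ]
```

Each inner list can be passed to `subprocess.run` in order. Opening the ODT file in a word processor and saving it as doc by hand achieves the same result as the second step.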

Literature

  • Translating LaTeX to HTML using TeX4ht, in: Michel Goossens, Sebastian Rahtz, Eitan M. Gurari, Ross Moore, Robert S. Sutor. The LaTeX Web Companion. Integrating TeX, HTML, and XML. 1999. 8th printing January 2006. pp. 155–194.
  • Eitan Gurari, HTML Production, TUGboat 25 (2004), No. 1, pp. 39–47.
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 