LHA (file format)
LHA is a freeware data compression utility and associated file format. It was created in 1988 by Haruyasu Yoshizaki and was originally named LHarc. A complete rewrite of LHarc, tentatively named LHx, was eventually released as LH. It was then renamed LHA to avoid a conflict with the then-new MS-DOS 5.0 LH ("load high") command. According to early documentation, LHA is pronounced like La.

Although no longer much used in the West, LHA remains popular in Japan. It was used by id Software to compress installation files for their earlier games, including Doom and Quake. LHA has been ported to many operating systems, and is still the main archiving format used on the Amiga computer, although it was briefly replaced by LZX in the mid-1990s. The format's standing on the Amiga was due to Aminet, the world's largest archive of Amiga-related software and files, standardising on Stefan Boberg's implementation of LHA for the Amiga. Microsoft has released a Windows XP add-on, Microsoft Compressed (LZH) Folder Add-on, designed for the Japanese version of the operating system. The Japanese version of Windows 7 ships with the LZH folder add-on built in. Users of non-Japanese versions of Windows 7 Enterprise and Ultimate can also install the LZH folder add-on by installing the optional Japanese language pack from Windows Update.

Compression methods

In an LZH archive, the compression method is stored as a 5-byte text string. These are the third through seventh bytes of the file.
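As a rough illustration of the layout described above, the following Python sketch pulls the 5-byte method string out of the third through seventh bytes (offsets 2 to 6) of a header. The function names and the synthetic sample header are hypothetical and exist only for this example; it does not parse the rest of the header.

```python
def read_method_id(header: bytes) -> str:
    """Return the 5-byte method string stored at bytes 3-7 (offsets 2..6)."""
    if len(header) < 7:
        raise ValueError("need at least 7 bytes to read the method ID")
    return header[2:7].decode("ascii", errors="replace")


def method_id_of_file(path: str) -> str:
    """Read only the first 7 bytes of the file at `path` and extract the method ID."""
    with open(path, "rb") as f:
        return read_method_id(f.read(7))


# Synthetic example: two placeholder bytes, then the method string, then padding.
sample_header = b"\x00\x00" + b"-lh5-" + b"\x00" * 8
print(read_method_id(sample_header))  # prints -lh5-
```

For a real archive, `method_id_of_file("example.lzh")` would return a string such as one of the method names listed in the sections below (the file name here is a placeholder).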

Canonical LZH

LHarc compresses files using an algorithm from Yoshizaki's earlier LZHUF product, which was modified from LZARI, developed by Haruhiko Okumura, but uses Huffman coding instead of arithmetic coding. LZARI uses Lempel-Ziv-Storer-Szymanski with arithmetic coding.


-lh1-

This method was introduced in LHarc version 1.

It supports a 4 KiB sliding window and a maximum match length of 60 bytes. Dynamic Huffman encoding is used.
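To make the two parameters above concrete, here is a minimal sketch of a greedy sliding-window matcher in Python using a 4 KiB window and a 60-byte match limit. It is not LHarc's actual encoder or bitstream: the Huffman stage is omitted, the search is brute force, and the names `lz_tokens`, `lz_expand` and the `MIN_MATCH` threshold are assumptions made for the illustration.

```python
WINDOW_SIZE = 4096   # 4 KiB sliding window, as described for this method
MAX_MATCH = 60       # maximum match length, as described for this method
MIN_MATCH = 3        # assumption: shortest match worth emitting as a reference


def lz_tokens(data: bytes) -> list:
    """Greedy tokenizer: returns literal ints and (distance, length) pairs."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        start = max(0, pos - WINDOW_SIZE)          # oldest byte still in the window
        limit = min(MAX_MATCH, len(data) - pos)    # cap match at 60 bytes or end of input
        for cand in range(start, pos):
            length = 0
            # A match may overlap the current position, as in LZ77-family coders.
            while length < limit and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - cand
        if best_len >= MIN_MATCH:
            tokens.append((best_dist, best_len))
            pos += best_len
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz_expand(tokens: list) -> bytes:
    """Rebuild the original bytes from literals and back-references."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            dist, length = tok
            for _ in range(length):
                out.append(out[-dist])   # copy one byte at a time so overlaps work
        else:
            out.append(tok)
    return bytes(out)


sample = b"abcabcabcabc" * 10
tokens = lz_tokens(sample)
assert lz_expand(tokens) == sample
```

A real coder would then entropy-code the literals, lengths and distances; in this method's case the text above says dynamic Huffman encoding is used for that step.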

-lh4-, -lh5-, -lh6-, -lh7-

Methods 4, 5, 6 and 7 support 4, 8, 32 and 64 KiB sliding windows respectively, with a maximum match length of 256 bytes. Static Huffman encoding is used. -lh5- was first introduced in LHarc 2, followed by -lh6- in LHA 2.66 (MS-DOS) and -lh7- in LHA 2.67 beta (MS-DOS). LHA itself never compresses into -lh4-.
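The window sizes listed above can be written as a small lookup table. The Python sketch below is only an illustration: the names `LH_WINDOW_SIZES`, `window_size` and `offset_bits` are hypothetical, and `offset_bits` simply computes how many bits are needed to address any position in a window of that size, which is arithmetic on the listed sizes rather than a statement about how LHA encodes distances.

```python
KIB = 1024

# Sliding-window sizes for methods 4-7, taken from the description above.
LH_WINDOW_SIZES = {
    "-lh4-": 4 * KIB,
    "-lh5-": 8 * KIB,
    "-lh6-": 32 * KIB,
    "-lh7-": 64 * KIB,
}

MAX_MATCH_LENGTH = 256  # maximum match length shared by all four methods


def window_size(method_id: str) -> int:
    """Return the sliding-window size in bytes for one of the four methods."""
    try:
        return LH_WINDOW_SIZES[method_id]
    except KeyError:
        raise ValueError(f"not one of the -lh4- to -lh7- methods: {method_id!r}")


def offset_bits(method_id: str) -> int:
    """Bits needed to address every position inside the method's window."""
    return (window_size(method_id) - 1).bit_length()


for name in LH_WINDOW_SIZES:
    print(name, window_size(name), offset_bits(name))
```

Each doubling of the window adds one bit to the largest possible back-reference distance, which is one reason a larger window does not automatically give smaller output on short files.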


-lhd-

Technically this is not a compression method, but it is used in .LZH archives to indicate that the archived object is an empty directory.

-lh8-, -lh9-, -lha-, -lhb-, -lhc-, -lhe-

Dictionary sizes are 64, 128, 256, 512, 1024 and 2048 KiB respectively.

PMarc extensions

These compression methods were introduced by PMarc, a CP/M archiver created by Miyo. The archive usually has a .PMA extension.

LArc extensions

LArc uses the same file format as .LZH, but was written by Kazuhiko Miki, Haruhiko Okumura and Ken Masuyama, and uses the extension '.LZS' (http://www.lzh-zip.com/extension/ext31.html).


-lzs-

It supports a 2 KiB sliding window and a maximum match length of 17 bytes.


-lz5-

It supports a 4 KiB sliding window and a maximum match length of 17 bytes.


There are copies of LHICE marked as version 1.14. According to Okumura, LHICE was not written by Yoshi.

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.