Archive format
An archive format is the file format of an archive file. The archive format is determined by the file archiver. Some archive formats are well-defined by their authors and have become conventions supported by multiple vendors and/or open-source communities.

Archive formats support features such as file concatenation, data compression, encryption, file spanning, parity/cyclic redundancy check, checksum, self-extraction, self-installation, volume and directory structure information, package notes/description, and other metadata.
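
As an illustration of the integrity features listed above, the following is a minimal Python sketch that uses the standard `zipfile` and `zlib` modules. It assumes a ZIP archive built in memory; the member name `notes.txt` and the payload are hypothetical. It shows that the CRC-32 value recorded in the archive matches one recomputed from the original bytes.

```python
import io
import zipfile
import zlib

# Hypothetical payload, used only for illustration.
payload = b"archive formats store integrity data alongside file contents\n"

# Build a small ZIP archive in memory with one compressed member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("notes.txt", payload)

with zipfile.ZipFile(buffer) as archive:
    info = archive.getinfo("notes.txt")
    stored_crc = info.CRC                 # CRC-32 recorded in the archive metadata
    recomputed_crc = zlib.crc32(payload)  # CRC-32 of the original, uncompressed bytes
    first_bad_member = archive.testzip()  # None when every member passes its CRC check

print(f"stored CRC-32:     {stored_crc:#010x}")
print(f"recomputed CRC-32: {recomputed_crc:#010x}")
print("corrupt member:", first_bad_member)
```

A mismatch between the two values would indicate that the member's data changed after the archive was written.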

Types of Archive Formats

  • Archiving-only formats only concatenate files.
  • Compression-only formats only compress files.
  • Multi-function formats can concatenate, compress, encrypt, create error detection and recovery information, and repackage the archive into self-extracting/self-expanding files.
  • Software packaging formats are used to create software packages that may be self-installing files.
  • Disk image formats are used to create disk images or optical disc images of mass storage volumes.
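
The difference between the first two types can be sketched in Python with the standard `tarfile` and `gzip` modules. This is an illustrative example, not a definition of either format; the file names and contents are hypothetical. tar stands in for an archiving-only format and gzip for a compression-only format.

```python
import gzip
import io
import tarfile

# Hypothetical input files, kept in memory for the example.
files = {
    "a.txt": b"alpha " * 200,
    "b.txt": b"beta " * 200,
}
raw_total = sum(len(data) for data in files.values())

# Archiving only: tar stores each file behind a header and does not compress.
tar_buffer = io.BytesIO()
with tarfile.open(fileobj=tar_buffer, mode="w") as archive:
    for name, data in files.items():
        member = tarfile.TarInfo(name=name)
        member.size = len(data)
        archive.addfile(member, io.BytesIO(data))
tar_bytes = tar_buffer.getvalue()

# Compression only: gzip shrinks a single byte stream and does not bundle files.
gz_bytes = gzip.compress(files["a.txt"])

# The tar archive still knows about both files.
with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r") as archive:
    member_names = archive.getnames()

print("raw bytes:       ", raw_total)
print("tar bytes:       ", len(tar_bytes))
print("gzip of a.txt:   ", len(gz_bytes), "from", len(files["a.txt"]))
print("members in tar:  ", member_names)
```

The tar output is larger than the raw input because of headers and padding, while the gzip output is smaller than its single input file.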

Examples

Note: a comprehensive list of archive formats and a comparison of archive formats are available.

By Operating System

Unix operating systems use the tar, ar, and shar formats to concatenate files. These archives can then be compressed into gzip format.
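
The two-step approach described above (concatenate with tar, then compress with gzip) can be sketched with Python's standard `tarfile` module, whose `w:gz` mode performs both steps in one pass. The file names and contents below are hypothetical, and the archive is kept in memory for brevity.

```python
import io
import tarfile

# Hypothetical files to bundle.
files = {"readme.txt": b"hello\n", "data.bin": bytes(range(16))}

# Write a gzip-compressed tar archive into an in-memory buffer.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, data in files.items():
        member = tarfile.TarInfo(name=name)
        member.size = len(data)
        archive.addfile(member, io.BytesIO(data))

# The result starts with the gzip magic bytes 0x1f 0x8b.
is_gzip = buffer.getvalue()[:2] == b"\x1f\x8b"

# Read the archive back and recover every member's contents.
buffer.seek(0)
restored = {}
with tarfile.open(fileobj=buffer, mode="r:gz") as archive:
    for member in archive.getmembers():
        restored[member.name] = archive.extractfile(member).read()

print("gzip container:", is_gzip)
print("round trip ok: ", restored == files)
```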

On Windows platforms, the most widely used archive format is ZIP; other formats are CAB, RAR, and ACE. Windows Installer is a high-level archive format for distribution of software.

On Amiga computers, the standard archive format is LHA.

On Apple Macintosh computers, ZIP is natively supported in recent versions of Mac OS X (10.3+), though StuffIt used to be the most common format.

Linux often uses tar, gzip, and RPM Package Manager, a package management system for distribution of software.

Origins

Ubiquitous amongst Unix and Unix-like operating systems is the tar file format ("tape archive"). Originally intended for transferring files to and from tape, it is still used on disk-based storage to combine files before they are compressed.

Development

Historically, every major computer platform, every operating system, and every vendor had its own preferred archive format. Some formats became more commonly used because of licensing, feasibility, and popularity. Today the most common formats are supported by many platforms and vendors. New technologies continue to introduce new formats.

See also

  • List of archive formats
  • Comparison of archive formats
  • File archiver
  • Archive file
  • Container format (digital), a similar concept in media files
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 