Zlib
zlib is a software library used for data compression. zlib was written by Jean-Loup Gailly and Mark Adler and is an abstraction of the DEFLATE compression algorithm used in their gzip file compression program. zlib is also a crucial component of many software platforms, including Linux, Mac OS X, and iOS. It has also been used in gaming consoles such as the PlayStation 3, Wii, and Xbox 360.

The first public version of zlib, 0.9, was released on 1 May 1995 and was originally intended for use with the libpng image library. It is free software, distributed under the zlib license.

Encapsulation

zlib-compressed data is typically written with a gzip wrapper or a zlib wrapper. The wrapper encapsulates the raw DEFLATE data by adding a header and trailer. This provides stream identification and error detection, neither of which is provided by the raw DEFLATE data.

The gzip header is larger than the zlib header, as it stores a file name and other file system information. This is the header format used in the ubiquitous gzip file format.
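
The three encapsulations can be compared with a short sketch. It assumes Python's standard zlib module, where the wbits argument selects the wrapper (-15 for raw DEFLATE, 15 for the zlib wrapper, 31 for the gzip wrapper); the helper name deflate and the sample payload are illustrative only. The gzip header written this way is the minimal one and carries no file name.

```python
import zlib

def deflate(data: bytes, wbits: int) -> bytes:
    """Compress data using the container selected by wbits (illustrative helper)."""
    compressor = zlib.compressobj(wbits=wbits)
    return compressor.compress(data) + compressor.flush()

payload = b"hello, zlib! " * 20  # sample data, chosen only for the demonstration

raw_stream = deflate(payload, -15)   # raw DEFLATE: no header, no trailer
zlib_stream = deflate(payload, 15)   # zlib wrapper: short header, Adler-32 trailer
gzip_stream = deflate(payload, 31)   # gzip wrapper: larger header, CRC-32 and length trailer

# The same payload grows slightly with each richer wrapper.
print(len(raw_stream), len(zlib_stream), len(gzip_stream))

# A gzip stream is identifiable by its two-byte magic number.
print(gzip_stream[:2] == b"\x1f\x8b")
```

The decompressor must be told which wrapper to expect, for example zlib.decompress(gzip_stream, 31).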

Algorithm

zlib supports only one algorithm, DEFLATE, which combines a variation of LZ77 (Lempel–Ziv 1977) with Huffman coding.

This algorithm provides good compression on a wide variety of data with minimal use of system resources. It is also the algorithm used in the ZIP archive format.

It is unlikely that the zlib format will ever be extended to use any other algorithms, though the header makes allowance for this possibility.
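
The allowance in the header can be seen by inspecting the first byte of a zlib stream. The sketch below assumes Python's standard zlib module and the field layout described in RFC 1950, where the low four bits of the first byte name the compression method and the high four bits encode the window size; the variable names are illustrative.

```python
import zlib

stream = zlib.compress(b"example")

cmf = stream[0]                   # first header byte (called CMF in RFC 1950)
compression_method = cmf & 0x0F   # low four bits: which algorithm was used
window_bits = (cmf >> 4) + 8      # high four bits: log2 of the window size, minus 8

# DEFLATE is identified by the value 8; other values are left for other methods.
print(compression_method == zlib.DEFLATED)
print(window_bits)
```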

Resource use

The library provides facilities for controlling processor and memory use.

A compression level value may be supplied, which trades off speed against compression.

There are also facilities for conserving memory. These are probably only useful in restricted memory environments such as some embedded systems.
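
A minimal sketch of these controls, assuming Python's standard zlib module: the level argument sets the speed/compression trade-off, while wbits (window size) and memLevel (internal state size) reduce memory use. The helper compress_with and the sample text are illustrative, and the sizes obtained depend on the input.

```python
import zlib

def compress_with(data, level=-1, wbits=15, mem_level=8):
    """Compress data with explicit resource settings (illustrative helper)."""
    compressor = zlib.compressobj(level=level, method=zlib.DEFLATED,
                                  wbits=wbits, memLevel=mem_level)
    return compressor.compress(data) + compressor.flush()

data = b"The quick brown fox jumps over the lazy dog. " * 200

stored = compress_with(data, level=0)    # level 0: no compression, data is only wrapped
fastest = compress_with(data, level=1)   # level 1: favours speed
smallest = compress_with(data, level=9)  # level 9: favours compression

# Smallest window and smallest internal state, for memory-constrained systems.
frugal = compress_with(data, wbits=9, mem_level=1)

for name, out in [("stored", stored), ("fastest", fastest),
                  ("smallest", smallest), ("frugal", frugal)]:
    print(name, len(out))
```

A stream written with a small window can still be read by a decompressor configured for a larger one, so the reader needs no special settings here.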

Strategy

The compression can be optimized for specific types of data.

If you always use the library to compress a specific type of data, then using a specific strategy may improve compression and performance. For example, if your data contains long runs of repeated bytes, then the RLE (run-length encoding) strategy may give good results at higher speed.

For general data, the default strategy is preferred.
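
As a sketch of selecting a strategy, the following assumes Python's standard zlib module (version 3.6 or later, which exposes the Z_RLE constant). The helper name and the synthetic run-heavy input are illustrative; whether RLE actually beats the default strategy depends on the data and should be measured.

```python
import zlib

def compress_with_strategy(data, strategy):
    """Compress data using the given zlib strategy (illustrative helper)."""
    compressor = zlib.compressobj(level=6, method=zlib.DEFLATED, wbits=15,
                                  memLevel=8, strategy=strategy)
    return compressor.compress(data) + compressor.flush()

# Synthetic input: 64 runs of 500 identical bytes each.
runs = b"".join(bytes([value]) * 500 for value in range(64))

default_out = compress_with_strategy(runs, zlib.Z_DEFAULT_STRATEGY)
rle_out = compress_with_strategy(runs, zlib.Z_RLE)

print(len(runs), len(default_out), len(rle_out))
```

The strategy affects only the compressor; the output is ordinary DEFLATE data that any decompressor can read.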

Error handling

Errors may be detected and skipped.

Data corruption can be detected (as long as the data is written with a zlib or gzip header - see above).
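
Detection can be illustrated with a small sketch, assuming Python's standard zlib module, which raises zlib.error when the check value in the trailer does not match. The function name is_intact and the sample data are illustrative; here the damage is applied to the last byte of the stream, which belongs to the Adler-32 check value.

```python
import zlib

def is_intact(stream: bytes) -> bool:
    """Return True if the zlib stream decompresses and passes its integrity check."""
    try:
        zlib.decompress(stream)
        return True
    except zlib.error:
        return False

good = zlib.compress(b"important data " * 50)

damaged = bytearray(good)
damaged[-1] ^= 0xFF  # flip every bit of the final byte of the Adler-32 trailer

print(is_intact(good))
print(is_intact(bytes(damaged)))
```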

Further, if full-flush points are written to the compressed stream, then corrupt data can be skipped and decompression will resynchronise at the next flush point. (No error recovery of the corrupt data is provided.) Full-flush points are useful for large data streams on unreliable channels where some data loss is unimportant (e.g. multimedia). However, creating too many flush points can dramatically reduce speed and compression.
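
The effect of a full-flush point can be sketched as follows, assuming Python's standard zlib module. Two simplifications are assumptions of this sketch: it uses raw DEFLATE (wbits of -15) so that a fresh decompressor can start mid-stream without a header, and it remembers where the flush point lies instead of searching the stream for it.

```python
import zlib

first = b"first block of sensor readings " * 10
second = b"second block of sensor readings " * 10

compressor = zlib.compressobj(wbits=-15)  # raw DEFLATE keeps the sketch simple

# Z_FULL_FLUSH pads the output to a byte boundary and resets the compressor's
# history, so the data that follows does not depend on what came before.
part1 = compressor.compress(first) + compressor.flush(zlib.Z_FULL_FLUSH)
part2 = compressor.compress(second) + compressor.flush(zlib.Z_FULL_FLUSH)

# Suppose part1 was damaged in transit: skip it and restart at the flush point.
resumed = zlib.decompressobj(wbits=-15)
recovered = resumed.decompress(part2)

print(recovered == second)
```

The contents of the skipped part are lost, as the text notes; only the data after the flush point is recovered.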

Data length

There is no limit to the length of data that can be compressed or decompressed.

Repeated calls to the library allow an unlimited number of blocks of data to be handled. Some ancillary code (counters) may suffer from overflow for long data streams, but this does not affect the actual compression or decompression.
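
A minimal sketch of such repeated calls, assuming Python's standard zlib module and its streaming objects. The generator names and the block contents are illustrative; the point is that only one block needs to be held in memory at a time, regardless of the total length.

```python
import zlib

def compress_stream(chunks):
    """Yield compressed pieces for an iterable of byte blocks of any total length."""
    compressor = zlib.compressobj()
    for chunk in chunks:
        out = compressor.compress(chunk)
        if out:                      # the compressor may buffer and return nothing
            yield out
    yield compressor.flush()         # emit whatever is still buffered, plus the trailer

def decompress_stream(chunks):
    """Yield decompressed pieces for an iterable of compressed byte blocks."""
    decompressor = zlib.decompressobj()
    for chunk in chunks:
        out = decompressor.decompress(chunk)
        if out:
            yield out
    tail = decompressor.flush()
    if tail:
        yield tail

# One hundred 4 KiB blocks stand in for an arbitrarily long input.
blocks = [bytes([i % 251]) * 4096 for i in range(100)]
restored = b"".join(decompress_stream(compress_stream(blocks)))

print(restored == b"".join(blocks))
```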

When compressing a long (or infinite) data stream it would be advisable to write regular full-flush points.

Applications

Today, zlib is something of a de facto standard, to the point that zlib and DEFLATE are often used interchangeably in standards documents. Thousands of applications rely on it for compression, directly or indirectly, including:
  • The Linux kernel, where it is used to implement compressed network protocols and compressed file systems, and to decompress the kernel image itself at boot time.
  • libpng, the reference implementation for the PNG image format, which specifies DEFLATE as the stream compression for its bitmap data.
  • libwww, an API for web applications such as web browsers.
  • The Apache HTTP Server, which uses zlib to implement HTTP/1.1 compression.
  • The OpenSSH client and server, which rely on zlib to perform the optional compression offered by the Secure Shell protocol.
  • The OpenSSL and GnuTLS security libraries, which can optionally use zlib to compress TLS connections.
  • The FFmpeg multimedia library, which uses zlib to read and write the DEFLATE-compressed parts of stream formats such as Matroska.
  • The rsync remote file synchronizer, which uses zlib to implement optional protocol compression.
  • The dpkg and RPM package managers, which use zlib to unpack files from compressed software packages.
  • The Subversion and CVS version control systems, which use zlib to compress traffic to and from remote repositories.
  • The Git version control system, which uses zlib to store the contents of its data objects (blobs, trees, commits and tags).
  • The PostgreSQL RDBMS, which uses zlib with its custom dump format (pg_dump -Fc) for database backups.

zlib is also used in many embedded devices, such as the Apple iPhone and the Sony PlayStation 3, because the code is portable, liberally licensed, and has a relatively small memory footprint.

External links

  • zlib home page
  • RFC 1950—ZLIB Compressed Data Format
  • RFC 1951—DEFLATE Compressed Data Format
  • RFC 1952—GZIP file format
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.