Canterbury Corpus
Encyclopedia
The Canterbury Corpus is a collection of files
Computer file
A computer file is a block of arbitrary information, or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished...

 intended for use as a benchmark for testing lossless data compression
Lossless data compression
Lossless data compression is a class of data compression algorithms that allows the exact original data to be reconstructed from the compressed data. The term lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed, in exchange...

 algorithms. It was created in 1997 at the University of Canterbury
University of Canterbury
The University of Canterbury , New Zealand's second-oldest university, operates its main campus in the suburb of Ilam in the city of Christchurch, New Zealand...

, New Zealand
New Zealand
New Zealand is an island country in the south-western Pacific Ocean comprising two main landmasses and numerous smaller islands. The country is situated some east of Australia across the Tasman Sea, and roughly south of the Pacific island nations of New Caledonia, Fiji, and Tonga...

 and designed to replace the Calgary Corpus
Calgary Corpus
The Calgary Corpus is a collection of text and binary data files, commonly used for comparing data compression algorithms. It was created by Ian Witten, Tim Bell and John Cleary from the University of Calgary in 1987 and was commonly used in the 1990s...

.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK