Pearson hashing
Encyclopedia
Pearson hashing is a hash function
Hash function
A hash function is any algorithm or subroutine that maps large data sets to smaller data sets, called keys. For example, a single integer can serve as an index to an array...

 designed for fast execution on processors with 8-bit register
Processor register
In computer architecture, a processor register is a small amount of storage available as part of a CPU or other digital processor. Such registers are addressed by mechanisms other than main memory and can be accessed more quickly...

s. Given an input consisting of any number of bytes, it produces as output a single byte that is strongly dependent on every byte of the input. Its implementation requires only a few instructions, plus a 256-byte lookup table
Lookup table
In computer science, a lookup table is a data structure, usually an array or associative array, often used to replace a runtime computation with a simpler array indexing operation. The savings in terms of processing time can be significant, since retrieving a value from memory is often faster than...

 containing a permutation
Permutation
In mathematics, the notion of permutation is used with several slightly different meanings, all related to the act of permuting objects or values. Informally, a permutation of a set of objects is an arrangement of those objects into a particular order...

 of the values 0 through 255.

This hash function is a CBC-MAC
CBC-MAC
In cryptography, a cipher block chaining message authentication code , is a technique for constructing a message authentication code from a block cipher. The message is encrypted with some block cipher algorithm in CBC mode to create a chain of blocks such that each block depends on the proper...

 that uses an 8-bit random block cipher implemented via the permutation table. An 8-bit
block cipher
Block cipher
In cryptography, a block cipher is a symmetric key cipher operating on fixed-length groups of bits, called blocks, with an unvarying transformation. A block cipher encryption algorithm might take a 128-bit block of plaintext as input, and output a corresponding 128-bit block of ciphertext...

 has negligible cryptographic security, so the Pearson hash function is not cryptographically strong; but it offers these benefits:
  • It is extremely simple.
  • It executes quickly on resource-limited processors.
  • There is no simple class of inputs for which collision
    Hash collision
    Not to be confused with wireless packet collision.In computer science, a collision or clash is a situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest....

    s (identical outputs) are especially likely.
  • Given a small, privileged set of inputs (e.g., reserved word
    Reserved word
    Reserved words are one type of grammatical construct in programming languages. These words have special meaning within the language and are predefined in the language’s formal specifications...

    s for a compiler
    Compiler
    A compiler is a computer program that transforms source code written in a programming language into another computer language...

    ), the permutation table can be adjusted so that those inputs yield distinct hash values, producing what is called a perfect hash function
    Perfect hash function
    A perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions. A perfect hash function has many of the same applications as other hash functions, but with the advantage that no collision resolution has to be implemented.- Properties...

    .


The algorithm can be described by the following pseudocode
Pseudocode
In computer science and numerical computation, pseudocode is a compact and informal high-level description of the operating principle of a computer program or other algorithm. It uses the structural conventions of a programming language, but is intended for human reading rather than machine reading...

, which computes the hash of message C using the permutation table T:


h := 0
for each c in C loop
index := h xor c
h := T[index]
end loop
return h
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK