Linear probing
Encyclopedia
Linear probing is a scheme in computer programming
Computer programming
Computer programming is the process of designing, writing, testing, debugging, and maintaining the source code of computer programs. This source code is written in one or more programming languages. The purpose of programming is to create a program that performs specific operations or exhibits a...

 for resolving hash collision
Hash collision
Not to be confused with wireless packet collision.In computer science, a collision or clash is a situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest....

s of values of hash function
Hash function
A hash function is any algorithm or subroutine that maps large data sets to smaller data sets, called keys. For example, a single integer can serve as an index to an array...

s by sequentially searching the hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...

 for a free location. This is accomplished using two values - one as a starting value and one as an interval between successive values in modular arithmetic
Modular arithmetic
In mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" after they reach a certain value—the modulus....

. The second value, which is the same for all keys and known as the stepsize, is repeatedly added to the starting value until a free space is found, or the
entire table is traversed.

newLocation = (startingValue + stepSize) % arraySize

This algorithm, which is used in open-addressed hash table
Hash table
In computer science, a hash table or hash map is a data structure that uses a hash function to map identifying values, known as keys , to their associated values . Thus, a hash table implements an associative array...

s, provides good memory caching (if stepsize is equal to one), through good locality of reference, but also results in clustering, an unfortunately high probability
Probability
Probability is ordinarily used to describe an attitude of mind towards some proposition of whose truth we arenot certain. The proposition of interest is usually of the form "Will a specific event occur?" The attitude of mind is of the form "How certain are we that the event will occur?" The...

 that where there has been one collision there will be more. The performance of linear probing is also more sensitive to input distribution when compared to double hashing
Double hashing
Double hashing is a computer programming technique used in hash tables to resolve hash collisions, cases when two different values to be searched for produce the same hash key...

.

Given an ordinary hash function H(x), a linear probing function (H(x, i)) would be:
Here H(x) is the starting value, n the size of the hash table, and the stepsize is i in this case.

Dictionary operation in constant time

Using linear probing, dictionary operation can be implemented in constant time. In other words, insert, remove and find operations can be implemented in O(1), as long as the load factor
Load factor
Load factor may refer to:* Load factor , the ratio of the lift of an aircraft to its weight* Load factor , the ratio of the number of records to the number of addresses within a data structure...

 of the hash table is a constant strictly less than one. This analysis makes the (unrealistic) assumption that the hash function is completely random, but can be extended also to 5-independent hash functions
K-independent hashing
A family of hash functions is said to be k-independent or k-universal if selecting a hash function at random from the family guarantees that the hash codes of any designated k keys are independent random variables...

. Weaker properties, such as universal hashing
Universal hashing
Using universal hashing refers to selecting a hash function at random from a family of hash functions with a certain mathematical property . This guarantees a low number of collisions in expectation, even if the data is chosen by an adversary...

, are not strong enough to ensure the constant-time operation of linear probing, but one practical method of hash function generation, tabulation hashing
Tabulation hashing
In computer science, tabulation hashing is a method for constructing universal families of hash functions by combining table lookup with exclusive or operations...

, again leads to a guaranteed constant expected time performance despite not being 5-independent.

See also

  • Double hashing
    Double hashing
    Double hashing is a computer programming technique used in hash tables to resolve hash collisions, cases when two different values to be searched for produce the same hash key...

  • Hash collision
    Hash collision
    Not to be confused with wireless packet collision.In computer science, a collision or clash is a situation that occurs when two distinct pieces of data have the same hash value, checksum, fingerprint, or cryptographic digest....

  • Hash function
    Hash function
    A hash function is any algorithm or subroutine that maps large data sets to smaller data sets, called keys. For example, a single integer can serve as an index to an array...

  • Quadratic probing
    Quadratic probing
    Quadratic probing is a scheme in computer programming for resolving collisions in hash tables.It is an open addressing method to handle overflows after a collision takes place in some bucket of a hash table....

  • Hash table#Collision resolution

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK