Y-fast trie
Type Trie
Invented 1982
Invented by Dan Willard
Asymptotic complexity in big O notation
Space O(n)
Search O(log log M)
Insert O(log log M) amortized
Delete O(log log M) amortized


In computer science, a y-fast trie is a data structure for storing integers from a bounded domain. It supports exact and predecessor or successor queries in time O(log log M), using O(n) space, where n is the number of stored values and M is the maximum value in the domain. The structure was proposed by Dan Willard in 1982 to decrease the O(n log M) space used by an x-fast trie.

Structure

A y-fast trie consists of two data structures: the top half is an x-fast trie and the lower half consists of a number of balanced binary search trees. The keys are divided into groups of O(log M) consecutive elements, and a balanced binary search tree is created for each group. To facilitate efficient insertion and deletion, each group contains at least (log M)/4 and at most 2 log M elements. For each balanced binary search tree a representative r is chosen, and these representatives are stored in the x-fast trie. A representative r need not be an element of the tree associated with it, but it must be an integer that is smaller than both the successor of r (the next representative in the x-fast trie) and the minimum element of that successor's tree, and greater than both the predecessor of r and the maximum element of that predecessor's tree. Initially, the representative of a tree is an integer between the minimum and maximum element of its tree.
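The grouping described above can be illustrated with a minimal Python sketch. It is not part of the original description: the function name build_groups, the use of plain sorted lists in place of balanced binary search trees, and the choice of each group's maximum as its representative are all assumptions made for illustration.

```python
import math

def build_groups(keys, M):
    """Split the keys into groups of about log2(M) consecutive elements.

    Returns a list of (representative, group) pairs. Sorted lists stand in
    for the balanced binary search trees (an assumption of this sketch).
    """
    w = max(1, math.ceil(math.log2(M)))          # log M, the target group size
    keys = sorted(keys)
    groups = [keys[i:i + w] for i in range(0, len(keys), w)]
    # Fold a too-short tail into the previous group so that every group
    # keeps at least (log M)/4 elements; the merged group stays below 2 log M.
    if len(groups) > 1 and len(groups[-1]) < w / 4:
        tail = groups.pop()
        groups[-1].extend(tail)
    # Assumed choice: the maximum of each group serves as its representative.
    return [(g[-1], g) for g in groups]
```

With M = 256 (so log M = 8), 34 keys end up in four groups of 8 and one group of 2, each within the size bounds given above.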

Since the x-fast trie stores O(n / log M) representatives and each representative occurs in O(log M) hash tables, this part of the y-fast trie uses O(n) space. The balanced binary search trees store n elements in total which uses O(n) space. Hence, in total a y-fast trie uses O(n) space.
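The space argument can be written out as a single line. The concrete numbers in the second line (n = 2^20 keys, M = 2^32) are an illustrative assumption, not values from the original description.

```latex
\[
\underbrace{O\!\left(\frac{n}{\log M}\right)}_{\text{representatives}}
\cdot
\underbrace{O(\log M)}_{\text{hash tables per representative}}
+
\underbrace{O(n)}_{\text{binary search trees}}
= O(n)
\]
\[
n = 2^{20},\; M = 2^{32}:\qquad
\frac{2^{20}}{32} \cdot 32 + 2^{20} = 2^{21} \text{ entries, up to constant factors.}
\]
```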

Operations

Like van Emde Boas trees and x-fast tries, y-fast tries support the operations of an ordered associative array. This includes the usual associative array operations, along with two more order operations, Successor and Predecessor:
  • Find(k): find the value associated with the given key
  • Successor(k): find the key/value pair with the smallest key larger than or equal to the given key
  • Predecessor(k): find the key/value pair with the largest key less than or equal to the given key
  • Insert(k, v): insert the given key/value pair
  • Delete(k): remove the key/value pair with the given key

Find

A key k can be stored either in the tree of the smallest representative r greater than k or in the tree of the predecessor of r, since the representative of a binary search tree need not be an element stored in its tree. Hence, we first find the smallest representative r greater than k in the x-fast trie. Using this representative, we retrieve the predecessor of r. These two representatives point to two balanced binary search trees, both of which we search for k.

Finding the smallest representative r greater than k in the x-fast trie takes O(log log M) time. Using r, finding its predecessor takes constant time, because the leaves of an x-fast trie are kept in a doubly linked list. Searching the two balanced binary search trees, which contain O(log M) elements each, takes O(log log M) time. Hence, a key k can be found, and its value retrieved, in O(log log M) time.
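The lookup can be sketched in Python as follows. This is a simplified illustration under stated assumptions: a sorted list of representatives searched with bisect stands in for the x-fast trie (so the sketch does not achieve the O(log log M) bound), sorted lists stand in for the balanced binary search trees, values are omitted, and the names find, contains, reps and trees are hypothetical.

```python
from bisect import bisect_left, bisect_right

def contains(sorted_keys, k):
    """Binary search in one group (stand-in for a balanced BST lookup)."""
    i = bisect_left(sorted_keys, k)
    return i < len(sorted_keys) and sorted_keys[i] == k

def find(reps, trees, k):
    """Return True if k is stored.

    reps  : sorted list of representatives (stand-in for the x-fast trie)
    trees : dict mapping each representative to its sorted list of keys
    """
    i = bisect_right(reps, k)                  # index of the smallest representative > k
    candidates = reps[max(i - 1, 0):i + 1]     # its predecessor and the representative itself
    return any(contains(trees[r], k) for r in candidates)

# Representative 10 is not stored in its own tree, and 12 is stored under 10.
reps = [10, 20, 30]
trees = {10: [3, 7, 12], 20: [15, 18, 24], 30: [27, 33, 40]}
```

In this example find(reps, trees, 12) inspects the trees of 10 and 20 and finds 12 under representative 10, even though 12 is larger than that representative.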

Successor and Predecessor

Similarly to the key k itself, its successor can be stored either in the tree of the smallest representative r greater than k or in the tree of the predecessor of r. Hence, to find the successor of a key k, we first search the x-fast trie for the smallest representative greater than k. Next, we use this representative to retrieve its predecessor in the x-fast trie. These two representatives point to two balanced binary search trees, which we search for the successor of k.

Finding the smallest representative r greater than k in the x-fast trie takes O(log log M) time and using r to find its predecessor takes constant time. Searching the two balanced binary search trees containing O(log M) elements each takes O(log log M) time. Hence, the successor of a key k can be found, and its value retrieved, in O(log log M) time.

Searching for the predecessor of a key k is highly similar to finding its successor. We search the x-fast trie for the largest representative r smaller than k and use r to retrieve its successor in the x-fast trie. Finally, we search the two balanced binary search trees of these two representatives for the predecessor of k. This takes O(log log M) time.
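Both order queries can be sketched in Python under the same simplifying assumptions as before: a sorted list of representatives stands in for the x-fast trie, sorted lists stand in for the balanced binary search trees, and all names are hypothetical. The sketch also looks at one additional neighbouring tree as a fallback. That step is an addition of this sketch rather than part of the description above; it covers the case in which both candidate trees hold only keys on the wrong side of k, which the loose rule for representatives allows.

```python
from bisect import bisect_left, bisect_right

def successor(reps, trees, k):
    """Smallest stored key >= k, or None if there is none."""
    i = bisect_right(reps, k)                    # smallest representative > k
    # Predecessor of that representative, the representative itself,
    # and (sketch-only fallback) the next representative.
    for r in reps[max(i - 1, 0):i + 2]:
        t = trees[r]
        j = bisect_left(t, k)
        if j < len(t):
            return t[j]                          # first hit in ascending tree order
    return None

def predecessor(reps, trees, k):
    """Largest stored key <= k, or None if there is none."""
    i = bisect_left(reps, k) - 1                 # largest representative < k (or -1)
    # Successor of that representative, the representative itself,
    # and (sketch-only fallback) the previous representative.
    for r in reversed(reps[max(i - 1, 0):i + 2]):
        t = trees[r]
        j = bisect_right(t, k)
        if j > 0:
            return t[j - 1]                      # first hit in descending tree order
    return None

reps = [10, 20, 30]
trees = {10: [3, 7, 12], 20: [15, 18], 30: [27, 33, 40]}
```

Here successor(reps, trees, 13) is found in the tree of representative 20, while successor(reps, trees, 19) needs the fallback, because the trees of 10 and 20 contain only smaller keys.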

Insert

To insert a new key/value pair (k, v), we first need to determine into which balanced binary search tree k should be inserted. To this end, we find the tree T containing the successor of k. Next, we insert k into T. To ensure that all balanced binary search trees contain O(log M) elements, we check whether T now contains more than 2 log M elements. If it does, we split T into two balanced binary search trees and remove its representative from the x-fast trie. Each of the two new balanced binary search trees contains at most log M + 1 elements. We pick a representative for each tree and insert these into the x-fast trie.

Finding the successor of k takes O(log log M) time. Inserting k into a balanced binary search tree that contains O(log M) elements also takes O(log log M) time. Splitting a binary search tree that contains O(log M) elements can be done in O(log M) time. Finally, deleting the old representative and inserting the two new ones takes O(log M) time. However, since a tree is split at most once every Ω(log M) insertions and deletions, the O(log M) cost of a split amounts to constant amortized time per operation. Therefore, inserting a new key/value pair takes O(log log M) amortized time.
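Insertion with splitting can be sketched in Python as follows. Several choices here are assumptions of the sketch and not taken from the description above: a sorted list of representatives stands in for the x-fast trie, sorted lists stand in for the balanced binary search trees, values are omitted, the target tree is chosen as the one of the smallest representative greater than or equal to k (falling back to the last tree), and each new representative is the maximum of its half. This is a stricter rule for representatives than the description requires.

```python
import math
from bisect import bisect_left, insort

def insert(reps, trees, k, M):
    """Insert key k, splitting a group that grows beyond 2 log M elements.

    reps  : sorted list of representatives (stand-in for the x-fast trie)
    trees : dict mapping each representative to its sorted list of keys
    """
    limit = 2 * math.ceil(math.log2(M))          # 2 log M
    if not reps:                                 # first key: it represents itself
        reps.append(k)
        trees[k] = [k]
        return
    # Assumed rule: smallest representative >= k, else the last representative.
    i = min(bisect_left(reps, k), len(reps) - 1)
    r = reps[i]
    t = trees[r]
    j = bisect_left(t, k)
    if j < len(t) and t[j] == k:
        return                                   # key already present
    t.insert(j, k)
    if len(t) > limit:
        mid = len(t) // 2
        left, right = t[:mid], t[mid:]           # at most log M + 1 elements each
        del trees[r]                             # remove the old representative ...
        reps.pop(i)
        for half in (left, right):               # ... and insert two new ones
            trees[half[-1]] = half
            insort(reps, half[-1])

reps, trees = [], {}
for key in range(1, 10):
    insert(reps, trees, key, 16)                 # log M = 4, so a split happens at 9 keys
```

After the ninth insertion the single group of nine keys is split into [1, 2, 3, 4] and [5, 6, 7, 8, 9], with representatives 4 and 9.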

Delete

Deletions are very similar to insertions. We first find the key k in one of the balanced binary search trees and delete it from this tree T. To ensure that all balanced binary search trees contain O(log M) elements, we merge T with the balanced binary search tree of its successor or predecessor if T contains fewer than (log M)/4 elements. The representatives of the merged trees are removed from the x-fast trie. It is possible for the merged tree to contain more than 2 log M elements. If this is the case, the newly formed tree is split into two trees of about equal size. Next, we pick a new representative for each of the new trees and insert these into the x-fast trie.

Finding the key k takes O(log log M) time. Deleting k from a balanced binary search tree that contains O(log M) elements also takes O(log log M) time. Merging and possibly splitting the balanced binary search trees takes O(log M) time. Finally, deleting the old representatives and inserting the new representatives into the x-fast trie takes O(log M) time. Merging and possibly splitting the balanced binary search trees, however, is done at most once every Ω(log M) insertions and deletions. Hence, it takes constant amortized time. Therefore, deleting a key/value pair takes O(log log M) amortized time.
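Deletion with merging can be sketched in Python in the same style. The assumptions are again loud ones: a sorted list of representatives replaces the x-fast trie, sorted lists replace the balanced binary search trees, values are omitted, every key is assumed to be grouped under the smallest representative greater than or equal to it (or under the last representative), and new representatives are chosen as the maximum of their group. All names are hypothetical.

```python
import math
from bisect import bisect_left, insort

def delete(reps, trees, k, M):
    """Delete key k; merge (and possibly re-split) a group that becomes too small.

    Returns True if k was present. reps is the sorted list of representatives,
    trees maps each representative to its sorted list of keys.
    """
    w = math.ceil(math.log2(M))                  # log M
    if not reps:
        return False
    i = min(bisect_left(reps, k), len(reps) - 1) # assumed grouping rule
    t = trees[reps[i]]
    j = bisect_left(t, k)
    if j == len(t) or t[j] != k:
        return False                             # key not stored
    t.pop(j)
    if len(t) >= w / 4 or len(reps) == 1:
        return True                              # large enough, or no neighbour to merge with
    n = i + 1 if i + 1 < len(reps) else i - 1    # successor tree if any, else predecessor
    lo, hi = sorted((i, n))
    merged = trees.pop(reps[lo]) + trees.pop(reps[hi])
    del reps[lo:hi + 1]                          # remove both old representatives
    if len(merged) > 2 * w:                      # too large: split into two halves
        mid = len(merged) // 2
        parts = [merged[:mid], merged[mid:]]
    else:
        parts = [merged]
    for part in parts:
        trees[part[-1]] = part                   # assumed choice: maximum as representative
        insort(reps, part[-1])
    return True

reps = [5, 20]
trees = {5: [3, 5], 20: [9, 12, 20]}
removed = delete(reps, trees, 3, 256)            # log M = 8, so the minimum size is 2
```

Deleting 3 leaves the first group with a single key, so it is merged with its successor into [5, 9, 12, 20] under the single representative 20.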
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 