Burstsort
Encyclopedia
Burstsort and its variants are cache-efficient algorithms for sorting strings
String (computer science)
In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet....

 and are faster than quicksort and radix sort
Radix sort
In computer science, radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value...

 for large data set
Data set
A data set is a collection of data, usually presented in tabular form. Each column represents a particular variable. Each row corresponds to a given member of the data set in question. Its values for each of the variables, such as height and weight of an object or values of random numbers. Each...

s.

Burstsort algorithms use a trie
Trie
In computer science, a trie, or prefix tree, is an ordered tree data structure that is used to store an associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the...

 to store prefixes of strings, with growable arrays
Dynamic array
In computer science, a dynamic array, growable array, resizable array, dynamic table, or array list is a random access, variable-size list data structure that allows elements to be added or removed...

of pointers as end nodes containing sorted, unique, suffixes (referred to as buckets). Some variants copy the string tails into the buckets. As the buckets grow beyond a predetermined threshold, the buckets are "burst", giving the sort its name. A more recent variant uses a bucket index with smaller sub-buckets to reduce memory usage. Most implementations delegate to multikey quicksort, an extension of three-way radix quicksort, to sort the contents of the buckets. By dividing the input into buckets with common prefixes, the sorting can be done in a cache-efficient manner.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK