Cycle sort
Encyclopedia
Cycle sort is an in-place, unstable sorting algorithm
Sorting algorithm
In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most-used orders are numerical order and lexicographical order...

, a comparison sort
Comparison sort
A comparison sort is a type of sorting algorithm that only reads the list elements through a single abstract comparison operation that determines which of two elements should occur first in the final sorted list...

 that is theoretically optimal in terms of the total number of writes to the original array, unlike any other in-place sorting algorithm. It is based on the idea that the permutation
Permutation
In mathematics, the notion of permutation is used with several slightly different meanings, all related to the act of permuting objects or values. Informally, a permutation of a set of objects is an arrangement of those objects into a particular order...

 to be sorted can be factored into cycles
Cycle notation
In combinatorial mathematics, the cycle notation is a useful convention for writing down a permutation in terms of its constituent cycles. This is also called circular notation and the permutation called a cyclic or circular permutation....

, which can individually be rotated to give a sorted result.

Unlike nearly every other sort, items are never written elsewhere in the array simply to push them out of the way of the action. Each value is either written zero times, if it's already in its correct position, or written one time to its correct position. This matches the minimal number of overwrites required for a completed in-place sort.

Minimizing the number of writes is useful when making writes to some huge data set is very expensive, such as with EEPROM
EEPROM
EEPROM stands for Electrically Erasable Programmable Read-Only Memory and is a type of non-volatile memory used in computers and other electronic devices to store small amounts of data that must be saved when power is removed, e.g., calibration...

s or Flash memory
Flash memory
Flash memory is a non-volatile computer storage chip that can be electrically erased and reprogrammed. It was developed from EEPROM and must be erased in fairly large blocks before these can be rewritten with new data...

 where each write reduces the lifespan of the memory.

Algorithm

The following algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 finds cycles and rotates them, giving a sorted result. Note that range(a, b) goes from a to b ‑ 1.
  1. Sort an array in place and return the number of writes.

def cycleSort(array):
writes = 0

# Loop through the array to find cycles to rotate.
for cycleStart in range(0, len(array) - 1):
item = array[cycleStart]

# Find where to put the item.
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1

# If the item is already there, this is not a cycle.
if pos

cycleStart:
continue

# Otherwise, put the item there or right after any duplicates.
while item

array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1

# Rotate the rest of the cycle.
while pos != cycleStart:

# Find where to put the item.
pos = cycleStart
for i in range(cycleStart + 1, len(array)):
if array[i] < item:
pos += 1

# Put the item there or right after any duplicates.
while item

array[pos]:
pos += 1
array[pos], item = item, array[pos]
writes += 1

return writes

Specific-situation optimizations
When the array contains only duplicates of a relatively small number of items, a constant-time perfect hash function
Perfect hash function
A perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions. A perfect hash function has many of the same applications as other hash functions, but with the advantage that no collision resolution has to be implemented.- Properties...

 can greatly speed up finding where to put an item, turning the sort from Θ(n2) time to Θ(n + k) time, where k is the total number of hashes. The array ends up sorted in the order of the hashes, so choosing a hash function that gives you the right ordering is important.

Before the sort, create a histogram
Histogram
In statistics, a histogram is a graphical representation showing a visual impression of the distribution of data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson...

, sorted by hash, counting the number of occurrences of each hash in the array. Then create a table with the cumulative sum of each entry in the histogram. The cumulative sum table will then contain the position in the array of each element. The proper place of elements can then be found by a constant-time hashing and cumulative sum table lookup rather than a linear search
Linear search
In computer science, linear search or sequential search is a method for finding a particular value in a list, that consists of checking every one of its elements, one at a time and in sequence, until the desired one is found....

.
External links
"Cycle-Sort: A Linear Sorting Method", The Computer Journal (1990) 33 (4): 365-367.
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK