Bitonic sorter
Encyclopedia
Bitonic mergesort is a parallel algorithm
Parallel algorithm
In computer science, a parallel algorithm or concurrent algorithm, as opposed to a traditional sequential algorithm, is an algorithm which can be executed a piece at a time on many different processing devices, and then put back together again at the end to get the correct result.Some algorithms...

 for sorting. It is also used as a construction method for building a sorting network
Sorting network
A sorting network is an abstract mathematical model of a network of wires and comparator modules that is used to sort a sequence of numbers. Each comparator connects two wires and sorts the values by outputting the smaller value to one wire, and the larger to the other...

. The algorithm was devised by Ken Batcher
Ken Batcher
Ken Batcher is an emeritus professor of Computer Science at Kent State University. He also worked as a computer architect at Goodyear Aerospace in Akron, Ohio for 28 years. In 1964, Batcher received his Ph.D. in electrical engineering from the University of Illinois...

. The resulting sorting networks consist of O(n log2(n)) comparators and have a delay of O(log2(n)), where n is the number of items to be sorted.

A sorted sequence is a monotonically non-decreasing (or non-increasing) sequence. A bitonic sequence is a sequence with for some , or a circular shift of such a sequence.

How the algorithm works

The following is a bitonic sorting network with 16 inputs:
The 16 numbers enter at the inputs at the left end, slide along each of the 16 horizontal wires, end exit at the outputs at the right end. The network is designed to sort the elements, with the largest number at the bottom.

The arrows are comparators. Whenever two numbers reach the two ends of an arrow, they are compared to ensure that the arrow points toward the larger number. If they are out of order, they are swapped. The colored boxes are just for illustration and have no effect on the algorithm.

Every red box has the same structure: each input in the top half is compared to the corresponding input in the bottom half, with all arrows pointing down (dark red) or all up (light red). If the inputs happen to form a bitonic sequence, then the output will form two bitonic sequences. The top half of the output will be bitonic, and the bottom half will be bitonic, with every element of the top half less than or equal to every element of the bottom half (for dark red) or vice versa (for light red). This theorem is not obvious, but can be verified by carefully considering all the cases of how the various inputs might compare, using the zero-one principle.

The red boxes combine to form blue and green boxes. Every such box has the same structure: a red box is applied to the entire input sequence, then to each half of the result, then to each half of each of those results, and so on. All arrows point down (blue) or all point up (green). This structure is known as a butterfly network. If the input to this box happens to be bitonic, then the output will be completely sorted in increasing order (blue) or decreasing order (green). This property is obvious. If a number enters the blue or green box, then the first red box will sort it into the correct half of the list. It will then pass through a smaller red box that sorts it into the correct quarter of the list within that half. This continues until it is sorted into exactly the correct position. Therefore, the output of the green or blue box will be completely sorted.

The green and blue boxes combine to form the entire sorting network. For any arbitrary sequence of inputs, it will sort them correctly, with the largest at the bottom. This property is obvious. The output of each green or blue box will be a sorted sequence, so the output of each pair of adjacent lists will be bitonic, because the top one is blue and the bottom one is green. Each column of blue and green boxes takes N sorted sequences and concatenates them in pairs to form N/2 bitonic sequences, which are then sorted by the boxes in that column to form N/2 sorted sequences. This process starts with each input considered to be a sorted list of one element, and continues through all the columns until the last merges them into a single, sorted list. Because the last stage was blue, this final list will have the largest element at the bottom.

Each green box performs the same operation as a blue box, but with the sort in the opposite direction. So, each green box could be replaced by a blue box followed by a crossover where all the wires move to the opposite position. This would allow all the arrows to point the same direction, but would prevent the horizontal lines from being straight. However, a similar crossover could be placed to the right of the bottom half of the outputs from any red block, and the sort would still work correctly, because the reverse of a bitonic sequence is still bitonic. If a red box then has a crossover before and after it, it can be rearranged internally so the two crossovers cancel, so the wires become straight again. Therefore, the following diagram is equivalent to the one above, where each green box has become a blue plus a crossover, and each orange box is a red box that absorbed two such crossovers:
The arrowheads are not drawn, because every comparator sorts in the same direction. The blue and red blocks perform the same operations as before. The orange blocks are equivalent to red blocks where the sequence order is reversed for the bottom half of its inputs and the bottom half of its outputs. This is the most common representation of a bitonic sorting network.

Example code

The following is an implementation of the bitonic mergesort sorting algorithm in Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

. The input is a boolean value up, and a list x of length a power of 2. The output is a sorted list that is ascending if up is true, and decreasing otherwise.

def bitonic_sort(up,x):
if len(x)<=1:
return x
else:
first = bitonic_sort(True,x[:len(x)/2])
second = bitonic_sort(False,x[len(x)/2:])
return bitonic_merge(up,first+second)

def bitonic_merge(up,x):
# assume input x is bitonic, and sorted list is returned
if len(x)

1:
return x
else:
bitonic_compare(up,x)
first = bitonic_merge(up,x[:len(x)/2])
second = bitonic_merge(up,x[len(x)/2:])
return first + second

def bitonic_compare(up,x):
dist = len(x)/2
for i in range(dist):
if (x[i] > x[i+dist])

up:
x[i], x[i+dist] = x[i+dist], x[i] #swap

>>> bitonic_sort(True,[10,30,11,20,4,330,21,110])
[4, 10, 11, 20, 21, 30, 110, 330]
>>> bitonic_sort(False,[10,30,11,20,4,330,21,110])
[330, 110, 30, 21, 20, 11, 10, 4]


The implementation in Java:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BitonicMergeSort{

static > List bitonic_sort(boolean up, List list) {
if (list.size <= 1)
return list;
List first = new ArrayList(list.subList(0, list.size / 2));
first = bitonic_sort(up, first);
List second = new ArrayList(list.subList(list.size / 2, list.size));
second = bitonic_sort(!up, second);
first.addAll(second);
return bitonic_merge(up, first);
}

static > List bitonic_merge(boolean up, List list) {
if (list.size 1)
return list;
bitonic_compare(up, list);
List first = new ArrayList(list.subList(0, list.size / 2));
first = bitonic_merge(up, first);
List second = new ArrayList(list.subList(list.size / 2, list.size));
second = bitonic_merge(up, second);
first.addAll(second);
return first;
}

static > List bitonic_compare(boolean up, List list) {
int dist = list.size / 2;
for (int i = 0; i < dist; i++) {
int result = list.get(i).compareTo(list.get(i + dist));
if (up && result > 0 || !up && result < 0) {
T temp = list.get(i);
list.set(i, list.get(i + dist));
list.set(i + dist, temp);
}
}
return list;
}

public static void main(String[] args) {
List original = Arrays.asList(10, 30, 11, 20, 4, 330, 21, 110);
System.out.printf("Original List:\n %s \n", original);

List ascSorted = bitonic_sort(true, original);
System.out.printf("\nSorted List (Ascending order):\n %s \n", ascSorted);

List descSorted = bitonic_sort(false, original);
System.out.printf("\nSorted List (Descending order):\n %s \n", descSorted);
}
}


The below can be used on a parallel architecture to sort efficiently on multiple processors. The following code is very simplified - it assumes that 2^k processors (where k is an integer) are being used and the list is 2^n elements long. The code is written in MPICC. Comments are included for clarity.


...
int main(int argc, char **argv){

/* Set up MPI */
MPI_Init( &argc, &argv );
MPI_Comm_rank( MPI_COMM_WORLD, &my_rank ); /* Get My Process Rank */
MPI_Comm_size( MPI_COMM_WORLD, &p ); /* How many processes are being used */

/* We will assume that the array (arr) of length SIZE, consists of random integers. */
/* The master process must distribute the list among all available processors. Each processor must receive the list */
/* and sort it in the proper order */
if( my_rank 0 ){

int i;
int n_p = SIZE / p; /* Number of elements per processor */

/* Store all needed local quantities for the master process */
localSize = n_p;
localUp = 1;
localArr = malloc( n_p * sizeof( int ) );

/* Distribute the array portions amongst all of the available processors */
/* Alternating the sort orders */
int up, currentIndex;
for(i = 1; i <= p; i++){
up = (i + 1) % 2; //Must be 0 for even numbered sets (odds) and 1 for odds
currentIndex = i * n_p;

MPI_Send( &up, 1, MPI_INT, i, tag, MPI_COMM_WORLD );
MPI_Send( &n_p, 1, MPI_INT, i, tag, MPI_COMM_WORLD );
MPI_Send( &(arr[currentIndex]), n_p, MPI_INT, i, tag, MPI_COMM_WORLD );
}//end for : sending to all processors

/* Load the first n/p elements into the master's local array */
/* (This was done first to allow the other processors to begin sorting) */
for(i = 0; i < n_p; i++){
localArr[i] = arr[i];
}//end for : Load the master's local array

/* Have the master processor sort the local portion of the list */
bitonic_sort( 1, localArr, 0, n_p - 1 );

}else{
/* Receive the assignment data from the master process */
int n, up;
MPI_Recv( &up, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &status );
MPI_Recv( &n, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &status );
int arr[n];
MPI_Recv( &arr, n, MPI_INT, source, tag, MPI_COMM_WORLD, &status );

/* Store all values into the local variables */
localSize = n;
localUp = up;
localArr = malloc( n * sizeof( int ) );
int i;
for(i = 0; i < n; i++){
localArr[i] = arr[i];
}//end for : load local array

/* Perform the Assigned sorting task*/
bitonic_sort( up, localArr, 0, n - 1 );

}//end if-else : sort initial lists

/* All processors must wait for the other to complete their local sorts before */
/* continuing the sort */
MPI_Barrier( MPI_COMM_WORLD );


/* Begin inter-processor merging operations */
int i,j;
for( i = 0; i < log_p + 1 ; i++ ){
for( j = i; j > 0; j-- ){

/* compare and exchange elements with the processor obtained by inverting jth bit of id*/
int id[p];
getBinary( my_rank, id, p);
int ithBit = id[i];
flipBit( id, j ); /* Flip the jth bit of id to find processor to exchange with*/
int partnerId = getDecimal( id, p);


if( ithBit 0 ){
if( my_rank > partnerId ){
swap_with( partnerId, 0);
}else if( my_rank < partnerId ){
swap_with( partnerId, 1);
}else{
bitonic_merge( 1, localArr, 0, localSize - 1 );
}//end if-else
}else{
if( my_rank > partnerId ){
swap_with( partnerId, 1);
}else if( my_rank < partnerId ){
swap_with( partnerId, 0);
}else{
bitonic_merge( 0, localArr, 0, localSize - 1 );
}//end if-else
}//end if-else

}//end for
}//end for


/* Perform 1 final local merge on all processors */
bitonic_merge( 1, localArr, 0, localSize - 1 );


MPI_Finalize; /* Perform Cleanup tasks */

...

}//end main method


External links
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK