Kochanski multiplication
Kochanski multiplication is an algorithm that allows modular arithmetic (multiplication or operations based on it, such as exponentiation) to be performed efficiently when the modulus is large (typically several hundred bits). This has particular application in number theory and in cryptography: for example, in the RSA cryptosystem and Diffie-Hellman key exchange.

The most common way of implementing large-integer multiplication in hardware is to express the multiplier in binary and process its bits one at a time, starting with the most significant bit, performing the following operations on an accumulator:
  1. Double the contents of the accumulator (if the accumulator stores numbers in binary, as is usually the case, this is a simple "shift left" that requires no actual computation).
  2. If the current bit of the multiplier is 1, add the multiplicand into the accumulator; if it is 0, do nothing.


For an n-bit multiplier, this will take n clock cycles (where each cycle does either a shift or a shift-and-add).
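
The two steps above can be sketched in software as follows. This is only an illustration of the shift-and-add loop, not a hardware description; the function name `shift_and_add` and the cycle counter are assumptions introduced here.

```python
def shift_and_add(multiplicand, multiplier):
    """Multiply two non-negative integers, most significant multiplier bit first.

    Returns (product, cycles); one cycle is counted per multiplier bit.
    """
    acc = 0
    cycles = 0
    for i in reversed(range(multiplier.bit_length())):
        acc <<= 1                      # step 1: double the accumulator
        if (multiplier >> i) & 1:      # step 2: conditionally add the multiplicand
            acc += multiplicand
        cycles += 1
    return acc, cycles


print(shift_and_add(13, 11))  # (143, 4): 11 is 1011 in binary, so 4 cycles
```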

To convert this into an algorithm for modular multiplication, with a modulus r, it is necessary to subtract r conditionally at each stage:

  1. Double the contents of the accumulator.
  2. If the result is greater than or equal to r, subtract r. (Equivalently, subtract r from the accumulator and store the result back into the accumulator if and only if it is non-negative.)
  3. If the current bit of the multiplier is 1, add the multiplicand into the accumulator; if it is 0, do nothing.
  4. If the result of the addition is greater than or equal to r, subtract r. If no addition took place, do nothing.


This algorithm works. However, it is critically dependent on the speed of addition.
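
A minimal software sketch of the four steps, assuming the multiplicand is already reduced (less than r) so that a single conditional subtraction suffices after each doubling and each addition. The function name `modmul_shift_add` is hypothetical.

```python
def modmul_shift_add(multiplicand, multiplier, r):
    """Return (multiplicand * multiplier) mod r by shift-and-add.

    Assumes 0 <= multiplicand < r and multiplier >= 0.
    """
    assert r > 0 and 0 <= multiplicand < r and multiplier >= 0
    acc = 0
    for i in reversed(range(multiplier.bit_length())):
        acc <<= 1                      # step 1: double
        if acc >= r:                   # step 2: full-width comparison
            acc -= r
        if (multiplier >> i) & 1:
            acc += multiplicand        # step 3: conditional add
            if acc >= r:               # step 4: another full-width comparison
                acc -= r
    return acc


print(modmul_shift_add(13, 11, 17))  # 7, since 143 = 8 * 17 + 7
```

Every `acc >= r` test needs the true value of the accumulator, which is why the speed of full-width addition and comparison dominates this method.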

Addition of long integers suffers from the problem that carries have to be propagated from right to left, and the final result is not known until this process has been completed. Carry propagation can be sped up with carry look-ahead logic, but this still makes addition very much slower than it needs to be (for 512-bit addition, addition with carry look-ahead is 32 times slower than addition without carries at all).

Non-modular multiplication can make use of carry-save adders, which save time by storing the carries from each digit position and using them later: for example, by computing 111111111111+000000000010 as 111111111121 instead of waiting for the carry to propagate through the whole number to yield the true binary value 1000000000001. That final propagation still has to be done to yield a binary result, but this only needs to be done once, at the very end of the multiplication.
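
The example in the previous paragraph can be reproduced with a small sketch. It models the carry-save idea as a list of per-column sums (each 0, 1, or 2) rather than as an actual adder circuit; both function names are assumptions made for illustration.

```python
def add_without_carries(a_bits, b_bits):
    """Add two equal-length bit strings column by column.

    Each column sum (0, 1, or 2) is left in place; no carry is propagated.
    """
    return [int(x) + int(y) for x, y in zip(a_bits, b_bits)]


def resolve_carries(digits):
    """Propagate the deferred carries once, giving the ordinary integer value."""
    value = 0
    for d in digits:              # most significant column first
        value = value * 2 + d
    return value


digits = add_without_carries("111111111111", "000000000010")
print("".join(map(str, digits)))      # 111111111121
print(bin(resolve_carries(digits)))   # 0b1000000000001
```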

Unfortunately the modular multiplication method outlined above needs to know the magnitude of the accumulated value at every step, in order to decide whether to subtract r: for example, if it needs to know whether the value in the accumulator is greater than 1000000000000, the carry-save representation 111111111121 is useless and needs to be converted to its true binary value for the comparison to be made.

It therefore seems that one can have either the speed of carry-save or modular multiplication, but not both.

Outline of the algorithm

The principle of the Kochanski algorithm is one of making guesses as to whether or not r should be subtracted, based on the most significant few bits of the carry-save value in the accumulator. Such a guess will be wrong some of the time, since there is no way of knowing whether latent carries in the less significant digits (which have not been examined) will invalidate the result of the comparison. Thus:

  • A subtraction may not have been made when one was required. In that case the result in the accumulator is greater than r (although the algorithm doesn't know it yet), and so after the next shift left, 2r will need to be subtracted from the accumulator.


  • A subtraction may have been made when one was not required. In that case the result in the accumulator is less than 0 (although the algorithm doesn't know it yet), and so after the next shift left, r or even 2r will need to be added back to the accumulator to make it positive again.


What is happening is essentially a race between the errors that result from wrong guesses, which double with every shift left, and the corrections made by adding or subtracting multiples of r based on a guess of what the errors may be.

It turns out that examining the most significant 4 bits of the accumulator is sufficient to keep the errors within bounds and that the only values that need to be added to the accumulator are -2r, -r, 0, +r, and +2r, all of which can be generated instantaneously by simple shifts and negations.

At the end of a complete modular multiplication, the true binary result of the operation has to be evaluated and it is possible that an additional addition or subtraction of r will be needed as a result of the carries that are then discovered; but the cost of that extra step is small when amortized over the hundreds of shift-and-add steps that dominate the overall cost of the multiplication.
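
The idea of guessing from the top bits and repairing errors later can be illustrated with a toy software model. This is a sketch under stated assumptions, not Kochanski's published circuit: the accumulator is held as a carry-save pair (s, c), the guess looks only at the bits above a cut-off t chosen here as four bits below the top of r, and the thresholds are picked for this sketch so that the accumulator provably stays between -r/2 and 3r/4. With these particular thresholds only the corrections -2r, -r, 0, and +r are ever used. All names (`csa`, `guess_modmul`, `t`, `u`, the thresholds) are hypothetical.

```python
def csa(x, y, z):
    """Carry-save add: return (s, c) with s + c == x + y + z, no carry chain."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def guess_modmul(a, b, r):
    """Toy model: (a * b) mod r with corrections guessed from top bits only.

    Assumes r > 0, 0 <= a < r, b >= 0.
    """
    assert r > 0 and 0 <= a < r and b >= 0
    t = max(r.bit_length() - 4, 0)    # low bits that the guess never examines
    u = 1 << t
    # Small-integer thresholds on the truncated estimate, precomputed from r.
    th_hi = -(-3 * r // (2 * u))      # ceil(1.5 * r / u)
    th_mid = -(-r // (2 * u))         # ceil(0.5 * r / u)
    th_lo = -(r // (2 * u))           # ceil(-0.5 * r / u)

    s, c = 0, 0
    for i in reversed(range(b.bit_length())):
        addend = a if (b >> i) & 1 else 0
        s, c = csa(s << 1, c << 1, addend)   # double, then conditionally add
        e = (s >> t) + (c >> t)              # estimate from the top bits only;
                                             # true value lies in [e*u, e*u + 2*u)
        if e >= th_hi:
            k = -2
        elif e >= th_mid:
            k = -1
        elif e >= th_lo:
            k = 0
        else:
            k = 1
        s, c = csa(s, c, k * r)              # apply the guessed correction

    total = s + c                # the single full carry-propagating addition
    if total < 0:                # final fix-up discovered only at the end
        total += r
    return total


print(guess_modmul(13, 11, 17))  # 7
```

In this model the error of each guess is less than 2u, which is at most r/4, so the thresholds leave enough slack for a wrong guess to be absorbed by the next step's correction.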

Alternatives

Brickell has published a similar algorithm that requires greater complexity in the electronics for each digit of the accumulator.

Montgomery multiplication is an alternative algorithm that processes the multiplier "backwards" (least significant digit first) and uses the least significant digit of the accumulator to control whether or not the modulus should be added/subtracted. This avoids the need for carries to propagate. However, the algorithm is impractical for single modular multiplications, since two or three additional Montgomery steps have to be performed to convert the operands into a special form before processing and to convert the result back into conventional binary at the end.
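
A bit-serial sketch of the Montgomery approach, assuming an odd modulus r. It shows the least-significant-bit-first loop and the extra conversion steps that make a single multiplication expensive. The function names are hypothetical, and the constant 2^(2n) mod r is computed with Python's built-in `pow` purely for convenience.

```python
def montgomery_step(a, b, r):
    """Return a * b * 2**(-n) mod r, where n = r.bit_length() and r is odd."""
    assert r % 2 == 1 and 0 <= a < r and 0 <= b < r
    acc = 0
    for i in range(r.bit_length()):   # least significant bit of a first
        if (a >> i) & 1:
            acc += b
        if acc & 1:                   # lowest bit decides whether to add r
            acc += r                  # acc is now even, so halving is exact
        acc >>= 1
    return acc - r if acc >= r else acc


def modmul_via_montgomery(a, b, r):
    """One ordinary modular product computed with four Montgomery steps."""
    n = r.bit_length()
    r2 = pow(2, 2 * n, r)                  # conversion constant 2**(2n) mod r
    a_m = montgomery_step(a, r2, r)        # a in Montgomery form
    b_m = montgomery_step(b, r2, r)        # b in Montgomery form
    p_m = montgomery_step(a_m, b_m, r)     # product, still in Montgomery form
    return montgomery_step(p_m, 1, r)      # back to conventional form


print(modmul_via_montgomery(13, 11, 17))  # 7
```

For a long exponentiation the conversions are paid once and the many intermediate products stay in Montgomery form, which is where the method becomes attractive.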
The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 