Lenstra elliptic curve factorization

The Lenstra elliptic curve factorization or the elliptic curve factorization method (ECM) is a fast, sub-exponential running time algorithm for integer factorization

Integer factorization

In number theory, integer factorization or prime factorization is the decomposition of a composite number into smaller non-trivial divisors, which when multiplied together equal the original integer....

which employs elliptic curve

Elliptic curve

In mathematics, an elliptic curve is a smooth, projective algebraic curve of genus one, on which there is a specified point O. An elliptic curve is in fact an abelian variety — that is, it has a multiplication defined algebraically with respect to which it is a group — and O serves as the identity...

s. For general purpose factoring, ECM is the third-fastest known factoring method. The second fastest is the multiple polynomial quadratic sieve

Quadratic sieve

The quadratic sieve algorithm is a modern integer factorization algorithm and, in practice, the second fastest method known . It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve...

and the fastest is the general number field sieve

General number field sieve

In number theory, the general number field sieve is the most efficient classical algorithm known for factoring integers larger than 100 digits...

. The Lenstra elliptic curve factorization is named after Hendrik Lenstra

Hendrik Lenstra

Hendrik Willem Lenstra, Jr. is a Dutch mathematician.-Biography:Lenstra received his doctorate from the University of Amsterdam in 1977 and became a professor there in 1978...

.

Practically speaking, ECM is considered a special purpose factoring algorithm as it is most suitable for finding small factors. , it is still the best algorithm for divisor

Divisor

In mathematics, a divisor of an integer n, also called a factor of n, is an integer which divides n without leaving a remainder.-Explanation:...

s not greatly exceeding 20 to 25 digits

Decimal

The decimal numeral system has ten as its base. It is the numerical base most widely used by modern civilizations....

(64 to 83 bits or so), as its running time is dominated by the size of the smallest factor p rather than by the size of the number n to be factored. Frequently, ECM is used to remove small factors from a very large integer with many factors; if the remaining integer is still composite, then it has only large factors and is factored using general purpose techniques. The largest factor found using ECM so far has 73 digits and was discovered on 6 March 2010 by Joppe Bos, Thorsten Kleinjung, Arjen Lenstra

Arjen Lenstra

Arjen Klaas Lenstra is a Dutch mathematician. He studied mathematics at the University of Amsterdam.He is currently a professor at the EPFL , in the Laboratory for Cryptologic Algorithms, and...

and Peter Montgomery

Peter Montgomery

Peter Lawrence Montgomery is an American mathematician who has published widely in the more mathematical end of the field of cryptography. He is currently a researcher in the cryptography group at Microsoft Research....

. Increasing the number of curves tested improves the chances of finding a factor, but they are not linear

Linear

In mathematics, a linear map or function f is a function which satisfies the following two properties:* Additivity : f = f + f...

with the increase in the number of digits.

Lenstra's elliptic curve factorization

The Lenstra elliptic curve factorization method to find a factor of the given number n works as follows:

Pick a random elliptic curve
Elliptic curve
In mathematics, an elliptic curve is a smooth, projective algebraic curve of genus one, on which there is a specified point O. An elliptic curve is in fact an abelian variety — that is, it has a multiplication defined algebraically with respect to which it is a group — and O serves as the identity...

over , given by an equation of the form y² = x³ + ax + b (mod n), and a non-trivial point
Point (geometry)
In geometry, topology and related branches of mathematics a spatial point is a primitive notion upon which other concepts may be defined. In geometry, points are zero-dimensional; i.e., they do not have volume, area, length, or any other higher-dimensional analogue. In branches of mathematics...

P on it. Say, pick first a point P=(x,y) with random non-zero coordinates x,y (mod n), then pick a random non-zero a (mod n), then take b = y² - x³ - ax (mod n).
We will compute certain multiples (k times) of the point P using the standard addition rule on our elliptic curve: given two points P,Q on the curve, their sum can be computed using the formulas given in the group law section of the article on elliptic curves. These formulas contain the "slope" of the line connecting P and Q, hence involve divisions between residue classes modulo n, which can be performed using the extended Euclidean algorithm
Extended Euclidean algorithm
The extended Euclidean algorithm is an extension to the Euclidean algorithm. Besides finding the greatest common divisor of integers a and b, as the Euclidean algorithm does, it also finds integers x and y that satisfy Bézout's identityThe extended Euclidean algorithm is particularly useful when a...

. In particular, division by some v (mod n) includes the calculation of the greatest common divisor
Greatest common divisor
In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...

gcd(v,n).

If the slope is of the form u/v with gcd(u,n)=1, then v=0 (mod n) means that the result of the -addition will be , the point at infinity on the curve. However, if gcd(v,n) is neither 1 nor n, then the -addition will not produce a meaningful point on the curve, which shows that our elliptic curve is not a group (mod n), but, more importantly for now, gcd(v,n) is a non-trivial factor of n.
Compute eP on the elliptic curve (mod n), where e is product of many small numbers: say, a product of small primes raised to small powers, as in the p − 1 algorithm, or the factorial
Factorial
In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n...

B! for some not too large B. This can be done efficiently, one small factor at a time. Say, to get B!P, first compute 2P, then 3(2P), then 4(3!P), and so on. Of course, B should be small enough so that B-wise -addition can be performed in reasonable time.
- If we were able to finish all the calculations above without encountering non-invertible elements (mod n), then we need to try again with some other curve and starting point.
- If we found at some stage (the point at infinity on the elliptic curve), then again we should start over with a new curve and starting point, since is the zero element for , so after this point we would be just repeating .
- If we encountered a gcd(v,n) at some stage that was neither 1 nor n, then we are done: it is a non-trivial factor of n.

The time complexity depends on the size of the factor and can be represented by O

Big O notation

In mathematics, big O notation is used to describe the limiting behavior of a function when the argument tends towards a particular value or infinity, usually in terms of simpler functions. It is a member of a larger family of notations that is called Landau notation, Bachmann-Landau notation, or...

E (mathematical constant)

The mathematical constant ' is the unique real number such that the value of the derivative of the function at the point is equal to 1. The function so defined is called the exponential function, and its inverse is the natural logarithm, or logarithm to base...

^{(√2 + o(1)) √(lnNatural logarithm
The natural logarithm is the logarithm to the base e, where e is an irrational and transcendental constant approximately equal to 2.718281828...

p ln ln p)}), where p is the smallest factor of n.

Why does the algorithm work?

If p and q are two prime divisors of n, then y² = x³ + ax + b (mod n) implies the same equation also (mod p) and (mod q). These two smaller elliptic curves with the

-addition are now genuine groups

Group (mathematics)

In mathematics, a group is an algebraic structure consisting of a set together with an operation that combines any two of its elements to form a third element. To qualify as a group, the set and the operation must satisfy a few conditions called group axioms, namely closure, associativity, identity...

. If these groups have N_p and N_q elements, respectively, then for any point P on the original curve, by Lagrange's theorem

Lagrange's theorem (group theory)

Lagrange's theorem, in the mathematics of group theory, states that for any finite group G, the order of every subgroup H of G divides the order of G. The theorem is named after Joseph Lagrange....

on the curve (mod p) implies that k divides N_p; moreover,

. The analogous statement holds for q. When the elliptic curve is chosen randomly, then N_p and N_q are random numbers close to p+1 and q+1, respectively (see below). Hence it is unlikely that most of the prime factors of N_p and N_q are the same, and it is quite likely that while computing eP, we will encounter some kP that is

(mod p) but not (mod q), or vice versa. When this is the case, kP does not exist on the original curve, and in the computations we found some v with either gcd(v,p)=p or gcd(v,q)=q, but not both. That is, gcd(v,n) gave a non-trivial factor of n.

ECM is at its core an improvement of the older p − 1 algorithm. The p − 1 algorithm finds prime factors p such that p − 1 is b-powersmooth

Smooth number

In number theory, a smooth number is an integer which factors completely into small prime numbers. The term seems to have been coined by Leonard Adleman. Smooth numbers are especially important in cryptography relying on factorization.-Definition:...

for small values of b. For any e, a multiple of p − 1, and any a relatively prime to p, by Fermat's little theorem

Fermat's little theorem

Fermat's little theorem states that if p is a prime number, then for any integer a, a p − a will be evenly divisible by p...

we have a^e ≡ 1 (mod

Modular arithmetic

In mathematics, modular arithmetic is a system of arithmetic for integers, where numbers "wrap around" after they reach a certain value—the modulus....

p). Then gcd

Greatest common divisor

In mathematics, the greatest common divisor , also known as the greatest common factor , or highest common factor , of two or more non-zero integers, is the largest positive integer that divides the numbers without a remainder.For example, the GCD of 8 and 12 is 4.This notion can be extended to...

(a^e − 1, n) is likely to produce a factor of n. However, the algorithm fails when p-1 has large prime factors, as is the case for numbers containing strong prime

Strong prime

In mathematics, a strong prime is a prime number with certain special properties. The definitions of strong primes are different in cryptography and number theory.- Definition in cryptography :...

s, for example.

ECM gets around this obstacle by considering the group

Group (mathematics)

of a random elliptic curve

Elliptic curve

over the finite field

Finite field

In abstract algebra, a finite field or Galois field is a field that contains a finite number of elements. Finite fields are important in number theory, algebraic geometry, Galois theory, cryptography, and coding theory...

Z_p, rather than considering the multiplicative group

Multiplicative group

In mathematics and group theory the term multiplicative group refers to one of the following concepts, depending on the context*any group \scriptstyle\mathfrak \,\! whose binary operation is written in multiplicative notation ,*the underlying group under multiplication of the invertible elements of...

of Z_p which always has order p − 1.

The order of the group of an elliptic curve over Z_p varies (quite randomly) between p + 1 − 2√p and p + 1 + 2√p by Hasse's theorem, and is likely to be smooth for some elliptic curves. Although there is no proof that a smooth group order will be found in the Hasse-interval, by using heuristic

Heuristic

Heuristic refers to experience-based techniques for problem solving, learning, and discovery. Heuristic methods are used to speed up the process of finding a satisfactory solution, where an exhaustive search is impractical...

probabilistic methods, the Canfield-Erdős-Pomerance theorem with suitably optimized parameter choices, and the L-notation

L-notation

L-notation is an asymptotic notation analogous to big-O notation, denoted as L_n[\alpha,c] for a bound variable n tending to infinity. Like big-O notation, it is usually used to roughly convey the computational complexity of a particular algorithm....

, we can expect to try L

L-notation

[√2 / 2, √2] curves before getting a smooth group order. This heuristic estimate is very reliable in practice.

An example

The following example is from , with some details added.

We want to factor n=455839. Let's choose the elliptic curve y² = x³ + 5x - 5, with the point P=(1,1) on it, and let's try to compute (10!)P.

First we compute 2P. The slope of the tangent line at P is s=(3x²+5)/(2y)=4, and then the coordinates of 2P=(x′,y′) are x′=s²-2x=14 and y′=s(x-x′)-y=4(1-14)-1=-53, all numbers understood (mod n). Just to check that this 2P is indeed on the curve: (-53)²=2809=14³+5·14-5.

Then we compute 3(2P). The slope of the tangent line at 2P is s=(3·14²+5)/(2(-53))=-593/106 (mod n). Using the Euclidean algorithm

Euclidean algorithm

In mathematics, the Euclidean algorithm is an efficient method for computing the greatest common divisor of two integers, also known as the greatest common factor or highest common factor...

: 455839=4300·106+39, then 106=2·39+28, then 39=28+11, then 28=2·11+6, then 11=6+5, then 6=5+1. Hence gcd(455839,106)=1, and working backwards (a version of the extended Euclidean algorithm

Extended Euclidean algorithm

The extended Euclidean algorithm is an extension to the Euclidean algorithm. Besides finding the greatest common divisor of integers a and b, as the Euclidean algorithm does, it also finds integers x and y that satisfy Bézout's identityThe extended Euclidean algorithm is particularly useful when a...

): 1=6-5=2·6-11=2·28-5·11=7·28-5·39=7·106-19·39=81707·106-19·455839. Hence 106^-1=81707 (mod 455839), and -593/106=133317 (mod 455839). Given this s, we can compute the coordinates of 2(2P), just as we did above: 4P=(259851,116255). Just to check that this is indeed a point on the curve: y²=54514=x³+5x-5 (mod 455839). After this, we can compute

.

We can similarly compute 4!P, and so on, but 8!P requires inverting 599 (mod 455839). The Euclidean algorithm gives that 455839 is divisible by 599, and we have found a factorization 455839=599·761.

The reason that this worked is that the curve (mod 599) has 640=2⁷·5 points, while (mod 761) it has 777=3·7·37 points. Moreover, 640 and 777 are the smallest positive integers k such that

on the curve (mod 599) and (mod 761), respectively. Since 8! is a multiple of 640 but not a multiple of 777, we have

on the curve (mod 599), but not on the curve (mod 761), hence the repeated addition broke down here, yielding the factorization.

The algorithm with projective coordinates

Before we consider the projective plane over

/~, first have a look at the 'normal' projective space

Projective space

In mathematics a projective space is a set of elements similar to the set P of lines through the origin of a vector space V. The cases when V=R2 or V=R3 are the projective line and the projective plane, respectively....

over

. Now, instead of studying the points, the lines through the origin will be studied. The line can be represented a non-zero point

if you use the equivalence relation ~ on it, given by:

if and only if there exists a non-zero number

such that

and

. Due to this equivalence relation the space will be called a plane. In the projective plane, points, denoted by

, are 'the same' as lines in a threedimensional space that go through the origin. Remark that the point

does not exist here, because these does not represent a line. Now we observe that almost all lines go to the

-plane, except from the lines parallel to this plane. This lines are in the projective plane the 'points of infinity' that are used in the affine

-plane above.

In the algorithm only the group structure of an elliptic curve over the field

is used. Therefore we not necessarily need the field

. A finite field will also provide a group structure on the elliptic curve. However, considering the same curve and operation over

/~ with

not a prime does not give a group. The Elliptic Curve Method makes use of the failure cases of the addition law.

We now state the algorithm in projective coordinates. The neutral element is then given by the point in infinity

. Let

be a (positive) integer and consider the elliptic curve (a set of points with some structure on it)

Pick in and let be unequal to zero.
Calculate . The elliptic curve is then in Weierstrass form given by and by using projective coordinates the elliptic curve is given by the homogenous equation . It has the point .
Choose an upperbound for this elliptic curve. Remark: You will only find factors if the group order of the elliptic curve over (denoted by #) is B-smooth
Smooth number
In number theory, a smooth number is an integer which factors completely into small prime numbers. The term seems to have been coined by Leonard Adleman. Smooth numbers are especially important in cryptography relying on factorization.-Definition:...

, which means that all prime factors of # have to be less or equal to .
Calculate .
Calculate (k times) in the ring . Note that if # is -smooth and is prime (and therefore is a field) that . However, if only # is B-smooth for some divisor of , the product might not be (0:1:0) because addition and multiplication are not well-defined if is not prime. In this case, a non-trivial divisor can be found!
If not, then go back to step 2. If this does occur, then you will notice this when simplifying the product .

In point 5 it is said that under the right circumstances a non-trivial divisor can be found. As pointed out in Lenstra's article (Factoring Integers with Elliptic Curves) the addition needs the assumption

. If

are not

and distinct (otherwise addition works similar, but a little different), then addition works as follows:

To calculate: , ,
,
,
,
.

If addition fails, this will be due calculating

fails. In particular, because

can not always be calculated if

is not prime (and therefore

is not a field). Without making use of

being a field, one could calculate:

,
,
,
, and simplify if possible.

This calculation is always legal and if the gcd of the

-coordinate with

is unequal to 1 or

, so when simplifying fails, then a non-trivial divisor of

is found.

Twisted Edwards curves

The use of Edwards curve

Edwards curve

In mathematics, an Edwards curve is a new representation of elliptic curves, discovered by Harold M. Edwards in 2007. The concept of elliptic curves over finite fields is widely used in elliptic curve cryptography...

s needs fewer modular multiplications and less time then the use of Montgomery curves or Weierstrass curves (other used methods). Using Edwards curves you can also find more primes.

Definition:
Let

be a field in which

, and let

with

. Then the twisted Edwards curve

is given by

An Edwards curve is a twisted Edwards curve in which

.

There are five known ways to build a set of point on an Edwards curve: the set of affine points, the set of projective points, the set of inverted points, the set of extended points and the set of completed points.

The set of affine points is given by:

.

The addition law is given by

. The point (0,1) is its neutral element and the negative of

.
The other representations are defined similar to how the projective Weierstrass curve follows from the affine.

Any elliptic curve

Elliptic curve

in Edwards form has a point of order 4. So the torsion group of an Edwards curve over

is isomorphic to either

.

The most interesting cases for ECM are

and

, since they force the group orders of the curve modulo primes to be divisible by 12 and 16 respectively.
The following curves have a torsion group isomorphic to

with point where and
with point where and

Every Edwards curve with a point of order 3 can be written in the ways shown above.
Curves with torsion group isomorphic to

and

can be found on http://eprint.iacr.org/2008/016 page 30-32.

Stage 2

The above text is about the first stage of elliptic curve factorisation. There one hopes to find a prime divisor

such that

is the neutral element of

.
In the second stage one hopes to have found a prime divisor

such that

has small prime order in

.

We hope the order to be between

and

, where

is determined in stage 1 and

is new stage 2 parameter.
Checking for a small order of

, can be done by computing

modulo

for each prime

Success probability using EECM-MPFQ

For speedup techniques using Edward curves and implementation results, see: http://eprint.iacr.org/2008/016 page 30-32.

Hyperelliptic Curve Method (HECM)

There are recent developments in using hyperelliptic curves to factor integers. Cosset shows in his article (of 2010) that one can build a hyperelliptic curve with genus two (so a curve

with

of degree 5) which gives the same result as using two 'normal' elliptic curves at the same time. By making use of the Kummer Surface calculation is more efficient. The disadvantages of the hyperelliptic curve (versus an elliptic curve) are compensated by this alternative way of calculating. Therefore Cosset roughly claims that using hyperelliptic curves for factorization is no worse than using elliptic curves.

External links

Factorization using the Elliptic Curve Method, a Java applet which uses ECM and switches to the Self-Initializing Quadratic Sieve
Quadratic sieve
The quadratic sieve algorithm is a modern integer factorization algorithm and, in practice, the second fastest method known . It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve...

when it is faster.
GMP-ECM, an efficient implementation of ECM.
ECMNet, an easy client-server implementation that works with several factorization projects.
pyecm, a python implementation of ECM. Much faster with psyco and/or gmpy.
Distributed computing project yoyo@Home Subproject ECM is a program for Elliptic Curve Factorization which is used by a couple of projects to find factors for different kind of numbers.

The source of this article is wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.