Autokey cipher
An autokey cipher is a cipher which incorporates the message (the plaintext) into the key. There are two forms of autokey cipher: key-autokey and text-autokey ciphers. A key-autokey cipher uses previous members of the keystream to determine the next element in the keystream. A text-autokey cipher uses the previous message text to determine the next element in the keystream.
In modern cryptography, self-synchronizing stream ciphers are autokey ciphers.
History
The first autokey cipher was invented by Girolamo Cardano, and contained a fatal defect. Like many autokey ciphers, it used the plaintext to encrypt itself; however, since there was no additional key, it was no easier for the intended recipient to read the message than for anyone else who knew that the cipher was being used. A number of attempts were made by other cryptographers to produce a system that was neither trivial to break nor too difficult for the intended recipient to decipher. Eventually one was invented in 1564 by Giovan Battista Bellaso using a "reciprocal table" with five alphabets of his invention, and another form was described in 1586 by Blaise de Vigenère with a similar reciprocal table of ten alphabets.
One popular form of autokey starts with a tabula recta, a square with 26 copies of the alphabet, the first line starting with 'A', the next line starting with 'B', and so on. To encrypt a plaintext, one locates the row for the first letter to be encrypted and the column for the first letter of the key. The letter where the row and column cross is the ciphertext letter.
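The table lookup described above can be sketched in a few lines of Python. This is only an illustration: the names TABULA_RECTA and lookup are hypothetical, and the sketch assumes uppercase letters A to Z.

```python
import string

ALPHABET = string.ascii_uppercase

# Build the tabula recta: row i is the alphabet rotated left by i places,
# so row 0 starts with 'A', row 1 with 'B', and so on.
TABULA_RECTA = [ALPHABET[i:] + ALPHABET[:i] for i in range(26)]


def lookup(plain_letter: str, key_letter: str) -> str:
    """Return the letter where the plaintext row and the key column cross."""
    row = ALPHABET.index(plain_letter)
    col = ALPHABET.index(key_letter)
    return TABULA_RECTA[row][col]


print(lookup("A", "Q"))  # Q
print(lookup("T", "U"))  # N
```

Because each row is a rotation of the alphabet, the lookup is equivalent to adding the two letter positions modulo 26.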
Giovan Battista Bellaso used the first letter of each word as a primer to start his text autokey. Blaise de Vigenère used as a primer an agreed-upon single letter of the alphabet.
The autokey cipher as used by members of the American Cryptogram Association is characterized by the way the key is generated: it starts with a relatively short keyword and appends the message to it. So if the keyword is "QUEENLY" and the message is "ATTACK AT DAWN", the key would be "QUEENLYATTACKATDAWN".
Plaintext: ATTACK AT DAWN...
Key: QUEENL YA TTACK AT DAWN....
Ciphertext: QNXEPV YT WTWP...
The ciphertext message would therefore be "QNXEPVYTWTWP".
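A minimal sketch of this keyword-plus-message scheme follows. The function names autokey_encrypt and autokey_decrypt are hypothetical, and the sketch assumes a non-empty keyword and an A to Z alphabet, with any other characters (such as spaces) dropped.

```python
import string

ALPHABET = string.ascii_uppercase


def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with the key formed by the keyword followed by the message itself."""
    letters = [c for c in plaintext.upper() if c in ALPHABET]
    key = (keyword.upper() + "".join(letters))[: len(letters)]
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(letters, key)
    )


def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt letter by letter, appending each recovered letter to the key."""
    letters = [c for c in ciphertext.upper() if c in ALPHABET]
    key = list(keyword.upper())
    plain = []
    for i, c in enumerate(letters):
        p = ALPHABET[(ALPHABET.index(c) - ALPHABET.index(key[i])) % 26]
        plain.append(p)
        key.append(p)  # the recovered plaintext extends the key
    return "".join(plain)


print(autokey_encrypt("ATTACK AT DAWN", "QUEENLY"))  # QNXEPVYTWTWP
print(autokey_decrypt("QNXEPVYTWTWP", "QUEENLY"))    # ATTACKATDAWN
```

Decryption has to proceed one letter at a time, because each key letter past the keyword is only known once the corresponding plaintext letter has been recovered.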
Cryptanalysis
Using an example message "meet at the fountain" encrypted with the keyword "KILT":
plaintext: MEETATTHEFOUNTAIN (unknown)
key: KILTMEETATTHEFOUN (unknown)
ciphertext: WMPMMXXAEYHBRYOCA (known)
We try common words, bigrams, trigrams etc. in all possible positions in the key. For example, "THE":
ciphertext: WMP MMX XAE YHB RYO CA
key: THE THE THE THE THE ..
plaintext: DFL TFT ETA FAX YRK ..
ciphertext: W MPM MXX AEY HBR YOC A
key: . THE THE THE THE THE .
plaintext: . TII TQT HXU OUN FHY .
ciphertext: WM PMM XXA EYH BRY OCA
key: .. THE THE THE THE THE
plaintext: .. WFI EQW LRD IKU VVW
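The three trials above amount to sliding the guessed key fragment along the ciphertext and recording the plaintext each position would imply. A minimal sketch, with the hypothetical helper name drag_crib and assuming uppercase A to Z input:

```python
import string

ALPHABET = string.ascii_uppercase


def drag_crib(ciphertext: str, crib: str) -> dict[int, str]:
    """Treat `crib` as the key at every offset and return the implied plaintext."""
    fragments = {}
    for start in range(len(ciphertext) - len(crib) + 1):
        window = ciphertext[start : start + len(crib)]
        fragments[start] = "".join(
            ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
            for c, k in zip(window, crib)
        )
    return fragments


fragments = drag_crib("WMPMMXXAEYHBRYOCA", "THE")
print(fragments[0], fragments[6], fragments[10])  # DFL ETA OUN
```

Offsets are counted from 0, so offset 10 corresponds to the "HBR" group in the second trial.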
We sort the plaintext fragments in order of likelihood:
unlikely <------------------> promising
EQW DFL TFT ... ... ... ... ETA OUN FAX
We know that a correct plaintext fragment will also appear in the key, shifted right by the length of the keyword. Similarly, our guessed key fragment ("THE") will also appear in the plaintext, shifted left by the same amount. So by guessing keyword lengths (probably between 3 and 12) we can reveal more plaintext and key.
Trying this with "OUN" (possibly after wasting some time with the others):
shift by 4:
ciphertext: WMPMMXXAEYHBRYOCA
key: ......ETA.THE.OUN
plaintext: ......THE.OUN.AIN
shift by 5:
ciphertext: WMPMMXXAEYHBRYOCA
key: .....EQW..THE..OU
plaintext: .....THE..OUN..OG
shift by 6:
ciphertext: WMPMMXXAEYHBRYOCA
key: ....TQT...THE...O
plaintext: ....THE...OUN...M
We see that a shift of 4 looks good (both of the others have unlikely Qs), so we shift the revealed "ETA" back by 4 into the plaintext:
ciphertext: WMPMMXXAEYHBRYOCA
key: ..LTM.ETA.THE.OUN
plaintext: ..ETA.THE.OUN.AIN
We have a lot to work with now. The keyword is probably 4 characters long ("..LT"), and we have some of the message:
M.ETA.THE.OUN.AIN
Because our plaintext guesses have an effect on the key 4 characters to the left, we get feedback on correct/incorrect guesses, so we can quickly fill in the gaps:
MEETATTHEFOUNTAIN
Cryptanalysis is easy because of the feedback between plaintext and key. A 3-character guess reveals 6 more characters, which then reveal further characters; this cascade lets us rule out incorrect guesses quickly.
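The cascade can be sketched as a small propagation loop. This is an illustrative sketch rather than a complete attack: the name cascade is hypothetical, the keyword length is simply assumed, and no check is made for contradictory letters.

```python
import string

ALPHABET = string.ascii_uppercase


def sub(a: str, b: str) -> str:
    """Return the letter a - b (mod 26)."""
    return ALPHABET[(ALPHABET.index(a) - ALPHABET.index(b)) % 26]


def cascade(ciphertext: str, crib: str, position: int, keyword_len: int):
    """Propagate a guessed key fragment under an assumed keyword length."""
    n = len(ciphertext)
    key = [None] * n
    plain = [None] * n
    for offset, letter in enumerate(crib):
        key[position + offset] = letter
    changed = True
    while changed:
        changed = False
        for i in range(n):
            # A known key letter reveals the plaintext letter at the same position.
            if key[i] is not None and plain[i] is None:
                plain[i] = sub(ciphertext[i], key[i])
                changed = True
            # A known plaintext letter reveals the key letter at the same position.
            if plain[i] is not None and key[i] is None:
                key[i] = sub(ciphertext[i], plain[i])
                changed = True
            # Past the keyword, the key is the plaintext shifted right by keyword_len.
            if i >= keyword_len:
                j = i - keyword_len
                if key[i] is not None and plain[j] is None:
                    plain[j] = key[i]
                    changed = True
                if plain[j] is not None and key[i] is None:
                    key[i] = plain[j]
                    changed = True

    def show(seq):
        return "".join(ch or "." for ch in seq)

    return show(key), show(plain)


key, plain = cascade("WMPMMXXAEYHBRYOCA", "THE", 10, 4)
print(key)    # K.LTM.ETA.THE.OUN
print(plain)  # M.ETA.THE.OUN.AIN
```

Running the same call with an assumed keyword length of 5 or 6 produces key letters containing Q, which matches the reasoning used above to reject those shifts.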
Autokey in modern ciphers
Modern autokey ciphers use very different encryption methods, but they follow the same approach of using either key bytes or plaintext bytes to generate more key bytes. Most modern stream ciphers are based on pseudorandom number generators: the key is used to initialize the generator, and either key bytes or plaintext bytes are fed back into the generator to produce more bytes.
Some stream ciphers are said to be "self-synchronizing", because the next key byte usually depends only on the previous N bytes of the message. If a byte in the message is lost or corrupted, therefore, the keystream will also be corrupted, but only until N bytes have been processed. At that point the keystream goes back to normal, and the rest of the message will decrypt correctly.
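The recovery behaviour can be demonstrated with a toy construction. Everything here is an assumption made for illustration: the "message" is taken to mean the transmitted ciphertext, the keystream byte is derived by hashing the key together with the last N ciphertext bytes using SHA-256, and N is set to 4. It is not a real or secure cipher design.

```python
import hashlib

N = 4  # number of previous ciphertext bytes each keystream byte depends on


def keystream_byte(key: bytes, history: bytes) -> int:
    """Derive one keystream byte from the key and the last N ciphertext bytes."""
    return hashlib.sha256(key + history).digest()[0]


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    history = bytes(N)  # assumed all-zero starting state
    out = bytearray()
    for p in plaintext:
        c = p ^ keystream_byte(key, history)
        out.append(c)
        history = history[1:] + bytes([c])
    return bytes(out)


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    history = bytes(N)
    out = bytearray()
    for c in ciphertext:
        out.append(c ^ keystream_byte(key, history))
        history = history[1:] + bytes([c])  # fed by the received ciphertext
    return bytes(out)


message = b"MEET AT THE FOUNTAIN"
sent = encrypt(b"KILT", message)
damaged = bytearray(sent)
damaged[5] ^= 0xFF  # corrupt one byte in transit
received = decrypt(b"KILT", bytes(damaged))
# Byte 5 and up to N bytes after it may be wrong; the rest decrypts correctly.
print(received[5 + N + 1 :] == message[5 + N + 1 :])  # True
```

Once the damaged byte has passed out of the N-byte history window, the receiver's keystream matches the sender's again without any resynchronization step.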