Crypt (Unix)
Encyclopedia
In Unix
Unix
Unix is a multitasking, multi-user computer operating system originally developed in 1969 by a group of AT&T employees at Bell Labs, including Ken Thompson, Dennis Ritchie, Brian Kernighan, Douglas McIlroy, and Joe Ossanna...

 computing, crypt is the name of both a utility program
Utility software
Utility software is system software designed to help analyze, configure, optimize or maintain a computer. A single piece of utility software is usually called a utility or tool....

 and a C programming
C (programming language)
C is a general-purpose computer programming language developed between 1969 and 1973 by Dennis Ritchie at the Bell Telephone Laboratories for use with the Unix operating system....

 function. Though both are used for encrypting data, they are otherwise essentially unrelated. To distinguish between the two, writers often refer to the utility program as crypt(1), because it is documented in section 1 of the Unix manual pages
Manual page (Unix)
Man pages are the extensive documentation that comes preinstalled with almost all substantial Unix and Unix-like operating systems. The Unix command used to display them is man. Each page is a self-contained document.- Usage :...

, and refer to the C library function as crypt(3), because its documentation is in manual section 3.

Command filter crypt(1)

crypt(1) is a simple command to encrypt or decrypt data. Usually this is used as a filter
Filter (Unix)
In Unix and Unix-like operating systems, a filter is a program that gets most of its data from its standard input and writes its main results to its standard output . Unix filters are often used as elements of pipelines...

 and it has traditionally been implemented using an algorithm
Algorithm
In mathematics and computer science, an algorithm is an effective method expressed as a finite list of well-defined instructions for calculating a function. Algorithms are used for calculation, data processing, and automated reasoning...

 based on the Enigma machine
Enigma machine
An Enigma machine is any of a family of related electro-mechanical rotor cipher machines used for the encryption and decryption of secret messages. Enigma was invented by German engineer Arthur Scherbius at the end of World War I...

. It is considered to be far too cryptographically
Cryptography
Cryptography is the practice and study of techniques for secure communication in the presence of third parties...

 weak to provide any security against brute force attacks by modern, commodity personal computer
Personal computer
A personal computer is any general-purpose computer whose size, capabilities, and original sales price make it useful for individuals, and which is intended to be operated directly by an end-user with no intervening computer operator...

s.

Some versions of Unix shipped with an even weaker version of the crypt(1) command in order to comply with contemporaneous laws and regulations which limited the exportation of cryptographic software (for example by classifying them as munitions). Some of these were simply implementations of the Caesar cipher
Caesar cipher
In cryptography, a Caesar cipher, also known as a Caesar's cipher, the shift cipher, Caesar's code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number...

 (effectively no more secure than ROT13
ROT13
ROT13 is a simple substitution cipher used in online forums as a means of hiding spoilers, punchlines, puzzle solutions, and offensive materials from the casual glance. ROT13 has been described as the "Usenet equivalent of a magazine printing the answer to a quiz upside down"...

 which is implemented as a Caesar cipher with a well known key).

crypt(1) under Linux

Linux distribution
Linux distribution
A Linux distribution is a member of the family of Unix-like operating systems built on top of the Linux kernel. Such distributions are operating systems including a large collection of software applications such as word processors, spreadsheets, media players, and database applications...

s generally do not include a Unix compatible version of the crypt command. This is largely due to a combination of three major factors:
  1. crypt is relatively obscure and rarely used for e-mail attachments nor as a file format
  2. crypt is considered far too cryptographically weak to withstand brute force attack
    Brute force attack
    In cryptography, a brute-force attack, or exhaustive key search, is a strategy that can, in theory, be used against any encrypted data. Such an attack might be utilized when it is not possible to take advantage of other weaknesses in an encryption system that would make the task easier...

    s by modern computing systems (Linux systems generally ship with GNU Privacy Guard
    GNU Privacy Guard
    GNU Privacy Guard is a GPL Licensed alternative to the PGP suite of cryptographic software. GnuPG is compliant with RFC 4880, which is the current IETF standards track specification of OpenPGP...

     which is considered to be reasonably secure by modern standards)
  3. During the early years of Linux development and adoption there was some concern that even as weak as the algorithm used by crypt was, that it might still run afoul of ITAR's export controls; so mainstream distribution developers in the United States
    United States
    The United States of America is a federal constitutional republic comprising fifty states and a federal district...

     generally excluded it (and left their customers to fetch GnuPG/GPG or other strong cryptographic software from international sites, sometimes providing packages or scripts to automate that process).


The source code to several old versions of the crypt command is available in The Unix Heritage Society's Unix Archive.

Enhanced symmetric encryption
Symmetric-key algorithm
Symmetric-key algorithms are a class of algorithms for cryptography that use trivially related, often identical, cryptographic keys for both encryption of plaintext and decryption of ciphertext. The encryption key is trivially related to the decryption key, in that they may be identical or there is...

 utilities are available for Linux (and should also be portable
Software portability
Portability in high-level computer programming is the usability of the same software in different environments. The prerequirement for portability is the generalized abstraction between the application logic and system interfaces...

 to any other Unix-like
Unix-like
A Unix-like operating system is one that behaves in a manner similar to a Unix system, while not necessarily conforming to or being certified to any version of the Single UNIX Specification....

 system) including mcrypt
Mcrypt
mcrypt is a replacement for the popular UNIX crypt command.crypt was a file encryption tool that used an algorithm very close to theWorld War II enigma cipher, which was broken. Mcrypt provides the same functionality but uses several modern algorithms such as AES...

 and ccrypt
Ccrypt
ccrypt is a utility for the secure encryption and decryption of files and streams. It was designed as a replacement for the standard UNIX crypt utility, which is notorious for using a very weak encryption algorithm....

. While these provide support for much more sophisticated and modern algorithms, they can be used to encrypt and decrypt files which are compatible with the traditional crypt(1) command by providing the correct command line options.

Breaking crypt(1) encryption

Programs for breaking crypt(1) encryption are widely available. Bob Baldwin's Crypt Breaker's Workbench, which was written in 1984-1985, is an interactive tool that provides successive plaintext guesses that must be corrected by the user. Peter Selinger's unixcrypt-breaker uses a simple statistical model to guess plausible plaintexts, and does not require user interaction.

Library Function crypt(3)

crypt(3) is the library function which is used to compute a password hash
Cryptographic hash function
A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the hash value, such that an accidental or intentional change to the data will change the hash value...

 that can be used to store user account passwords while keeping them relatively secure (a passwd
Passwd (file)
In Unix-like operating systems the /etc/passwd file is a text-based database of information about users that may login to the system or other operating system user identities that own running processes....

file). The output of the function is not simply the hash— it is a text string which also encode
Encode
Encode may refer to:* Can be related to "Code"* Encode ApS, a Danish software company* Encode SA, a Greek information security company* ENCODE, the ENCyclopedia Of DNA Elements...

s the salt
Salt (cryptography)
In cryptography, a salt consists of random bits, creating one of the inputs to a one-way function. The other input is usually a password or passphrase. The output of the one-way function can be stored rather than the password, and still be used for authenticating users. The one-way function...

 (usually the first two characters), and identifies the hash algorithm used (defaulting to the "traditional" one explained below). This output string is what is meant for putting in a password record which may be stored in a plain text file.

This same crypt(3) function is used both to generate a new hash for storage and also to hash a proffered password with a recorded salt for comparison.

If the salt begins with the string $digit$ then the Modular Crypt Format is used. The digit represents which algorithm is used in encryption.

The crypt library function is also included in the Perl
Perl
Perl is a high-level, general-purpose, interpreted, dynamic programming language. Perl was originally developed by Larry Wall in 1987 as a general-purpose Unix scripting language to make report processing easier. Since then, it has undergone many changes and revisions and become widely popular...

http://perldoc.perl.org/functions/crypt.html, PHP
PHP
PHP is a general-purpose server-side scripting language originally designed for web development to produce dynamic web pages. For this purpose, PHP code is embedded into the HTML source document and interpreted by a web server with a PHP processor module, which generates the web page document...

http://us.php.net/manual/en/function.crypt.php, Pikehttp://pike.ida.liu.se/generated/manual/modref/ex/predef_3A_3A/crypt.html, Python
Python (programming language)
Python is a general-purpose, high-level programming language whose design philosophy emphasizes code readability. Python claims to "[combine] remarkable power with very clear syntax", and its standard library is large and comprehensive...

http://docs.python.org/library/crypt.html, and Ruby
Ruby (programming language)
Ruby is a dynamic, reflective, general-purpose object-oriented programming language that combines syntax inspired by Perl with Smalltalk-like features. Ruby originated in Japan during the mid-1990s and was first developed and designed by Yukihiro "Matz" Matsumoto...

http://ruby-doc.org/core/classes/String.html#M001174 programming languages.

Traditional DES-based scheme

The traditional implementation uses a modified form of the DES
Data Encryption Standard
The Data Encryption Standard is a block cipher that uses shared secret encryption. It was selected by the National Bureau of Standards as an official Federal Information Processing Standard for the United States in 1976 and which has subsequently enjoyed widespread use internationally. It is...

 algorithm. The user's password is truncated to eight characters, and those are coerced down to only 7-bits each; this forms the 56-bit DES key. That key is then used to encrypt an all-bits-zero block, and then the ciphertext is encrypted again with the same key, and so on for a total of 25 DES encryptions. A 12-bit salt is used to perturb the encryption algorithm, so standard DES implementations can't be used to implement crypt. The salt and the final ciphertext are encoded into a printable string in a form of base64
Base64
Base64 is a group of similar encoding schemes that represent binary data in an ASCII string format by translating it into a radix-64 representation...

.

This is technically not encryption since the data (all bits zero) is not being kept secret; it's widely known to all in advance. However, one of the properties of DES is that it's very resistant to key recovery even in the face of known plaintext situations. It is theoretically possible that two different passwords could result in exactly the same hash. Thus the password is never "decrypted": it is merely used to compute a result, and the matching results are presumed to be proof that the passwords were "the same."

The advantages of this method have been that the password can be stored in plain text and copied among Unix systems without being exposed to the system administrators or other users. This portability has worked for over 30 years across many generations of computing architecture, and across many versions of Unix from many vendors.

Modifications of the traditional scheme

crypt(3) was originally chosen because DES was resistant to key recovery even in the face of "known plaintext" attacks, and because it was computationally expensive. On the earliest Unix machines it took over a full second to compute a password hash. This also made it reasonably resistant to dictionary attack
Dictionary attack
In cryptanalysis and computer security, a dictionary attack is a technique for defeating a cipher or authentication mechanism by trying to determine its decryption key or passphrase by searching likely possibilities.-Technique:...

s in that era. At that time password hashes were commonly stored in an account file (/etc/passwd) which was readable to anyone on the system. (This account file was also used to map user ID numbers into names, and user names into full names, etc.).

In the three decades since that time, computers have become vastly more powerful. Moore's Law
Moore's Law
Moore's law describes a long-term trend in the history of computing hardware: the number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years....

 has generally held true, so the computer speed and capacity available for a given financial investment has doubled over 20 times since Unix was first written. This has long since left the crypt(3) function vulnerable to dictionary attacks, and Unix and Unix-like systems such as Linux
Linux
Linux is a Unix-like computer operating system assembled under the model of free and open source software development and distribution. The defining component of any Linux system is the Linux kernel, an operating system kernel first released October 5, 1991 by Linus Torvalds...

 have used "shadow" files
Shadow password
In computing, Unix-like operating systems use the shadow password database mechanism to increase the security level of passwords by restricting all but highly privileged users' access to encrypted password data...

 for a long time, migrating just the password hash values out of the account file (/etc/passwd) and into a file (conventionally named /etc/shadow) which can only be read by privileged processes.

To increase the computational cost of password breaking, some Unix sites privately started increasing the number of encryption rounds on an ad hoc basis. This had the side effect of making their crypt incompatible with the standard crypt: the hashes had the same textual form, but were now calculated using a different algorithm. Some sites also took advantage of this incompatibility effect, by modifying the initial block from the standard all-bits-zero. This did not increase the cost of hashing, but meant that precomputed hash dictionaries based on the standard crypt could not be applied.

BSDi extended DES-based scheme

To gain greater cryptographic security and resistance to brute-force attacks, modern versions of Unix now have a variety of new password hash schemes implemented using the crypt interface. BSDi
Berkeley Software Design
Berkeley Software Design Inc. was a corporation which developed, sold licenses for, and supported BSD/OS , a commercial and partially proprietary variant of the BSD Unix operating system for PC compatible computer systems...

 modified the original DES-based scheme, extending the salt to 24 bits and making the number of rounds variable (up to 224-1). The chosen number of rounds is encoded in the stored password hash, avoiding the incompatibility that occurred when sites modified the number of rounds used by the original scheme. These hashes are identified by starting with _.

The BSDi algorithm also supports longer passwords, using DES to fold the initial long password down to the eight 7-bit bytes supported by the original algorithm.

MD5-based scheme

Poul-Henning Kamp
Poul-Henning Kamp
Poul-Henning Kamp is a Danish FreeBSD developer, responsible for implementation of the widely used MD5 password hash algorithm, a vast quantity of systems code, including the FreeBSD GEOM storage layer, GBDE cryptographic storage transform, part of the UFS2 file system implementation, FreeBSD...

 designed a baroque and (at the time) computationally expensive algorithm based on the MD5
MD5
The MD5 Message-Digest Algorithm is a widely used cryptographic hash function that produces a 128-bit hash value. Specified in RFC 1321, MD5 has been employed in a wide variety of security applications, and is also commonly used to check data integrity...

 message digest algorithm. MD5 itself would provide good cryptographic strength for the password hash, but it is designed to be quite quick to calculate relative to the strength it provides. The crypt scheme is designed to be expensive to calculate, to slow down dictionary attacks. The printable form of MD5 password hashes starts with $1$.

This scheme allows users to have any length password, and they can use any characters supported by their platform (not just 7-bit ASCII). (In practice many implementations limit the password length, but they generally support passwords far longer than any person would be willing to type.) The salt is also an arbitrary string, limited only by character set considerations.

First the passphrase and salt are hashed together, yielding an MD5 message digest. Then a new digest is constructed, hashing together the passphrase, the salt, and the first digest, all in a rather complex form. Then this digest is passed through a thousand iterations of a function which rehashes it together with the passphrase and salt in a manner that varies between rounds. The output of the last of these rounds is the resulting passphrase hash.

The fixed iteration count has caused this scheme to lose the computational expense that it once enjoyed. Variable numbers of rounds are now favoured.

Blowfish-based scheme

Niels Provos
Niels Provos
Niels Provos is a researcher in the areas of secure systems, malware and cryptography. He is currently a Principal Software Engineer at Google. He received his PhD in Computer Science from the University of Michigan....

 and David Mazières designed a crypt scheme called bcrypt
Bcrypt
bcrypt is an adaptive cryptographic hash function for passwords designed by Niels Provos and David Mazières, based on the Blowfish cipher, and presented at USENIX in 1999...

 based on Blowfish
Blowfish (cipher)
Blowfish is a keyed, symmetric block cipher, designed in 1993 by Bruce Schneier and included in a large number of cipher suites and encryption products. Blowfish provides a good encryption rate in software and no effective cryptanalysis of it has been found to date...

, and presented it at USENIX
USENIX
-External links:* *...

 in 1999. The printable form of these hashes starts with $2$ or $2a$, depending on which variant of the algorithm is used.

Blowfish is notable among block ciphers for its expensive key setup phase. It starts off with subkeys in a standard state, then uses this state to perform a block encryption using part of the key, and uses the result of that encryption (really, a hashing) to replace some of the subkeys. Then it uses this modified state to encrypt another part of the key, and uses the result to replace more of the subkeys. It proceeds in this fashion, using a progressively modified state to hash the key and replace bits of state, until all subkeys have been set.

Provos and Mazières took advantage of this, and actually took it further. They developed a new key setup algorithm for Blowfish, dubbing the resulting cipher "Eksblowfish" ("expensive key schedule Blowfish"). The key setup begins with a modified form of the standard Blowfish key setup, in which both the salt and password are used to set all subkeys. Then there is a configurable number of rounds in which the standard Blowfish keying algorithm is applied, using alternately the salt and the password as the key, each round starting with the subkey state from the previous round. This is not cryptographically significantly stronger than the standard Blowfish key schedule; it simply makes the algorithm arbitrarily slow to deter brute-force attacks.

The number of rounds of keying is a power of two, which is an input to the algorithm. The number is encoded in the textual hash.

NT Hash Scheme

FreeBSD implemented support for the NT LAN Manager
NTLM
In a Windows network, NTLM is a suite of Microsoft security protocols that provides authentication, integrity, and confidentiality to users....

 hash algorithm to provide easier compatibility with NT accounts. The NT-Hash algorithm is known to be weak, as it uses the deprecated md4
MD4
The MD4 Message-Digest Algorithm is a cryptographic hash function developed by Ronald Rivest in 1990. The digest length is 128 bits. The algorithm has influenced later designs, such as the MD5, SHA-1 and RIPEMD algorithms....

 hash algorithm without any salting. FreeBSD used the $3$ prefix for this. Its use is not recommended, as it is easily broken.

SHA2-based scheme

The commonly used MD5 based scheme has become easier to attack as computer power has increased. Although the Blowfish-based system has the option of adding rounds and thus remain a challenging password algorithm, it does not use a NIST
National Institute of Standards and Technology
The National Institute of Standards and Technology , known between 1901 and 1988 as the National Bureau of Standards , is a measurement standards laboratory, otherwise known as a National Metrological Institute , which is a non-regulatory agency of the United States Department of Commerce...

-approved algorithm. In light of these facts, Ulrich Drepper of Red Hat
Red Hat
Red Hat, Inc. is an S&P 500 company in the free and open source software sector, and a major Linux distribution vendor. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina with satellite offices worldwide....

 led an effort to create a scheme based on the SHA-2
SHA-2
In cryptography, SHA-2 is a set of cryptographic hash functions designed by the National Security Agency and published in 2001 by the NIST as a U.S. Federal Information Processing Standard. SHA stands for Secure Hash Algorithm. SHA-2 includes a significant number of changes from its predecessor,...

 (SHA-256 and SHA-512) hash functions. The printable form of these hashes starts with $5$ or $6$ depending on which SHA variant is used. Its design is similar to the MD5-based crypt, with a few notable differences:
  • It avoids adding constant data in a few steps.
  • The MD5 algorithm would repeatedly add the first letter of the password; this step was changed significantly.
  • Inspired by Sun's
    Sun Microsystems
    Sun Microsystems, Inc. was a company that sold :computers, computer components, :computer software, and :information technology services. Sun was founded on February 24, 1982...

     crypt implementation, functionality to specify the number of rounds the main loop in the algorithm performs was added


The specification and sample code have been released into the public domain.

crypt(3) under Linux

The GNU C Library
GNU C Library
The GNU C Library, commonly known as glibc, is the C standard library released by the GNU Project. Originally written by the Free Software Foundation for the GNU operating system, the library's development has been overseen by a committee since 2001, with Ulrich Drepper from Red Hat as the lead...

 used by almost all Linux distributions provides an implementation of the crypt function which supports the DES, MD5, and SHA-2 (since version 2.7) based hashing algorithms mentioned above.

External links

  • Source code for crypt(1) from Seventh Edition Unix (trivialised one-rotor Enigma-style machine)
  • Source code for crypt(3) from Seventh Edition Unix (implements proposed DES)
  • Source code for crypt(1) from Sixth Edition Unix (implementation of Boris Hagelin
    Boris Hagelin
    Boris Caesar Wilhelm Hagelin was a Swedish businessman and inventor of encryption machines.Born of Swedish parents in the Caucasus , Hagelin attended Lundsberg boarding school and later studied mechanical engineering at the Royal Institute of Technology in Stockholm, graduating in 1914...

    's cryptographic machine)
The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK