The Attack On SHA1 And Its Implications
by Arnold Reinhold, Principal
A team of researchers from Shandong University in China, Xiaoyun Wang, Yiqun Lisa Yin, and Hongbo Yu, has announced an attack on the cryptographic hash algorithm SHA1. This is big news in the cryptographic community, but what are the implications for computer security, the enterprise and individual users?
Cryptographic hash algorithms are not intended to achieve secrecy, but to prove authenticity. A data file can be fed through a hash algorithm and the number that the hash puts out serves an an identifying fingerprint for the file and the document it contains. If even one bit in the file is changed, the hash should be completely different. Hashes can be sent via a separate channel to verify that a file is the same as an original. They are a vital component of electronic signatures.
Common cryptographic hash algorithms include MD5 and SHA1. MD5 was designed by Ron Rivest, of RSA Labs, as an improvement on a previous Rivest design, MD4. SHA1 was designed by the U.S. National Security Agency (NSA), also based on MD4. NSA had perviously introduced SHA, the Secure Hash Algorithm, but withdrew it without explanation and replaced it with SHA1. Everyone at the time assumed they had found a flaw in the original SHA, now known as SHA0.
Modern ciphers and hash algorithms are all constructed in the same general way. The inputs to the algorithm are passed through a series of computational steps that mix the bits up in various ways. The process is repeated a number of times and what results becomes the algorithm’s output. The intent is to make it hard for a person in possession of the output to recover any information about the input. All such algorithms are subject to a “brute force attack.” This means trying possible inputs until there is a match with the known output. If there are a large enough range of possibilities, the likelihood that an answer will be found in an acceptable amount of time and cost by brute force methods will be very small.
When people talk about “128 bit encryption” they mean that the secret key input is an number between 1 and two to the 128th power, or 340282366920938463463374607431768211456. That number of possibilities will defy any digital computing technology (quantum computing, if it ever becomes feasible, is another story). However, the fact that an algorithm has enough bits to withstand a brute force attack does not make it secure. There is still the possibility that some careful mathematical analysis of the mixing steps will reveal ways to simplify the search. Such short cuts have been discovered for many cryptographic algorithms, causing some to be rejected. Finding some short cuts is a common enough occurrence that it should now be expected for any cryptographic, absent some mathematical proof that it is impossible, and such proofs have not been found to date.
The way a cryptographic hash is attacked is by trying to produce two different documents that have the same hash, a so-called “collision.” An ability to produce collisions defeats the security of hashes. An attacker could produce two documents, one innocent and one dangerous that have the same hash. If the attacker gets a victim to sign the innocent documents electronically and he can “prove” the victim signed the hazardous document.
One measure of the security of a hash function is the number of bits in its output. MD4 and MD5 ave 128 bit outputs. SHA0 and SHA1 have 160 bit outputs. The more output bits, the harder it should be to find a collision. However, because of a mathematical result known as the “Birthday Paradox”, finding a collision takes takes far fewer steps than its output length might suggest. In general one need only try 2 to the N/2 possibilities, where N is the number of bits in the output. So a 128 bit hash only provides 64 bits of strength against brute force attacks . A 160 bit hash only provides 80 bits of strength, at most. Note that NSA introduced SHA0 and SHA1 in connection with their ill-fated Clipper proposal, which used an 80-bit encryption algorithm called Skipjack.
The attack on SHA1 is not a complete surprise. The U.S. National Institute on Science and Technology (NIST) had previously suggested that 80-bit encryption be phased out by 2015. In 2004, attacks of this nature were announced for MD5 and SHA0. In 2001, NSA introduced four new additions to the SHA family: SHA224, SHA256, SHA384 and SHA512, the numbers all being their output bit length. They are a Federal Information Processing Standard, FIPS-180-2 and are free to use.
The new result has several implications:
1. There is no need to panic. The Chinese results suggest that SHA1 has, at most, 69 bit strength, significantly less than the 80 bits, but still no easy matter to attack with current technology. As PGP’s Jon Callas said best: “It’s time to walk, but not run, to the fire exits. You don’t see smoke, but the fire alarms have gone off.” Give security vendors some time –a year or so–to develop upgrade plans.
2. SHA1 should not be used in new designs. The stronger variants, SHA256, SHA384 and SHA512 are widely available and free.
3. It is high time to phase out MD5. This even weaker algorithm is still widely used, for example, to validate source and object code for Linux software. (I published an attack against this application on Perry Metzger’s cryptography list in February, 2002.)
4. Consider replacing hexadecimal as the method to present hash outputs in human-readable form. In hex, MD5 hash outputs are 32 characters long. For SHA1 they are 40 characters. SHA256 would have 64 hex character outputs. Alternatives that are available include base 5 and base 6 presentation. Base 5 uses the ten digits plus the first 22 letters. Modified base 5 skips the letters i, l, o and z since they are easily confused with 8, 1, and 2.
Base 6, or Radix-64, uses all digits, upper-case and lower case letters, and two special characters, typically + and =. Both result in shorter hashes that are easier for humans to take in.