by Marc Stevens

When significant weaknesses are found in cryptographic primitives on which the everyday security of the Internet relies, it is important that they are replaced by more secure alternatives, even if the weaknesses are only theoretical. This is clearly emphasized by our construction of a (purposely crippled) rogue Certification Authority (CA) in 2009 that in principle enabled the impersonation of all secure websites. This was possible due to the continued use of the insecure cryptographic hash function MD5 by a leading commercial CA. The hash function SHA-1, the successor to MD5 as the de facto hash function standard, has been theoretically broken since 2005. The Cryptology group at CWI has recently made a significant step towards a practical attack on SHA-1 that has long been anticipated, as well as efficient counter-measures against these cryptographic attacks.

Cryptographic hash functions, such as MD5, SHA-1 and SHA-2-256, are among the most important cryptographic primitives. A hash function is an algorithm that computes a hash value of a fixed number of bits (say 256 bits) for a message of arbitrary bit-length. A main application of hash functions is in digital signatures. In order for digital signatures to be secure, a hash function must satisfy the collision resistance property: it must be hard to find collisions, i.e., two different messages that map to the same hash value.

In 2004, collisions were found for MD5 by Wang et al. and despite practical limitations, MD5 was found to be insecure for continued use in applications. We have introduced the chosen-prefix collision attack in 2006 that removes limitations of the identical-prefix collision attack and thereby results in significantly more potential for realistic threats to the security of digital signatures. In particular, due to the slow response of the industry in removing MD5, we were able to construct a rogue Certification Authority (CA) in 2009. The certificate of our rogue CA was signed by an unsuspecting commercial CA. We used an improved version of the MD5 chosen-prefix collision attack to do this, thereby effectively breaking the security of secure websites (https://) [1].

A similar situation exists for SHA-1 now as for MD5 in 2005. It has been theoretically broken since 2005 due to a collision attack presented by Wang et al. with a complexity of 269 SHA-1 computations. Since then there have been several claims of improved attacks with complexities as low as 252 SHA-1 computations, however these were either not substantiated to date, withdrawn, or found to be too optimistic. Unfortunately, this means that the first attack essentially remains the state of the art in the literature. Lack of recent improvements suggests a barrier has been reached.

Figure 1: Detection of whether a message has been constructed using a collision attack on the cryptographic hash functions MD5 and/or SHA-1. This is done by partially reconstructing the hash computation of the (unknown) colliding sibling message (bottom half) and looking for the tell-tale condition of a collision (the comparison on the right).
Figure 1: Detection of whether a message has been constructed using a collision attack on the cryptographic hash functions MD5 and/or SHA-1. This is done by partially reconstructing the hash computation of the (unknown) colliding sibling message (bottom half) and looking for the tell-tale condition of a collision (the comparison on the right).

Recently we have introduced a new exact cryptanalysis of SHA-1 that, as prior methods, is based on the approach of local collisions from which an appropriate system of equations is obtained, which is then subsequently used in the search for an actual collision [2]. The novelty of our new methods is two-fold. First, in theory there are many eligible appropriate systems one may arrive at in this approach. Our analysis of such systems is exact and does not use heuristics compared to prior methods, in particular with respect to the dependence of local collisions and determining the complexity. Secondly, we show, for the first time, how to efficiently pick the system of equations that leads to the lowest complexity among all considered systems for a particular combination of local collisions. This is achieved by analyzing partitions of the set of all possible so called exact differential paths and using inherent redundancies to be able to compute the exact total probability of each partition. These probabilities are used to derive the optimal system of equations.

With our new identical-prefix collision attack based on our new cryptanalytic methods, we have shown how to substantially reduce the complexity of finding collisions for SHA-1 to about 261 SHA-1 computations. Though this is still just out of reach, this is a preliminary attack based on our new methods and the attack implementation can be further improved. The implementation of our SHA-1 attack is the first to be publicly available, thus its correctness and complexity can be publicly verified and also allows further understanding and improvements [3].

As MD5 and SHA-1 have significant (theoretical) weaknesses, they evidently should be withdrawn from applications. However, practice shows that the industry responds slowly in replacing them with secure hash function standards. To alleviate possible damage by collision attacks, we have introduced a technique that efficiently detects both identical-prefix and chosen-prefix collision attacks against both MD5 and SHA-1 given only one of the two documents in a collision. Such an indication can be used to abort further processing or communications, before sensitive information can be accessed or transmitted.

The future de facto hash function standard SHA-3 is currently being selected in an international competition by the National Institute of Standards and Technology (NIST) in the U.S. Nevertheless, due to the continued usage of SHA-1 in the foreseeable future, more research is needed on the real-world security of SHA-1 and on whether our ideas can be extended to other important hash function standards such as the future SHA-3.


[1] M. Stevens, A. Sotirov, J. Appelbaum, A. Lenstra, D. Molnar, D. A. Osvik, B. de Weger, “Short Chosen-Prefix Collisions for MD5 and the Creation of a Rogue CA Certificate”, CRYPTO 2009, Lecture Notes in Computer Science, vol. 5677, Springer, 2009, pp. 55-69. CR09-SSALMOdW.pdf

[2] M. Stevens; “Attacks on Hash Functions and Applications”, PhD thesis, Leiden University, June 19, 2012.

[3] Project HashClash, Marc Stevens,

Please contact:
Marc Stevens, Cryptology Group, CWI, The Netherlands
Tel: +31 20 592 4195
E-mail: This email address is being protected from spambots. You need JavaScript enabled to view it.

{jcomments on}
Next issue: July 2023
Special theme:
"Eplainable AI"
Call for the next issue
Get the latest issue to your desktop
RSS Feed
Cookies user preferences
We use cookies to ensure you to get the best experience on our website. If you decline the use of cookies, this website may not function as expected.
Accept all
Decline all
Read more
Tools used to analyze the data to measure the effectiveness of a website and to understand how it works.
Google Analytics
Set of techniques which have for object the commercial strategy and in particular the market study.
DoubleClick/Google Marketing