Falling through the KRACKs – how standards compliance won’t save you
“If you think technology can solve your security problems, then you don’t understand the problems and you don’t understand the technology.”
The KRACK attack, unveiled on Oct 16 by Mathy Vanhoef from KU Leuven, is quickly shaping up to be the next cybersecurity bogeyman – it even has a scary name and logo! Attackers breaking into wireless networks protected by state-of-the-art security measures, reading or even manipulating confidential data in the air at will… not a pleasant thought.
Successful exploitation of this issue involves the attacker interfering with the WPA2 handshake (all variants are affected, from TKIP to RADIUS-enabled WPA-Enterprise) and tricking the client into reinstalling a key that is already in use. With some cryptanalytic trickery, this makes it possible for the attacker to decrypt traffic. In some cases, the attacker can also forge traffic, allowing them to impersonate a server or deploy malware to clients by modifying the contents of files as they are being downloaded.
The weakness exposed by KRACK is clearly a protocol-level problem (not a simple programming bug), even though security and cryptographic protocols are probably the most heavily scrutinized rule systems in the world. In fact, security protocols typically undergo formal verification; the WPA2 four-way handshake, in particular, was formally proven to be secure and free of design flaws (see He et al.), and no weaknesses were found in it over the last 14 years.
So how did it happen – how was a provably secure protocol broken? The answer – as is typical in software security, unfortunately – lies in architects and developers making incorrect assumptions.
Uncommon sense and common nonce-sense
The core of the KRACK attacks is key reinstallation: replaying the third message of the WPA2 four-way handshake to trick the victim into reinstalling a key that is already in use. In short, the access point (AP) may send multiple copies of this message during the handshake, and the client will accept each copy as valid and install the key again. Every reinstallation also resets the nonce (packet number) used for encryption, so the same key and nonce combination ends up protecting different packets. While retransmission is a valid use case for dealing with noisy channels, it opens the door for foul play.
As long as the attacker has stored the content sent in earlier protocol runs and has knowledge about the possible plaintext the two parties may be exchanging (feasible, considering most of the text in an HTTP exchange is going to be known and/or predictable), they can use cryptanalytic techniques to decrypt the WPA-secured messages sent between the client and the AP without knowing the key. With some encryption protocols (TKIP or GCMP), the attacker can go even further and forge or inject arbitrary packets into the communication as well. Incidentally, the up-and-coming high-throughput WiGig technology uses GCMP…
It is important to note that the formal analysis still holds true. This attack does not violate any of the proven security properties, i.e. the attacker is not able to steal or modify the key, forge handshake messages, or impersonate either the client or the AP. It is just that the concept of a nonce reuse attack using data from an earlier protocol run, resulting in the installation of a valid-but-old key, was not present in the formal models in the first place.
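Why nonce reuse is so damaging can be shown with a toy stream cipher. The sketch below is not WPA2's actual CCMP, TKIP or GCMP encryption: the `keystream` function is a hypothetical stand-in built from SHA-256, and the key and messages are made up for illustration. It only demonstrates the general principle that two packets encrypted under the same key and nonce leak the XOR of their plaintexts, so one known plaintext reveals the other.

```python
import hashlib

def keystream(key: bytes, nonce: int, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        block_input = key + nonce.to_bytes(8, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block_input).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

def encrypt(key: bytes, nonce: int, plaintext: bytes) -> bytes:
    return xor(plaintext, keystream(key, nonce, len(plaintext)))

key = b"session key the attacker never sees"   # made-up key
known = b"GET /index.html HTTP/1.1\r\n"         # predictable plaintext
secret = b"password=hunter2"                    # what the attacker wants

# After a key reinstallation the nonce counter restarts, so nonce 1 is used twice.
c1 = encrypt(key, 1, known)
c2 = encrypt(key, 1, secret)

# The keystream cancels out: c1 XOR c2 == known XOR secret.
recovered = xor(xor(c1, c2), known)

# With a fresh nonce for the second packet, the same trick yields garbage.
c2_fresh = encrypt(key, 2, secret)
recovered_fresh = xor(xor(c1, c2_fresh), known)
```

Here `recovered` equals `secret` even though the attacker never touched the key, while `recovered_fresh` does not, which is why every packet must be encrypted with a unique nonce.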
Smash the state (machine)
One valid criticism of KRACK is that the exploit scenario is not very realistic: in many scenarios, the attacker needs to maintain a man-in-the-middle position between the victim and the AP. This is not the first time a protocol vulnerability has been tricky to exploit in practice; see also BEAST and its reliance on e.g. Java applets or Silverlight.
Unfortunately, it gets worse.
wpa_supplicant is a very popular open-source implementation of WPA/WPA2. It is used extensively on Linux-based platforms including desktops and mobile OSes such as Android 6 and above.
Executing the key reinstallation attack against a client using wpa_supplicant versions 2.4 and 2.5 (and even the current latest version 2.6, with some modifications to the attack) gives an unexpected result: instead of reusing the earlier key, the session key will be set to a string of 0x00 bytes! This is much worse than the original KRACK scenario, since in this case the attacker knows the session key, so it is trivial to break the security of the WPA communication, forge messages, or impersonate the AP.
How did this happen? Let’s take a look at the function that installs the session key after receiving the third protocol message from the AP as of the introduction of the vulnerability, with the added code in bold:
Essentially, this function installs the session key (pairwise transient key, PTK) by copying it from its temporary storage within the protocol state machine variable (sm->ptk) into the WPA context (sm->ctx->ptk) via wpa_sm_set_key. In the commit that introduced the vulnerability, the developer was adding some hardening by zeroizing sm->ptk once it was, presumably, no longer needed. In normal situations this worked fine and made the job of attackers harder. However, since the attacker in a key reinstallation attack sends the third protocol message multiple times, wpa_supplicant_install_ptk is called multiple times as well. On the second call, the zeroed-out sm->ptk value is copied over the PTK in sm->ctx->ptk, effectively zeroing that out too. From that point on, the client uses an all-zero key when communicating with the AP, which destroys message confidentiality!
The developer fixed the issue in a later version by adding a tk_to_set flag, setting it to 1 in step 1 of the protocol, and modifying wpa_supplicant_install_ptk to install the key only if this flag is set (and to clear it to 0 once the key has been installed successfully).
At first glance, this looked like a good solution to the issue. However, an attacker could still trigger the problem by first re-sending the first protocol message to set the flag again, and then sending the third message again. The developer implemented a correct fix this week by adding an installed field directly to the ptk structure, and setting it to 1 once the key has been installed successfully.
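The bug and the first fix described above can be modelled in a few lines. This is a simplified Python sketch, not wpa_supplicant's actual C code: the class `ToyClient`, its methods, and the `check_flag` switch are hypothetical names invented for illustration, with `tmp_ptk` standing in for sm->ptk and `installed_key` for the key that ends up in use.

```python
class ToyClient:
    """Toy model of the client-side key installation logic (illustrative only)."""

    def __init__(self, check_flag: bool):
        self.check_flag = check_flag   # True models the tk_to_set fix
        self.tmp_ptk = bytearray()     # stand-in for sm->ptk
        self.installed_key = None      # stand-in for the key actually in use
        self.tk_to_set = False

    def on_message1(self, derived_ptk: bytes) -> None:
        # Step 1 of the handshake: derive the key and arm the flag.
        self.tmp_ptk = bytearray(derived_ptk)
        self.tk_to_set = True

    def on_message3(self) -> None:
        # Called for every copy of message 3, including replays.
        if self.check_flag and not self.tk_to_set:
            return                     # key already installed, ignore the replay
        self.installed_key = bytes(self.tmp_ptk)
        # Hardening step: wipe the temporary copy after installing it.
        for i in range(len(self.tmp_ptk)):
            self.tmp_ptk[i] = 0
        self.tk_to_set = False

PTK = bytes(range(1, 17))              # made-up 16-byte session key

vulnerable = ToyClient(check_flag=False)
vulnerable.on_message1(PTK)
vulnerable.on_message3()               # legitimate message 3
vulnerable.on_message3()               # replayed message 3

patched = ToyClient(check_flag=True)
patched.on_message1(PTK)
patched.on_message3()
patched.on_message3()                  # replay is now ignored
```

In this model, `vulnerable.installed_key` ends up as sixteen zero bytes after the replay, while `patched.installed_key` still holds the original key.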
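The bypass and the final fix can be sketched in the same spirit. Again, this is a toy Python model rather than the real C implementation: `ToySupplicant` and `replay_attack` are hypothetical names, and the model assumes, as a simplification, that a re-sent first message re-arms the flag without replacing the temporary key that has already been zeroed.

```python
class ToySupplicant:
    """Toy model comparing the tk_to_set flag with a per-key 'installed' marker."""

    def __init__(self, track_installed: bool):
        self.track_installed = track_installed  # True models the final fix
        self.tk = bytearray()                   # stand-in for the temporary key
        self.installed = False                  # stand-in for the 'installed' field
        self.tk_to_set = False
        self.driver_key = None                  # key actually used for traffic

    def on_message1(self, fresh_tk=None) -> None:
        if fresh_tk is not None:                # genuine handshake: new key derived
            self.tk = bytearray(fresh_tk)
            self.installed = False
        self.tk_to_set = True                   # re-armed even by a re-sent message 1

    def on_message3(self) -> None:
        if not self.tk_to_set:
            return                              # first fix: flag not armed
        if self.track_installed and self.installed:
            return                              # final fix: this key was already installed
        self.driver_key = bytes(self.tk)
        self.tk[:] = bytes(len(self.tk))        # zeroize the temporary copy
        self.tk_to_set = False
        self.installed = True

def replay_attack(client: ToySupplicant, tk: bytes) -> bytes:
    client.on_message1(tk)   # legitimate handshake
    client.on_message3()
    client.on_message1()     # attacker re-sends message 1 to re-arm the flag
    client.on_message3()     # attacker replays message 3
    return client.driver_key

TK = bytes(range(1, 17))     # made-up 16-byte key
flag_only = replay_attack(ToySupplicant(track_installed=False), TK)
with_installed = replay_attack(ToySupplicant(track_installed=True), TK)
```

With only the flag, the replay sequence still leaves an all-zero key in `flag_only`; tying the "already installed" state to the key itself keeps the original key in `with_installed`, because re-arming the flag no longer matters.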
Needless to say, if you’re using wpa_supplicant, you should follow the steps in the advisory – basically, merge the bugfix commits and rebuild.
Just because you’re paranoid…
Certification and standards compliance should never be considered the equivalent of (or worse, superset of) security. Certification is certainly good to set a security baseline and act as a motivator to improve security, but it should never be the be-all and end-all! Unexpected issues like KRACK can pop up at any time, and all certification and compliance processes can do is follow along – in this case, the Wi-Fi certification process will be updated to address this issue in the future.
By the way, this dovetails well with the ROCA vulnerability, also published this week, where a software library used in Infineon’s trusted platform modules was found to generate vulnerable RSA keys, potentially jeopardizing entire companies that relied on the TPMs to keep their data secure. These devices were all NIST FIPS 140-2 and Common Criteria EAL5+ certified, yet a critical weakness like this could still slip through the cracks. This is not a fault of FIPS or Common Criteria; rather, it highlights that we should subject critical software libraries to heavy scrutiny even if they have been certified for a particular purpose.
The most important takeaway is that, while defensive programming and defense in depth are essential for handling edge cases robustly, no specification is ever complete, so nothing replaces common sense. Don’t just assume that a circumstance not mentioned in the specification is impossible; if something seems possible while you are coding a particular function, be prepared for it. Of course, in many cases this is not easy to realize, as the number of ways something can go wrong is essentially infinite. In this particular case, the attacker can force the program down an execution path the developer didn’t expect by sending the third message in the handshake repeatedly. This is a state machine logic error, and such vulnerabilities are hard to recognize but quite easy to exploit once found.