Before proceeding, I want to stress that everything I refer to here relates to mistakes made when using (good) cryptographic libraries. The challenge of implementing the low-level cryptographic primitives themselves (like AES, RSA, ECC and so on) is a very different one, requiring high cryptographic engineering experience and knowledge. As such, this should be avoided whenever possible. In contrast, many software engineers need to just use cryptography in their work, and this cannot be avoided. Unfortunately, even this turns out to be far more problematic than expected.
I begin by describing the four main reasons why implementing cryptography is hard in my experience.
Reason 1 – the fact that it works means nothing in cryptography
There is a pretty typical way that software engineers learn something new: trial and error. If I have never set up a socket in C++ before, then what I would do is search online for some sample code and general explanations. Then, I’ll try it out in my code. It probably won’t work the first time, so I’ll play around with it and modify it until it works. Once it works, I can QA it, and it’s ready to go. This methodology works for all positive tasks, where the aim of the code is to achieve some functionality. However, cryptography is fundamentally different from all such tasks. In some sense, cryptography is not a positive task. I don’t want to ‘encrypt’ and then ‘decrypt’; rather, I want to prevent an attacker from being able to learn anything about a piece of data. This is a property that cannot be tested! Indeed, I can check that after I encrypt and decrypt, I get back the same data. But this says nothing about whether I encrypted correctly (the test would pass even if the encryption and decryption functions did nothing at all). I could also check that the encrypted data looks like garbage. However, once again, this means very little. If I XORed the plaintext data with a string of alternating zeros and ones, it would look like garbage but would have no security at all. Going even further, I can use test vectors to make sure that my AES encryption really computes AES correctly. However, if I haven’t used AES in a secure mode of operation, then I am still vulnerable.
Thus, the method of functionally verifying that the code ‘works’ actually has very little meaning in the context of implementing cryptography. It is impossible to use software testing to see if an attacker can break the cryptographic protections implemented.
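To make the point concrete, here is a minimal Python sketch of the two broken 'ciphers' described above: one that does nothing, and one that XORs with alternating zero and one bits. The function names are hypothetical and exist only for this illustration. Both pass the usual round-trip test.

```python
def identity_encrypt(data: bytes) -> bytes:
    """A 'cipher' that does nothing at all."""
    return data


def pattern_xor(data: bytes) -> bytes:
    """XOR every byte with 0x55 (bits 01010101); applying it twice undoes it."""
    return bytes(b ^ 0x55 for b in data)


def round_trip_ok(encrypt, decrypt, message: bytes) -> bool:
    """The usual functional test: decrypt(encrypt(m)) == m."""
    return decrypt(encrypt(message)) == message


message = b"attack at dawn"

# Both broken schemes pass the functional test.
identity_passes = round_trip_ok(identity_encrypt, identity_encrypt, message)
xor_passes = round_trip_ok(pattern_xor, pattern_xor, message)

# The XOR output no longer looks like the plaintext, yet there is no key:
# anyone who guesses the fixed pattern recovers the message.
ciphertext = pattern_xor(message)
attacker_view = bytes(b ^ 0x55 for b in ciphertext)
```

A green test suite here tells us only that decryption inverts encryption, not that an attacker is kept out.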
Reason 2 – the Internet
This may sound ridiculous, but if we could erase the internet and start from scratch, it would go a long way towards improving the security of deployed cryptography. The reason is simple – as I described above, when implementing something new, the primary source of information for software engineers is the internet.
However, a non-expert cannot distinguish a reliable source from an unreliable one. To make this worse, the top hits on the internet turn out to be insecure! I just searched ‘how to encrypt with AES’ and the top hit was a tutorial that uses CBC encryption (and no MAC). Although this is better than recommendations to use ECB mode (the fifth result in the search), the default should be authenticated encryption (like GCM or CCM). Thus, this actually is not a very good recommendation.
The case for RSA is even worse. When searching for ‘how to encrypt with RSA’, the majority of the top hits just explain plain RSA with no padding, and a recent article from the end of 2018 recommends using keys of length at least 1024 bits. Furthermore, after explaining plain RSA, the article says that padding can sometimes be needed and “adding this padding before the message is encrypted makes RSA much more secure”.
The recommendation to use RSA keys of at least 2048 bits has been around for years, and it is extremely misleading to say that padded RSA is more secure (since it implies that plain RSA is also reasonably secure). Finally, when searching for “how to hash a file”, the top hits refer to MD5, SHA1 and SHA256, with only a few of them even bothering to say that MD5 and SHA1 are broken. These are just a few brief examples demonstrating that wrong explanations are actually far more prevalent than right ones.
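To see why plain RSA should not be described as "reasonably secure", consider the following sketch. It uses toy parameters (the primes 61 and 53) that are far too small for real use and were chosen only to keep the numbers readable; the function names are hypothetical. It illustrates two well-known weaknesses of unpadded RSA: encryption is deterministic, and ciphertexts can be manipulated without the private key.

```python
# Toy parameters for illustration only; real keys need at least 2048 bits.
p, q = 61, 53
n = p * q                            # public modulus
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (Python 3.8+)


def plain_rsa_encrypt(m: int) -> int:
    """Textbook RSA with no padding."""
    return pow(m, e, n)


def plain_rsa_decrypt(c: int) -> int:
    return pow(c, d, n)


m = 42
c1 = plain_rsa_encrypt(m)
c2 = plain_rsa_encrypt(m)

# Weakness 1: the same message always gives the same ciphertext, so an
# attacker can confirm a guess simply by encrypting it with the public key.
deterministic = c1 == c2

# Weakness 2: multiplying ciphertexts multiplies the plaintexts. The
# attacker doubles the hidden value using only public information.
c_forged = (c1 * plain_rsa_encrypt(2)) % n
doubled = plain_rsa_decrypt(c_forged)    # 84, i.e. 2 * m
```

Padding schemes such as OAEP add randomness and structure that rule out both behaviours, which is why padding is a requirement rather than an optional improvement.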
Reason 3 – cryptography is very brittle and software engineers are trained to optimise
The unfortunate fact is that small changes to cryptographic schemes can render them completely broken. There are many examples of this, and it means that any optimisations by software engineers can have disastrous effects. A classic example of this is the BEAST attack, which was made possible by an optimization to CBC where different messages were chained together in order to save choosing and sending a new IV with every message. This looks like a very reasonable optimization, and it took years to realize that it is insecure.
In addition, the combination of the subtleties of cryptographic implementation, the justified desire to optimise and simplify code, and the fact that much of the information online is incorrect, is a disaster. For example, even if a webpage provides different alternatives for encryption, it will rarely explain when to use which alternative. It is clearly more complex to use CBC or CTR than ECB, and even more complex to use GCM or CCM (or just combine encryption with message authentication). So, unless one understands why the addition of authentication is absolutely necessary, it would not even make sense to implement the secure version.
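The need for authentication can be illustrated with a short sketch. Since the Python standard library has no AES, the keystream below is a toy stand-in for a CTR-style mode built from SHA-256; it is an assumption made purely for illustration and not a vetted cipher. All function names are hypothetical. The example shows an attacker changing the decrypted message without knowing the key, and a MAC over the ciphertext (encrypt-then-MAC) detecting the change.

```python
import hashlib
import hmac
import os


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy stand-in for a CTR-mode keystream (illustration only)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt_only(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encryption without authentication; the same call also decrypts."""
    return xor_bytes(data, toy_keystream(key, nonce, len(data)))


def tag_is_valid(mac_key: bytes, nonce: bytes, ct: bytes, tag: bytes) -> bool:
    expected = hmac.new(mac_key, nonce + ct, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


enc_key, mac_key, nonce = os.urandom(32), os.urandom(32), os.urandom(16)
ciphertext = encrypt_only(enc_key, nonce, b"PAY 0100 TO BOB")

# An attacker who knows or guesses the message format flips bits in the
# ciphertext; no key is needed.
delta = xor_bytes(b"PAY 0100 TO BOB", b"PAY 9100 TO BOB")
tampered = xor_bytes(ciphertext, delta)
decrypted_tampered = encrypt_only(enc_key, nonce, tampered)  # b"PAY 9100 TO BOB"

# Encrypt-then-MAC: the receiver rejects the modified ciphertext.
tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
original_accepted = tag_is_valid(mac_key, nonce, ciphertext, tag)
tampered_accepted = tag_is_valid(mac_key, nonce, tampered, tag)
```

In practice one would use a vetted authenticated encryption mode such as GCM or CCM from a library rather than assembling the pieces by hand.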
Reason 4 – cryptography is sometimes very hard
Although this is not the common case, sometimes it is actually very hard to choose the right cryptographic solution. For example, when encrypting a very large amount of data, secure modes of operation can break due to the birthday bound. This is a particular concern if one can only use 3DES (e.g., on a legacy device): because its block size is small (64 bits), the key must be rotated frequently or a stronger (beyond-birthday-bound) mode of operation must be used.
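The arithmetic behind the birthday bound can be sketched as follows. The code uses the standard approximation that q random n-bit blocks contain a collision with probability about 1 - exp(-q^2 / 2^(n+1)); the function names are hypothetical, and the figures are rough estimates rather than exact security limits for any particular mode.

```python
import math


def collision_probability(block_bits: int, blocks_encrypted: int) -> float:
    """Approximate chance that two of the encrypted blocks collide."""
    exponent = -(blocks_encrypted ** 2) / 2 ** (block_bits + 1)
    return -math.expm1(exponent)      # 1 - exp(exponent), computed stably


def bytes_at_birthday_bound(block_bits: int) -> int:
    """Data volume at about 2^(n/2) blocks, where collisions become likely."""
    return 2 ** (block_bits // 2) * (block_bits // 8)


# 64-bit blocks (3DES): 2^32 blocks of 8 bytes is only 32 GiB.
three_des_limit = bytes_at_birthday_bound(64)
three_des_risk = collision_probability(64, 2 ** 32)     # about 0.39

# 128-bit blocks (AES): the same point is reached at 2^68 bytes.
aes_limit = bytes_at_birthday_bound(128)
aes_risk_at_32_gib = collision_probability(128, 2 ** 31)
```

Real deployments rotate keys well before this point, since even a small collision probability is usually unacceptable.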
Another difficult example is what to do when encrypting on a device with a poor random source, and so the chance of an IV repeating is not low. Nonce-misuse resistant modes of operation do exist, but are less well known. The fact is that these cases and others are difficult and real expertise is needed to deal with them. Unfortunately, in most cases, the engineer implementing the cryptographic solution may not even be aware of the problem, let alone the solution.
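The danger of a repeated IV in stream-like modes such as CTR can be sketched in a few lines. The random bytes below stand in for the keystream that such a mode would produce for one fixed key and IV pair; this is an assumption made to keep the example within the standard library, and the variable names are hypothetical.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


# Stand-in for the keystream derived from a single (key, IV) pair.
keystream = os.urandom(32)

p1 = b"meet at the north gate"
p2 = b"meet at the south gate"

# The IV repeats, so both messages are encrypted with the same keystream.
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# An eavesdropper XORs the two ciphertexts and the keystream cancels out,
# leaving the XOR of the two plaintexts.
leak = xor_bytes(c1, c2)

# If one plaintext is known or guessable, the other follows immediately.
recovered_p2 = xor_bytes(leak, p1)
```

Nonce-misuse resistant modes are designed so that a repeated IV reveals far less, typically only whether two messages were identical.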
How can these risks to cryptography be mitigated?
There is no doubt that the best solution to the above is to take the effort to learn cryptography in depth before deploying it (e.g., take a reputable online course or read a good textbook). However, this is not always possible, and not all courses even cover everything needed in practice. Although not a replacement for the above, the following recommendations can go a long way towards minimizing cryptographic implementation errors:
1. Where possible, use high-level libraries only, and the highest-level API. Many libraries don’t support this and force developers to choose the mode of operation, padding and so on. However, some are better than others. For example, the EVP API in OpenSSL allows you to easily choose authenticated encryption modes, but you still need to know that these are preferable over other modes.
2. Find someone with the knowledge and build a single cryptographic library/interface for the entire company. Invest the resources to vet the implementations and make sure that they are correct. Then, provide a simple high-level API with functions like: genkey(), encrypt(), decrypt(), sign(), verify() and nothing else. The defaults should be the best possible, and developers should only need to choose between public-key and private-key (symmetric) operations (and should receive clear explanations as to when to choose what).
3. Be agile in your cryptographic implementations! Agility can mean many things in different contexts. In the context of cryptography, agility means the ability to easily change the low-level primitive, mode of operation and so on. Thus, the key length, ciphertext length, and so on should all be abstracted so that they can be changed relatively easily. This is important since an implementation flaw may be discovered later, the cryptographic algorithm being used may be broken, or the key length may need to be increased. By being agile, these can be replaced without too much difficulty.
4. If you do need online sources, I recommend crypto.stackexchange.com. It is not perfect and requires some basic knowledge to get started, but is generally very good.
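Recommendations 2 and 3 can be combined in a small sketch of a company-wide interface. To stay within the Python standard library, sign() and verify() are backed here by HMAC, i.e. the symmetric-key variant; a real library would wrap vetted authenticated encryption and signature implementations behind the same kind of interface. The suite table, the one-byte identifier and the helper _sign_with() are all hypothetical choices made for this illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical suite registry: a one-byte ID selects the underlying hash.
_SUITES = {1: hashlib.sha256, 2: hashlib.sha512}
_DEFAULT_SUITE = 2      # changing the default needs no change in calling code
_KEY_BYTES = 32


def genkey() -> bytes:
    """Return a fresh random key; callers never choose a length."""
    return secrets.token_bytes(_KEY_BYTES)


def _sign_with(suite: int, key: bytes, message: bytes) -> bytes:
    # The suite ID is part of the authenticated data and of the output.
    header = bytes([suite])
    tag = hmac.new(key, header + message, _SUITES[suite]).digest()
    return header + tag


def sign(key: bytes, message: bytes) -> bytes:
    """Authenticate a message using the current default suite."""
    return _sign_with(_DEFAULT_SUITE, key, message)


def verify(key: bytes, message: bytes, token: bytes) -> bool:
    """Check a token produced under any suite still in the registry."""
    if not token or token[0] not in _SUITES:
        return False
    expected = _sign_with(token[0], key, message)
    return hmac.compare_digest(expected, token)


key = genkey()
token = sign(key, b"release build 7")
old_token = _sign_with(1, key, b"release build 7")   # issued before an upgrade

accepted = verify(key, b"release build 7", token)
rejected = verify(key, b"release build 8", token)
old_still_accepted = verify(key, b"release build 7", old_token)
```

Because the algorithm identifier travels with the output, the default can be upgraded, and a broken suite removed from the registry, without touching the code that calls sign() and verify().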
One should not need to be an expert to implement cryptography. In some sense, this difficulty is a failure of the field. However, we also need to accept that it is somewhat inherent to the difficult nature of the material. Awareness and knowledge are paramount, and can significantly mitigate the risks, as can following a few basic principles, as outlined above.
Prof. Yehuda Lindell is the CEO and Co-Founder of Unbound Tech, and a professor in the Computer Science Department at Bar-Ilan University in Israel. He has carried out extensive research in cryptography, and has published over 100 conference and journal publications, as well as one of the leading undergraduate textbooks on cryptography.