Ogres, like onions, cakes, and operating systems, are layered. Topics with many layers are easy to learn: start from the top, understand it fully, then go on to the next one. Cryptography is not one of these. While operating system development may have 30 or 40 layers, I believe crypto has 4:
- Classes: conceptual, theoretical groups of algorithms
- Mechanisms: components of said groups; that is, the actual algorithms
- Problems: the mathematical problems that mechanisms exploit to achieve security
- The maths behind it all
In order to understand cryptography and use it correctly, you start at the top. It is not until you understand a layer fully that you can go down the stack. Furthermore, cryptography is extremely binary: either you completely understand the first two layers, or you understand nothing at all. With cryptography, it's all or nothing.
I've been studying cryptography on my own for many years, and since mid-2023 I have worked professionally in the crypto department of a cybersecurity firm, so you can trust the advice given in this post. From my experience, I believe many cybersecurity professionals outside of this field have a very thin understanding of the subject, so my motivation is to give all these people a good kickstart. This post also aims to instill some fear in the reader (that's you) and highlight how important it is to study the subject before you get the chance to commit crypto-oopsies; they are surprisingly common.
This post will cover the different classes of cryptographic algorithms. Once fully understood, this is enough to get started on learning the rest. I might write more posts in the future for the subsequent layers. It will probably never happen, but in case you're interested, get notified here.
The focus is on applied cryptography and actually useful things to know. No modular exponentiation, Caesar's cipher, or Fermat's little theorem.
A channel is a means of data transmission. A secure channel (SC) is a channel that is resistant to spying and tampering. There are three desirable properties that an SC may want to have. In most situations, all of them are sought after. These are:
- Confidentiality: guarantees that nobody who intercepts the messages can read them. A channel that offers only confidentiality is called a confidential channel.
- Integrity: guarantees that no tampering can take place. Integral channel.
- Authenticity: guarantees that the message was really sent by who's supposed to. Authentic channel.
Do not confuse authenticity (a cryptographic property) with authentication (logical access permissions). With that distinction in mind, note that authenticity always implies integrity (but not the other way around!).
Most of the time, an SC has two ends: Alice and Bob. Secure channels can be engineered to have more parties (think of encrypted group chats) but these mechanisms are very specific, often can be avoided, and will not be covered in this post.
There are two superclasses of cryptographic mechanisms: symmetric and asymmetric. They refer to whether both parties use the same key to get what they want (symmetric) or not (asymmetric).
A cryptographic primitive is an algorithm that is designed and programmed from scratch. A cryptographic construction is a mechanism that uses one or more cryptographic primitives, and may perform additional calculations to achieve its purpose. The line is somewhat fuzzy, but it helps in understanding how each algorithm works.
For each algorithm class, two examples will be given: an old broken one, and a modern not-yet-broken one. A person who dares to do crypto is cursed to always be up to date, as new attacks are discovered on a regular basis: assume that by the time you read this, all these algorithms have been broken and seek updated information. Good sources of information are given by governments in their cryptographic mechanism recommendations, published by their corresponding agencies:
- USA has NIST's, pretty good but heavily influenced by the NSA
- Europe has SOG-IS', which is boring and, for the most part, too conservative
- Germany has BSI's, which is very complete but conservative as well
- Spain has CCN's, which is very recent and, at the present time, a bit confusing. I believe it will improve soon
Governments will always recommend standardized algorithms. These are studied and clearly defined algorithms that work fine in most cases. Outside of the standards, you're on your own: some may be incredibly good (XChaCha20, Argon2) and some may be completely trivial to break. I would advise staying away from non-standard mechanisms unless you know what you're doing and don't intend to pass any certifications.
Knowing what algorithms and parameterizations are secure is not enough; anyone who uses crypto must also know about what I've begun calling "cryptocohesion": the correct ways of chaining algorithms together in order to build a secure cryptosystem. While there are general practices that are good to follow (such as not using the same key in a cipher and a MAC), most require specific knowledge of each algorithm, such as not reusing a nonce in Poly1305.
Symmetric mechanisms transform data. They may have any number of inputs and outputs. They reshape the data directly, in a non-interactive fashion from start to finish.
A symmetric key is a small chunk of bits used as if it were a physical key. A symmetric cipher is any function that encrypts or decrypts bytes using a symmetric key; thus, symmetric ciphers offer confidentiality. They are also sometimes called "data encapsulation mechanisms" (DEMs). Security comes from the difficulty of:
- Encrypting or decrypting without the key
- Guessing the key
Inputs:
- The message, also called "plaintext"
- The key
- Some algorithms may also have a nonce (number only used once), IV (initialization vector), or salt. These three terms mean the same: a randomly generated value which must be unique for each message
Output: the encrypted message, also called "ciphertext".
Symmetric ciphers use the input key to derive internal keys which are used to actually encrypt the message. Depending on how they do so, there are two distinct categories:
- Stream ciphers produce a stream of seemingly random bits that are combined (XORed) with the message, bit by bit. Examples are RC4 (broken) and ChaCha20 (not).
- Block ciphers produce fixed-length intermediate keys that are combined in a generally more complex way. A block cipher only encrypts one very short message (or "block"), such as 128 bits, so it must be applied over and over throughout the message using a mode of operation. Examples of block ciphers include DES (broken) and AES (not); examples of modes are ECB (insecure) and CBC (not, but please don't use it and keep reading).
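As a rough illustration (assuming Python and the third-party cryptography package, which all the code sketches in this post use), here is ChaCha20 as a raw stream cipher. A bare cipher like this gives confidentiality and nothing else:

```python
import os

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms

key = os.urandom(32)    # 256-bit symmetric key shared by Alice and Bob
nonce = os.urandom(16)  # this API takes a 128-bit nonce; it must never repeat under the same key

# Encryption XORs a keystream derived from (key, nonce) with the plaintext
encryptor = Cipher(algorithms.ChaCha20(key, nonce), mode=None).encryptor()
ciphertext = encryptor.update(b"attack at dawn")

# Decryption is the exact same operation with the same key and nonce
decryptor = Cipher(algorithms.ChaCha20(key, nonce), mode=None).decryptor()
assert decryptor.update(ciphertext) == b"attack at dawn"
```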
A hash function transforms a message of arbitrary length into one of fixed length, usually 256 to 512 bits. Hash functions are one-way functions, often called PRFs (pseudo-random functions). They offer integrity. Security comes from the difficulty of:
- Recovering the input from the output (preimage)
- Discovering another input that produces the same output (second preimage)
- Finding any two inputs that produce the same output (collision)
Inputs:
- The plain text
- Some hashes may support additional data that is mixed with the plain text. This is generally used for creating contextualized digests; for instance, a secure application "CryptoChat" might set it to "CryptoChat-v1". This contributes to slightly higher security, since generic rainbow tables cannot be used against contextual crypto. This input is also called a "customization string", a "personalization string", or, in contrast to a salt, a "pepper", since it is fixed for all messages, unlike a nonce.
Output: the hashed data, also known as "digest".
Examples are MD5 (broken) and SHA-3 (not).
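A minimal sketch with Python's standard library:

```python
import hashlib

digest = hashlib.sha3_256(b"hello world").hexdigest()
print(digest)  # always 64 hex characters (256 bits), no matter how long the input is

# Changing a single character of the input gives a completely unrelated digest
assert hashlib.sha3_256(b"hello worlD").hexdigest() != digest
```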
Extendable-output functions (XOFs) are variable-length hash functions: the output can be extended to any desired length. In some algorithms this length must be fixed before the process starts, while in others one can keep producing bit after bit of digest nonstop. This makes a difference in the output: in a fixed-beforehand algorithm, a 1024-bit output and a 1025-bit one will be completely different.
Inputs:
- The plain text
- The length of the output
- Possibly, additional data
Output: the variable-sized digest.
Examples are cSHAKE-256 (good) and raw Keccak (not broken, but certainly a bad idea).
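cSHAKE is not in Python's standard library, but its close relative SHAKE-256 is; a minimal sketch:

```python
import hashlib

xof = hashlib.shake_256(b"hello world")

print(xof.hexdigest(32))    # 32 bytes of output
print(xof.hexdigest(1000))  # 1000 bytes from the same input; the 32-byte output above is a prefix of this one
```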
A MAC, or Message Authentication Code, is a signed hash; therefore, it offers authenticity. A symmetric key is somehow mixed with the hash of the message, in such a way that if the key is known, then the hash is trustworthy, and thus the message can be verified for integrity with additional authenticity guarantees. To verify, either the MAC is computed again and compared to the received one, or the algorithm provides a faster way to check it.
Inputs:
- The plain text
- A symmetric key
- Possibly, additional data
Output: the MAC.
Examples are KMAC (good) and CBC-MAC (dangerous).
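KMAC is not in Python's standard library, so here is the same idea sketched with HMAC-SHA256 (another standardized MAC) as a stand-in:

```python
import hashlib
import hmac
import os

key = os.urandom(32)
message = b"transfer 10 euros to Bob"

tag = hmac.new(key, message, hashlib.sha256).digest()

# Verification: recompute the MAC and compare in constant time (more on that later)
recomputed = hmac.new(key, message, hashlib.sha256).digest()
assert hmac.compare_digest(tag, recomputed)
```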
AE means Authenticated Encryption. AE schemes are cryptographic constructions built from a symmetric cipher and a MAC. If the MAC allows for additional data, we call it AEAD, or Authenticated Encryption with Associated Data. AEAD is the pinnacle of general-purpose symmetric encryption, and it's what everyone uses (or should be using). An SC built with an AEAD guarantees the three properties. Additional security comes from guaranteed safe cryptocohesion.
Inputs:
- The plaintext
- A symmetric key
- A nonce
- Additional data
Outputs:
- The ciphertext
- The MAC
Sometimes, the MAC is joined directly with the ciphertext; in this case, there's only one output. If this is not the case, concatenating both is a common and valid practice. Do note that you should never concatenate things in crypto if you don't know what you're doing; it will expose you to well-known attacks.
Good examples are ChaCha20-Poly1305 and AES-GCM. Any combination of cipher and MAC you come up with is very likely insecure.
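A minimal ChaCha20-Poly1305 sketch, assuming Python with the third-party cryptography package; note that this particular API returns the ciphertext with the 16-byte MAC (here called a "tag") already appended:

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305

key = ChaCha20Poly1305.generate_key()
aead = ChaCha20Poly1305(key)

nonce = os.urandom(12)  # 96-bit nonce, must never repeat under the same key
aad = b"CryptoChat-v1"  # associated data: authenticated, but not encrypted

ciphertext = aead.encrypt(nonce, b"attack at dawn", aad)  # ciphertext with the tag appended
plaintext = aead.decrypt(nonce, ciphertext, aad)          # raises InvalidTag if anything was tampered with
assert plaintext == b"attack at dawn"
```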
KDFs, or Key Derivation Functions, derive one or more keys from an input. The generated keys, also called DKM (derived keying material), can then be used in other algorithms. While a KDF does not offer the sought-after cryptoproperties by itself, KDFs are a central piece of all modern cryptosystems. Security comes from the difficulty of:
- Discovering the input from the DKMs
- Correlating in any way the different produced keys
There's a subset of KDFs known as password hashing algorithms. These are intentionally slow KDFs that are used for deriving keys from a non-random user-chosen password. Their lack of speed protects against brute-force attacks.
Inputs:
- The master key, passphrase, shared secret, or plaintext used as input
- A salt, nonce, or counter, used to produce different keys
- Often, associated data
Output: the derived keying material.
Examples include HKDF (good), PBKDF2 (good, passphrase-oriented), and any trivial hash of concatenations you come up with (very likely insecure).
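A minimal sketch of both flavours, assuming Python (PBKDF2 is in the standard library; HKDF here comes from the third-party cryptography package):

```python
import hashlib
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# HKDF: derive a usable key from an already-strong secret (e.g. a KEX shared secret)
shared_secret = os.urandom(32)
enc_key = HKDF(
    algorithm=hashes.SHA256(),
    length=32,
    salt=os.urandom(16),
    info=b"CryptoChat-v1 encryption key",  # associated data / context
).derive(shared_secret)

# PBKDF2: intentionally slow, for low-entropy user-chosen passwords
salt = os.urandom(16)
pw_key = hashlib.pbkdf2_hmac("sha256", b"correct horse battery staple", salt, 600_000, dklen=32)
```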
KW stands for Key Wrapping. A KW algorithm is a cryptographic construction that symmetrically encapsulates keys. KWs are used for protecting keys in untrusted storage or for transmitting them over the wire; for the latter, KEMs (explained later) are preferred. KWs offer confidentiality and integrity.
Inputs:
- The key to be protected
- Another key to protect the first, also called "master key"
Output: the encrypted key.
Examples are AESKW (good) and TDKW (bad).
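A minimal sketch of AESKW (RFC 3394 AES Key Wrap), assuming Python with the third-party cryptography package:

```python
import os

from cryptography.hazmat.primitives.keywrap import aes_key_wrap, aes_key_unwrap

master_key = os.urandom(32)  # the key that protects other keys
data_key = os.urandom(32)    # the key we want to store or ship safely

wrapped = aes_key_wrap(master_key, data_key)
assert aes_key_unwrap(master_key, wrapped) == data_key  # raises InvalidUnwrap if tampered with
```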
Asymmetric mechanisms have two keys: a private key and a public key, together called a "keypair". The public key is derived from the private key. Security often comes from the difficulty of recovering the private key from the public one.
In an asymmetric encryption scheme, the public key is used to encrypt a message, and the private key is used to decrypt it; therefore, this offers confidentiality. Asymmetric encryption can usually encrypt only very short messages, so the few cryptosystems that use it do so to initialize the secure channel; the actual messages are encrypted symmetrically. Nowadays, asymmetric encryption algorithms are very rarely used, and a different approach (KEM/KEX + signature) is preferred.
For encryption, inputs:
- The message
- The public key of the recipient
- Possibly a nonce
The output is the ciphertext.
For decryption, inputs:
- The ciphertext
- The private key
The output is the plaintext.
Examples are ElGamal (good) and RSA (good but dangerous if not used correctly).
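A minimal RSA sketch with OAEP padding (the padding scheme that makes RSA encryption safe to use), assuming Python with the third-party cryptography package:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

private_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
public_key = private_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Anyone with the public key can encrypt; only the private key holder can decrypt
message = b"a short secret, e.g. a symmetric key"
ciphertext = public_key.encrypt(message, oaep)
assert private_key.decrypt(ciphertext, oaep) == message
```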
In a digital signature algorithm, the private key is used to sign a message and the public key is used to verify it. This offers authenticity. Note that it is discouraged to use the same keypair for encryption and signing: depending on the algorithm, this is potentially insecure. Digital signatures are used everywhere and there are many algorithms for them.
For signing, inputs:
- The message
- The private key
- Possibly a nonce
The output is the signature, kind of like a MAC.
For verifying, inputs:
- The message
- Its signature
- The public key
The output is binary: pass or fail.
Examples are Ed25519 (good) and DSA (dangerous).
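A minimal Ed25519 sketch, assuming Python with the third-party cryptography package:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

message = b"I, Alice, owe Bob 10 euros"
signature = private_key.sign(message)

# Verification is binary: returns nothing on success, raises InvalidSignature on failure
public_key.verify(signature, message)
```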
KEX is short for Key Exchange, one of the most important classes of modern cryptography. A key exchange algorithm focuses on generating a shared secret with two keypairs. Alice has her private key (a) and Bob's public key (B). Bob has his private key (b) and Alice's public key (A). Combining 'a' and 'B' gives the exact same result as combining 'b' and 'A'. Both parties, with completely different numbers, reach the same result.
The tricky point is to securely send the public keys over the wire: the exchange channel must offer authenticity, because an attacker who can swap the public keys in transit can silently sit in the middle. The shared secret must never be used directly as a key; instead, actual keys must be derived from it using a KDF.
If a key exchange is performed at the beginning of each messaging session, then it's called "ephemeral". An ephemeral KEX performed on an authentic channel offers a new cryptographic property: PFS, or Perfect Forward Secrecy; that is, if the outer SC is broken in the future, all previous sessions remain safe, because the actual key is never sent.
Inputs:
- My private key
- The other party's public key (securely received)
Output: the shared secret.
Examples are X25519 (good) and non-EC Diffie-Hellman (a bit dangerous).
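A minimal sketch of an X25519 exchange followed by key derivation, assuming Python with the third-party cryptography package:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each party generates a keypair; the public halves travel over an *authentic* channel
alice_private = X25519PrivateKey.generate()
bob_private = X25519PrivateKey.generate()

# Combining a with B on one side and b with A on the other yields the same shared secret
alice_secret = alice_private.exchange(bob_private.public_key())
bob_secret = bob_private.exchange(alice_private.public_key())
assert alice_secret == bob_secret

# The shared secret is never used directly as a key: derive actual keys with a KDF
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"CryptoChat-v1 session").derive(alice_secret)
```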
Give KEX a try in the real world!
KEM is short for Key Encapsulation Mechanism. Unlike a key exchange, a KEM focuses on sending a symmetric key using asymmetric cryptography.
Most post-quantum mechanisms are KEMs.
Nowadays, asymmetric encryption is pretty much only used for the same purpose as a KEM. KEMs are preferred because of their speed and simplicity (fewer pitfalls, harder to get wrong).
Inputs:
- My randomly generated symmetric key
- The other party's public key
- Possibly a nonce
Output: the encrypted symmetric key
Examples are ECIES (not actually a standalone KEM, but good) and FrodoKEM (very modern, its security is not yet 100% known).
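Post-quantum KEMs like FrodoKEM are not in the libraries used elsewhere in these examples, so here is a toy Diffie-Hellman-based KEM built on X25519 (roughly the KEM half of ECIES) just to show the encapsulate/decapsulate interface. The helper names are mine, not a standard API, and in this flavour the symmetric key comes out of the KEM rather than being supplied to it:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey, X25519PublicKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat


def _kdf(secret: bytes) -> bytes:
    return HKDF(algorithm=hashes.SHA256(), length=32, salt=None, info=b"toy-kem").derive(secret)


def encapsulate(recipient_public: X25519PublicKey) -> tuple[bytes, bytes]:
    """Sender side: returns (symmetric key, encapsulation to send over the wire)."""
    ephemeral = X25519PrivateKey.generate()
    key = _kdf(ephemeral.exchange(recipient_public))
    capsule = ephemeral.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)
    return key, capsule


def decapsulate(recipient_private: X25519PrivateKey, capsule: bytes) -> bytes:
    """Recipient side: recovers the same symmetric key from the encapsulation."""
    return _kdf(recipient_private.exchange(X25519PublicKey.from_public_bytes(capsule)))


bob = X25519PrivateKey.generate()
key_for_alice, capsule = encapsulate(bob.public_key())
assert decapsulate(bob, capsule) == key_for_alice
```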
RNG is short for Random Number Generator. In cryptography, we use CSPRNGs, or Cryptographically Secure Pseudorandom Number Generators. These are algorithms capable of producing large quantities of data that is indistinguishable from random (sometimes terabytes of it) from a small input called a "seed". Security comes from the difficulty of:
- Recovering the seed
- Guessing the previous or next produced bit
In the absolutely best case, a cryptographic algorithm is as secure as the random data that was put into it. If this random data was not generated correctly, then all security is lost. As an example, AES-256 gives 256 bits of security at maximum, but only if the used key has 256 bits of entropy.
Choosing a decent CSPRNG is important, but even more so is gathering good entropy (for a good seed). This task is very difficult. True randomness does not exist, and one can only approximate it. Flipping a coin is not random, but in most scenarios it's "random enough", since predicting a physical event is difficult. CPUs by themselves are incapable of producing random numbers good enough to use as entropy. While so-called TRNGs (true RNGs) exist, often in the form of physical hardware capable of generating good entropy, in the general use case randomness is gathered from external physical events.
Computers find this easier to do with time measurements: mixing two clocks working at different frequencies with different accuracies is known to be a good entropy source. Mixing the extremely fast and accurate CPU clock with the slow and imprecise human brain gives very good results. The duration, speed, and acceleration of mouse movements and the time between keypresses are decent sources of entropy, or "pools". Several pools are mixed into what becomes the actual available entropy. This is implemented in all modern operating systems. However, giving an accurate estimation of entropy is difficult, and most cryptographic certifications are very picky about what they allow. For the general personal use case, /dev/random in Linux is good enough (not urandom!)
Examples of CSPRNGs are CTR_DRBG (good), Mersenne-Twister (PRNG, not a CSPRNG, must never be used for crypto), and Dual_EC_DRBG (intentionally broken and published by the NSA in 2006).
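In practice you rarely run a DRBG yourself; you ask the operating system, which already does the entropy gathering and mixing described above. A minimal sketch with Python's standard library:

```python
import secrets

key = secrets.token_bytes(32)    # 256 bits from the OS CSPRNG
nonce = secrets.token_bytes(12)

# Never use the random module for crypto: it's a plain PRNG (Mersenne Twister)
# and its output is predictable once enough of it has been observed.
```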
Even if your algorithm choice is top-notch, your parameterization is well studied, your implementation is free of bugs, and your operating system is 100% secure, you're not safe.
Each time a CPU accesses memory or performs a jump in the code, it leaves a footprint. This footprint may be in the form of very minor timing differences or power consumption fluctuations. It is known that accurately measuring these phenomena has a very real chance of leaking keys.
Good cryptographic implementations consider this and always consume the same amount of resources and perform exactly the same steps regardless of the input. No if statements.
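A classic instance is MAC verification: comparing bytes with == returns as soon as one byte differs, and that timing difference leaks information about where the mismatch is. A minimal sketch of the safe alternative in Python:

```python
import hmac

expected_tag = bytes.fromhex("a1b2c3d4" * 8)
received_tag = bytes.fromhex("a1b2c3d4" * 8)

# Leaky: == short-circuits at the first differing byte, so its timing depends on the data
insecure_match = (expected_tag == received_tag)

# Constant-time: examines every byte no matter where (or whether) a mismatch occurs
secure_match = hmac.compare_digest(expected_tag, received_tag)
```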
The slow but seemingly steady increase in the capacity of quantum computers has opened some questions regarding how future-proof our cryptography is. This is why a lot of effort is currently being put into inventing new quantum-resistant crypto. This effort is slowly producing promising results, with many post-quantum (PQ) algorithms in the process of being standardized, mostly by NIST.
This is an awkward situation. We do not trust our current algorithms for the future, but the PQ ones are very new and we haven't had much time to study them. This is why we do "hybridization": a mix of a classical mechanism and a PQ one. Only if both are broken can an adversary break the secure channel.
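A sketch of just the combining step (Python, third-party cryptography package): a classical X25519 secret and a PQ KEM secret are fed into a single KDF, so the derived key survives as long as either half holds. The PQ half here is a placeholder, since PQ algorithms are not in the libraries used elsewhere in these examples:

```python
import secrets

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Classical half: a real X25519 exchange
alice, bob = X25519PrivateKey.generate(), X25519PrivateKey.generate()
classical_secret = alice.exchange(bob.public_key())

# Post-quantum half: placeholder bytes standing in for the shared secret a PQ KEM would output
pq_secret = secrets.token_bytes(32)

# Both halves go into a single KDF; an adversary must break both to recover the session key
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"hybrid-handshake-v1").derive(classical_secret + pq_secret)
```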
A secure channel can never be constructed from scratch starting from an insecure channel. Pre-shared information over a different secure channel is mandatory.
The best option for Alice to get Bob's public key is to ask him to hand her a piece of paper with it. Since this is infeasible, having an intermediary, Chloe, is the second-best choice. Chloe will give her public key to Alice on a piece of paper. It can also be a symmetric key, in which case it's called a PSK, or pre-shared key. Later on, Chloe can sign Bob's public key and give it to Alice over an insecure channel. Since Alice trusts Chloe's public key, and if she also trusts that Chloe is not a bad person, then she can be certain she has Bob's public key. Note how Chloe's keypair has created an authentic channel.
In the end, there's no way around the piece of paper. One must always get a key in hand in order to trust it. You connect to the internet securely because you already have the keys of many intermediaries; they were in your computer when you bought it. Here's a question that may trouble some: do you trust them?