the RSA assumption — alyssa chai

While at the University of Edinburgh I got to take a class in Modern Cryptography. Most of this class revolved around proving security around computational hardness. We got to work with PRGs and PRFs, reductions to security standards like CCA and CPA, and in the very last week there was a brief introduction to Zero Knowledge Proofs and proving security via simulations and computational indistinguishability!

Below is a not-so formal reduction of how RSA-OAEP is in fact IND-CPA secured by the RSA Assumption, which is the hardness of finding $e$-th roots modulo $N$. I thought this would be interesting because in class we only went as far as black box reductions, but this reduction relies on knowledge extraction. Compared to just constructing a Distinguisher $D$ that simply outputs whatever $\mathcal{A}$ outputs, $D$ has to keep track of and parse the results of $\mathcal{A}$ to make a decision.

The RSA-OAEP Protocol

Gen(1ⁿ) → (pk, sk)

Choose two large primes $p$ and $q$
Compute $N = pq$
Compute Euler's totient $\varphi(N) = (p-1)(q-1)$
Choose $e$ such that $1 < e < \varphi(N)$ and $\gcd(e, \varphi(N)) = 1$
Compute $d = e^{-1} \bmod \varphi(N)$
Output public key $\mathsf{pk} = (N, e)$ and secret key $\mathsf{sk} = (N, d)$

Enc(pk, m) → c

Compute $X = m \oplus G(r)$ and $Y = r \oplus H(X)$ where $r$ is a random nonce, and $G$ and $H$ are random oracles
The padded message is $\mu = (X \| Y)$
Output $c = \mu^e \bmod N$

Dec(sk, c) → m

Recover $\mu = c^d \bmod N$
Split into $X$ and $Y$, then recover $r = Y \oplus H(X)$
Recover $m = X \oplus G(r)$

IND-CPA Security Game Rules

Set Up

The challenger runs Gen to produce a public/private key pair (pk, sk). Like in the real world, the adversary knows the encryption scheme and the pk, the only thing kept secret is sk. The adversary outputs two messages $m_0$ and $m_1$ of equal length. The challenger picks a random bit $b \in \{0, 1\}$, encrypts $m_b$, and returns the challenge ciphertext $c^*$.

Query Phase

The adversary can submit any message $m_i$ of their choosing (where $m_i \neq m_0$ and $m_i \neq m_1$) to an oracle $\mathcal{O}$ which will return the respective ciphertext $c_i$. The adversary can continue making encryption queries as long as they occur within PPT.

Challenge Phase and Win Condition

$\mathcal{A}$ outputs a bit $b'$ and wins if $b' = b$ used in $m_b$.

The scheme is IND-CPA secure if for every PPT adversary, their advantage over random guessing is negligible:

$$\Pr\!\left(\mathsf{PrivK}^{\mathsf{CPA}}_{\mathcal{A},\Pi}(1^n) = 1\right) \leq \tfrac{1}{2} + \mathsf{negl}(n)$$

We can prove that this is in fact IND-CPA secure on the RSA assumption through a formal reduction.

Assumptions

Assume $\exists$ PPT adversary $\mathcal{A}$ that breaks IND-CPA of RSA-OAEP with non-negligible advantage:

$$\Pr\!\left(\mathsf{PrivK}^{\mathsf{CPA}}_{\mathcal{A},\Pi}(1^n) = 1\right) \geq \tfrac{1}{2} + h(n) \text{ for some non-negligible } h$$

Assume the RSA problem is hard meaning no PPT distinguisher $D$ can invert RSA with non-negligible advantage:

$$\Pr\!\left(D(N, e, y) = x : y = x^e \bmod N\right) \leq \mathsf{negl}(n)$$

Construct a Distinguisher $D$ that receives an RSA challenge $(N, e, y)$. $D$ wants to find an $x$ such that $x^e \equiv y \pmod{N}$. To do this, $D$ will use $\mathcal{A}$ which we assume can break RSA-OAEP. $\mathcal{A}$'s goal is to win the IND-CPA game written earlier.

Constructing Distinguisher D

$D$ receives an RSA challenge $(N, e, y)$ where $y = x^e \bmod N$ and must recover $x$. $D$ uses $\mathcal{A}$ as a subroutine.

$D$ forwards the public key $(N, e)$ to $\mathcal{A}$. $\mathcal{A}$ responds with two messages $m_0$ and $m_1$.
$D$ simulates the random oracles $H$ and $G$. For any fresh query, $D$ picks a uniform random string, records it in G-List or H-List, and returns it. For repeat queries, $D$ returns the same value.
$D$ sets the challenge ciphertext $c^* = y$ and forwards it to $\mathcal{A}$ as the encryption of $m_b$. $\mathcal{A}$ runs, making oracle queries throughout, which $D$ answers via its simulated oracles.
$D$ keeps a complete log of all queries $\mathcal{A}$ makes to $H$.

For $\mathcal{A}$ to distinguish $c^*$ with a non-negligible advantage, $\mathcal{A}$ must eventually query the random oracle $H$ on the correct value used during OAEP padding, which we will refer to as the AskH event. In a separate proof, which is omitted in this blog, you can actually prove that $\Pr[\mathsf{AskH}] \geq h(n) - \mathsf{negl}(n)$.

The intuition behind this though is that without querying $H(X)$, the adversary can't recover $r$ or learn anything useful about the message. Therefore, any adversary that distinguishes the challenge ciphertext from random must query $H$ on this value with non-negligible probability.

$D$ scans its H-List to find $\mu_i^e \equiv y \pmod{N}$, which gives the RSA preimage, so $D$ can just output $x = \mu$.

World 0: $y$ is a valid RSA ciphertext of $m_b$ and $\mathcal{A}$ perfectly simulates RSA-OAEP. Wins with probability $\geq \tfrac{1}{2} + h(n)$.
World 1: $y$ is a uniform random element of $\mathbb{Z}_N$ which is just random. $\mathcal{A}$ wins with probability exactly $\tfrac{1}{2}$ cause it can do no better than just guessing.

If $\mathcal{A}$ distinguishes World 0 from World 1 with non-negligible advantage $h(n)$, then it must have queried $H$, and $D$ recovers $x$ from that query log with the same probability:

$$\Pr[D \text{ inverts RSA}] \;=\; \Pr[\mathcal{A} \text{ wins IND-CPA}] \;\geq\; h(n) - \mathsf{negl}(n)$$

But by the RSA hardness assumption:

$$\Pr[D \text{ inverts RSA}] \;\leq\; \mathsf{negl}(n)$$ $$\mathsf{negl}(n) \;\geq\; h(n) - \mathsf{negl}(n) \quad\Longrightarrow\quad h(n) \;\leq\; \mathsf{negl}(n)$$

We run into a contradiction so no such $\mathcal{A}$ can exist and RSA-OAEP is IND-CPA secure.

And that's it! I would not call this a formal proof by any means as I definitely left out a few lemmas such as including the probability of oracle collisions, but I think this encapsulates the general idea.

To go a step further (which I did not do :D), you can also prove that RSA-OAEP is IND-CCA secure which is chosen as cipher text secure. This game functions the exact same way as IND-CPA except the adversary also gets access to a decryption oracle where it can decrypt any cipher texts, and provides a more stricter security definition because the adversary now has more power.