Notes on Zero-Knowledge Proofs and Secure Remote Password (SRP) Protocol

6 Aug 2024

Today I learned about using zero-knowledge proofs in the context of passwords. These are my rough-and-ready notes from reading. Apparently OpenSSL has an implementation of the SRP algorithm.

Math-based ZKP example #

Source for this example comes from Wikipedia. It might be good to read that in tandem with these notes.

In this example Peggy is the person who wishes to prove knowledge about something to Victor, the verifier. Peggy is proving that she knows some value \(x\) , but she doesn’t want to reveal the value of \(x\) .

Peggy and Victor need to share a large prime \(p\) and a generator \(g\) . (This means that \(g\) and \(p\) must be relatively prime.)

Peggy computes \(g^x \mod{p} = y\) and sends \(y\) to Victor.

Peggy generates a random number \(r\) and computes \(C = g^r \mod{p}\) and sends \(C\) to Victor.

Victor randomly issues one of two challenges:

Victor asks for \(r\) . Peggy sends him \(r\) and Victor verifies that \(C\) matches \(g^r \mod{p}\) .
Victor asks for \(s = (x + r) \mod{(p-1)}\) . Peggy computes this and sends the result to Victor. Victor checks that \(g^s \equiv (C \cdot y) \mod{p}\) .

Repeat process \(n\) times to drive the probability that Peggy was just guessing to \(\frac{1}{2^n}\) .

The Wikipedia article has a good explanation for how an attacker could not mimic knowing \(x\) with this interactive proof.

Digression: properties of exponents modulo a prime #

The last step works because

\(\begin{aligned} C \cdot y &\equiv g^r \cdot g^x &\mod{p} \\ &\equiv g^{r + x \mod{p-1}} &\mod{p} \end{aligned}\)

When working \(\mod{p}\) , operations on combining exponents are \(\mod{p-1}\) . This is a consequence of Fermat’s Little Theorem. Proof:

\(a^e = a^{p-1} \cdot a^{p-1} \cdot a^{p-1} \cdots a^n\)

Where \(a^{p-1}\cdot a^{p-1} \cdots a^{p-1} \equiv 1 \mod{p}\) by Fermat’s theorem, and \(n < p\) and \(e = m(p-1) + n\) by the division algorithm. Therefore, \(n \equiv e \pmod{p-1}\) .

ZKPs used for password-based authentication #

The above framework is not useful as-is for password authentication.

There is a method for verifying that a user knows a password without revealing the password to the server. The standard is called “SRP” (Secure Remote Password) and there’s at least a version 6. As far as I can tell, version 6 is the most up-to-date version as of writing.

Resources:

SRP Protocol Wikipedia article has a good explaination.

To convert between PostScript format (.ps) and PDF, run ps2pdf srp6.ps. On macOS I got the ps2pdf program by installing the ghostscript package via Homebrew.
SRP protocol design document; includes links to a paper that I followed. I can’t find this paper in any official publication registry. URL is: http://srp.stanford.edu/srp6.ps and the title is: “SRP-6: Improvements and Refinements to the Secure Remote Password Protocol”. Note: it comes in PostScript format, so you’ll likely want to convert it to PDF to read it.
SE post: “Why aren’t ZKPs used in practice for authentication?”, top answer is excellent.

Running SRP-6a #

Shared: large safe prime \(N\) (suggested that \(N = 2 * p + 1\) where \(p\) is prime) and primitive root \(g\) . (I.e., \(N\) and \(g\) must be relatively prime.)

In this algorithm, the values \(a\) and \(b\) will be randomly generated. At the end, both parties will have a secret key \(K\) that they share.

Client sends identifier \(I\) to Server.
Server looks up the salt and the verification token \((s,v)\) associated with \(I\) and sends just \(s\) to Client.
Client computes hash of salt, ID, and password \(x = H(s, I, P)\) .
Client generates a random value \(a\) and computes \(A = g^a\) . Client sends \(A\) to Server.
Server and client compute \(k = H(N, g)\) . This is an enhancement from the older SRP-6 algorithm.
Server generates a random value \(b\) and computes \(B = kv + g^b\) . Server sends \(B\) to client.
Server and Client both compute \(u = H(A,B)\) . This is called the scrambler.

Now both parties have access to \(A, B, k, u\) and \(g, N\) of course. With this they can each create a shared session key:

Client computation: \(K = (B-kg^x) ^ {(a + ux)}\)
Server computation: \(K = (Av^u)^b\)

The server and client must now both verify that they have the same value \(K\) . One simple way to do this is to hash \(K\) (potentially with some other salting information like \(A\) and \(B\) ) and transmit that. Example from the paper:

\( \begin{aligned} M_1 &= H(A,B,K) \\ M_2 &= H(A,M_1,K) \end{aligned}\)

The Client computes \(M_1\) and sends it to the server. The server should have enough information to recompute this value. Once that’s done, the server can compute \(M_2\) and send that back to the Client. (This last step is optional.) Now both parties know that they’ve got the right key. Use \(K\) as the session token.

The Wikipedia article mentions that it is important that the client send its proof of \(K\) (i.e., the proof is the value \(M_1\) ) first, and that the server should not reply with \(M_2\) if verification fails.

Here’s what the communication flow would look like:

Client \(I\) → Server
Server \(g,N,s,B\) → Client
Client \(A, M_1\) → Server
Server \(M_2\) → Client

Now the Server and Client can communicate using secret key \(K\) , which was only granted to the Client because it had the password that corresponded to the stored verifier \(v\) on the server.