Pedersen Commitments, Divide-and-Conquer, and the Inner Product Argument

Dankrad Feist has a terrific post explaining Pedersen commitments, a commitment scheme that works on an elliptic curve. The Pedersen commitment ecosystem includes the inner product argument (a protocol for proving that a prover has correctly computed the inner product of two committed vectors) and Bulletproofs (a full proof system built on the inner product argument).

In short, the Pedersen commitment lets you commit to a vector of arbitrary length by giving just a single group element. The inner product argument takes as input two committed vectors $\vec{a}$ and $\vec{b}$ – the prover knows both $\vec{a}$ and $\vec{b}$, but the verifier only knows the commitments $A$ and $B$, which are elements of $G$ – and allows the prover to prove to the verifier that their inner product $\langle \vec{a}, \vec{b} \rangle$ has a certain value. The proof uses a divide-and-conquer trick: at each step, you break the vector of length $n$ into two vectors of length $n/2$, and then you reduce the problem to an inner product of vectors of length $n/2$.

The goal of this note is to explain how you would think of the inner product argument. We’ll start with a simpler question: when the prover sends the verifier a commitment to a vector $\vec{a}$, it just looks like a plain old group element. How can the prover prove she knows the vector $\vec{a}$, without actually revealing $\vec{a}$? This will lead us to a simpler version of the divide-and-conquer trick. Then we’ll see how a more complex application of the same algebra gives the inner product argument as well.

Pedersen commitments

To start with, let’s recall how Pedersen commitments work. Suppose you have a group $G$ – concretely, let’s say $G$ is an elliptic curve over some large finite field. Fix once and for all a bunch of “random” elements of $G$, say $g_1, g_2, \ldots, g_n$. Now suppose you want to commit to a vector $\vec{a} = (a_1, \ldots, a_n)$ of scalars, for some $n$. Your commitment is simply the group element

$$C(\vec{a}) = a_1 g_1 + a_2 g_2 + \cdots + a_n g_n.$$

To open your commitment, you simply reveal the full vector $\vec{a}$, and the other party can verify that the commitment matches the vector.

Why does this work? You want a commitment scheme to have two properties: hiding and binding. Hiding means that it’s hard to recover $\vec{a}$ from the commitment $C(\vec{a})$. Binding means that the committer is committed to $\vec{a}$: it’s hard to find two different vectors $\vec{a}$ and $\vec{b}$ such that $C(\vec{a}) = C(\vec{b})$. We’ll see that our scheme is binding but not hiding.

Let’s assume that the discrete logarithm problem is hard in $G$. For us, this means it’s hard to find scalars $(x_1, \ldots, x_n)$, not all zero, such that $x_1 g_1 + \cdots + x_n g_n = 0$. (This is widely believed to be true if $G$ is a large elliptic curve, for example.) This means it’s hard to find $\vec{a} \neq \vec{b}$ such that $C(\vec{a}) = C(\vec{b})$. But if $C(\vec{a}) = C(\vec{b})$, then $(a_1 - b_1) g_1 + \cdots + (a_n - b_n) g_n = 0$, a nontrivial relation among the $g_i$, so we have proved that Pedersen commitments are binding.

On the other hand, Pedersen commitments as we’ve presented them don’t hide the committed vector. For example, imagine you commit to a vector and send the commitment to someone – but the other person knows that your vector has to be one of a short list of vectors. They can simply test those vectors one by one, computing $C(\vec{v})$ for each candidate $\vec{v}$, until they find a commitment that matches.

One solution to this is to introduce a “blinding term” $r h$, where $h$ is one more fixed “random” element of $G$ and the scalar $r$ is chosen at random. Instead of $C(\vec{a})$, you send $C(\vec{a}) + r h$ – and to open the commitment, you reveal all of $(a_1, \ldots, a_n, r)$.

The blinding term makes the “guess the committed vector” attack impossible. In fact, since $r$ is chosen uniformly at random, the commitment $C(\vec{a}) + r h$ will be statistically equally likely to be any element of $G$ (assuming $h$ generates $G$). So you have a strong statistical guarantee that the commitment leaks no information at all about the original vector $\vec{a}$.
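In the same toy additive model as before (illustrative only, with no real security), the blinding term looks like this – note that two commitments to the same vector under different blinding scalars are different group elements:

```python
import random

q = 2**61 - 1  # prime; group modeled as Z_q under addition (toy only)
rng = random.Random(1)
g = [rng.randrange(1, q) for _ in range(4)]  # public basis g_1..g_n
h = rng.randrange(1, q)                      # extra public element for blinding

def commit_blinded(a, r):
    """C = a_1 g_1 + ... + a_n g_n + r h; opening reveals (a, r)."""
    return (sum(ai * gi for ai, gi in zip(a, g)) + r * h) % q

a = [3, 1, 4, 1]
r1 = rng.randrange(q)
r2 = (r1 + 1) % q          # a second, different blinding scalar
c1 = commit_blinded(a, r1)
c2 = commit_blinded(a, r2)
assert c1 != c2            # same vector, different-looking commitments
```

This is exactly why the guess-and-check attack fails: the attacker would also have to guess $r$.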

Proving you know a vector (Part 1: Hiding)

Let’s suppose a prover sent a commitment $C$ to a verifier, but the verifier wants to know that this group element is a commitment to a vector. In other words, the prover has to prove that she knows $(a_1, \ldots, a_n)$ such that

$$C = a_1 g_1 + \cdots + a_n g_n.$$

(The basis elements $g_1, \ldots, g_n$ are publicly known.)

One option is for the prover to simply reveal the scalars $(a_1, \ldots, a_n)$. This is unsatisfying, for two reasons. First, the prover doesn’t hide the vector $\vec{a}$ from the verifier. And second, the protocol requires a proof of length $O(n)$. We’ll see how to achieve a proof that is both hiding and succinct.

Let’s start with hiding. To hide $\vec{a}$, the prover simply chooses a second random “blinding vector” $\vec{b} = (b_1, \ldots, b_n)$ and sends the commitment

$$D = b_1 g_1 + \cdots + b_n g_n.$$

The verifier responds with some random challenge $x$, and asks the prover to show that $D + xC$ is a known linear combination of $g_1, \ldots, g_n$. The prover replies with the vector

$$\vec{z} = \vec{b} + x \vec{a},$$

and the verifier checks that indeed

$$z_1 g_1 + \cdots + z_n g_n = D + x C.$$

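One round of this challenge–response protocol can be sketched in the same toy additive group as before (schematic and insecure; the group and names are stand-ins for a real curve implementation):

```python
import random

q = 2**61 - 1  # prime; group modeled as Z_q under addition (toy model)
rng = random.Random(0)
n = 4
g = [rng.randrange(1, q) for _ in range(n)]  # public basis

def commit(vec):
    return sum(v * gi for v, gi in zip(vec, g)) % q

# Prover's secret vector and its public commitment.
a = [3, 1, 4, 1]
C = commit(a)

# Round 1: prover picks a random blinding vector b, sends D = commit(b).
b = [rng.randrange(q) for _ in range(n)]
D = commit(b)

# Round 2: verifier sends a random challenge x.
x = rng.randrange(1, q)

# Round 3: prover reveals z = b + x*a (componentwise, mod q).
z = [(bi + x * ai) % q for ai, bi in zip(a, b)]

# Verifier's check: z must open the combined commitment D + x*C.
assert commit(z) == (D + x * C) % q
```

The verifier never sees $\vec{a}$ itself – only the blinded combination $\vec{z}$.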
Why does this work? Simple algebra shows that, if the prover is playing honestly, the verifier’s check will pass. But what if the prover is not? The intuitive idea is that, if the prover does not know an expansion for $C$ in terms of the elements $g_1, \ldots, g_n$, then she won’t be able to give an expansion for $D + xC$ – at least, not for most values of $x$. In fact, if the prover knows those expansions for even two values of $x$, say $x_1$ and $x_2$, then the prover can compute an expansion for $C$ itself, since

$$C = \frac{(D + x_1 C) - (D + x_2 C)}{x_1 - x_2} = \sum_i \frac{z_i(x_1) - z_i(x_2)}{x_1 - x_2} \, g_i.$$

Here is another way to think about this situation, that will be useful for security proofs later on. Where did the prover get $C$ and $D$? There are basically two ways to get group elements: either the prover computed them from some already-known elements (like $g_1, \ldots, g_n$, or maybe others), or the prover chose them as new elements (e.g. at random), having no known relations with previously-chosen elements. In either case, let’s write

$$C = \sum_j c_j e_j, \qquad D = \sum_j d_j e_j,$$

where $e_1, \ldots, e_m$ are known group elements with no known relations between them, and the basis elements $g_1, \ldots, g_n$ are among the $e_j$. (If the prover chose $C$ at random, simply take $C$ itself to be one of the independent group elements $e_j$, and take the corresponding coefficient $c_j$ to be $1$.)

Now the basic assumption is that the prover cannot find a nontrivial relation among the elements $e_1, \ldots, e_m$ (except with very small probability of success). The verifier makes the challenge $x$, and the prover responds by revealing some $\vec{z}$ such that

$$z_1 g_1 + \cdots + z_n g_n = D + x C.$$

This means that

$$\sum_i z_i g_i = \sum_j (d_j + x c_j) e_j.$$

At this point, either the prover has found a nontrivial relation, or all the coefficients on the $g_i$’s and the other $e_j$’s have to match. Since it’s assumed to be hard to find a relation, we can conclude that all the coefficients match. In particular,

$$d_j + x c_j = 0$$

for each $j$ such that $e_j$ is not one of the basis elements $g_i$.

But here’s the catch. The prover chose the coefficients $c_j$ and $d_j$ before the verifier chose $x$. If $c_j$ and $d_j$ are not both zero, then only one value of $x$ can possibly satisfy $d_j + x c_j = 0$. The probability that the verifier happened to choose that particular $x$ is vanishingly small. So we can conclude that all the $c_j$’s and $d_j$’s on the non-basis elements are zero – in other words, that

$$C = c_1 g_1 + \cdots + c_n g_n, \qquad D = d_1 g_1 + \cdots + d_n g_n,$$

with coefficients the prover knows.

Proving you know a vector (Part 2: Succinctness)

We’ve seen how to prove we know a vector whose commitment is $C$, while keeping the vector hidden. Now let’s try to make the protocol more succinct.

In fact, once we apply the hiding protocol once, we can forget about the hiding issue entirely. If the prover wants to prove knowledge of a private vector $\vec{a}$ such that $C = a_1 g_1 + \cdots + a_n g_n$, the prover and the verifier simply run the protocol above (prover sends $D$, verifier sends $x$) to reduce to the problem of proving knowledge of a vector whose commitment is $D + xC$. The original $\vec{a}$ and $C$ have been replaced with $\vec{b} + x\vec{a}$ and $D + xC$. There is no longer any need to keep them secret, since the random vector $\vec{b}$ hides everything about $\vec{a}$ anyway.

So, we may as well rename $\vec{b} + x\vec{a}$ and $D + xC$ to $\vec{a}$ and $C$ again, and forget the hiding issue entirely.

Now let us suppose $n$ is even, and break down our vector $\vec{a}$ of length $n$ into two vectors

$$\vec{a}_L = (a_1, \ldots, a_{n/2}), \qquad \vec{a}_R = (a_{n/2+1}, \ldots, a_n)$$

of length $n/2$.

While we’re at it, let’s define

$$\vec{g}_L = (g_1, \ldots, g_{n/2}), \qquad \vec{g}_R = (g_{n/2+1}, \ldots, g_n),$$

and write

$$C_{LL} = \langle \vec{a}_L, \vec{g}_L \rangle, \qquad C_{LR} = \langle \vec{a}_L, \vec{g}_R \rangle,$$

where $\langle \vec{a}, \vec{g} \rangle$ is shorthand for $a_1 g_1 + a_2 g_2 + \cdots$;

define $C_{RL}$ and $C_{RR}$ similarly.

The idea is to replace the vector $\vec{a}$ with a random linear combination $x \vec{a}_L + y \vec{a}_R$, and the group elements $\vec{g}$ with a random linear combination $y \vec{g}_L + x \vec{g}_R$. The new commitment will then be

$$C' = \langle x \vec{a}_L + y \vec{a}_R, \; y \vec{g}_L + x \vec{g}_R \rangle = xy \, C_{LL} + x^2 \, C_{LR} + y^2 \, C_{RL} + xy \, C_{RR}.$$

To make the protocol work, the prover first sends the four group elements $C_{LL}, C_{LR}, C_{RL}, C_{RR}$, and the verifier checks that $C_{LL} + C_{RR} = C$. Then the verifier sends random challenges $x$ and $y$. Finally, prover and verifier iterate the process, with

$$\vec{a}' = x \vec{a}_L + y \vec{a}_R, \qquad \vec{g}' = y \vec{g}_L + x \vec{g}_R,$$

and $C' = xy\,C + x^2\,C_{LR} + y^2\,C_{RL}$ in place of $\vec{a}$, $\vec{g}$, and $C$.
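The bilinear identity driving this halving step is easy to check numerically. Here is a sketch in the toy additive group used above (the group model and names are illustrative assumptions, not a real implementation):

```python
import random

q = 2**61 - 1  # prime modulus; group modeled additively as Z_q (toy only)
rng = random.Random(2)
n = 8
g = [rng.randrange(1, q) for _ in range(n)]
a = [rng.randrange(q) for _ in range(n)]

def ip(vec, pts):
    """<vec, pts> = sum of vec_i * pts_i in the toy group."""
    return sum(v * p for v, p in zip(vec, pts)) % q

half = n // 2
aL, aR = a[:half], a[half:]
gL, gR = g[:half], g[half:]

C   = ip(a, g)               # original commitment
CLL = ip(aL, gL); CLR = ip(aL, gR)
CRL = ip(aR, gL); CRR = ip(aR, gR)
assert (CLL + CRR) % q == C  # the verifier's consistency check

# Verifier's random challenges.
x = rng.randrange(1, q); y = rng.randrange(1, q)
a_new = [(x * l + y * r) % q for l, r in zip(aL, aR)]
g_new = [(y * l + x * r) % q for l, r in zip(gL, gR)]

# The folded commitment equals xy*C + x^2*C_LR + y^2*C_RL.
assert ip(a_new, g_new) == (x * y * C + x * x * CLR + y * y * CRL) % q
```

After one step, both the vector and the basis have half the length, and the process can repeat.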

Proving security

Why is this sound? The idea of the proof is the same as before: Suppose the prover sends

$$C_{LL} = \sum_j c^{LL}_j e_j, \quad C_{LR} = \sum_j c^{LR}_j e_j, \quad C_{RL} = \sum_j c^{RL}_j e_j, \quad C_{RR} = \sum_j c^{RR}_j e_j,$$

where as before the $e_j$ are known group elements with no known relations, among them $g_1, \ldots, g_n$.

If the protocol passes, then the group element

$$xy\,C_{LL} + x^2\,C_{LR} + y^2\,C_{RL} + xy\,C_{RR}$$

must be a combination of the elements $y g_i + x g_{i+n/2}$. Since discrete logarithm is hard, the prover can’t come up with two expressions for the same group element in terms of $g_1, \ldots, g_n$ – so, for each $i \le n/2$, the $g_{i+n/2}$-coefficient must be $x/y$ times the $g_i$-coefficient. That is,

$$x \left( xy\,c^{LL}_i + x^2\,c^{LR}_i + y^2\,c^{RL}_i + xy\,c^{RR}_i \right) = y \left( xy\,c^{LL}_{i+n/2} + x^2\,c^{LR}_{i+n/2} + y^2\,c^{RL}_{i+n/2} + xy\,c^{RR}_{i+n/2} \right).$$

This polynomial equality holds for randomly chosen $x$ and $y$, so (unless the verifier’s random challenge was very unlucky) the two sides are actually the same polynomial of $x$ and $y$. In other words, we can equate coefficients, and obtain:

$$c^{LR}_i = 0, \qquad c^{RL}_{i+n/2} = 0, \qquad c^{LL}_i + c^{RR}_i = c^{LR}_{i+n/2}, \qquad c^{RL}_i = c^{LL}_{i+n/2} + c^{RR}_{i+n/2}.$$

In other words: The thing the prover claimed was $C_{LR}$ was actually $\langle \vec{u}, \vec{g}_R \rangle$, for some vector $\vec{u}$ known to the prover – and similarly for the other three claims. So (if we rename $\vec{u}$ and its counterpart to $\vec{a}_L$ and $\vec{a}_R$) the prover proved that she knows $\vec{a}_L$ and $\vec{a}_R$ such that $C = \langle \vec{a}_L, \vec{g}_L \rangle + \langle \vec{a}_R, \vec{g}_R \rangle$.

Making the protocol more efficient

It turns out that the verifier doesn’t have to send two independent challenges $x$ and $y$. Instead, he can take $y = x^{-1}$, so that

$$C' = C + x^2\,C_{LR} + x^{-2}\,C_{RL}.$$

Since $C$ is already known, this means the prover only has to send the two commitments $C_{LR}$ and $C_{RL}$ before getting back the challenge $x$ from the verifier.
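Putting the pieces together, here is a sketch of the full log-round reduction with the $y = x^{-1}$ shortcut, again in the toy additive model (schematic: prover and verifier are collapsed into one function, and at the bottom the prover simply reveals the length-1 vector):

```python
import random

q = 2**61 - 1  # prime; toy additive model of the group (not secure)
rng = random.Random(3)

def ip(vec, pts):
    return sum(v * p for v, p in zip(vec, pts)) % q

def prove_and_verify(a, g, C):
    """Repeatedly halve: each round the prover sends (C_LR, C_RL),
    the verifier answers with a challenge x, and both sides fold."""
    while len(a) > 1:
        half = len(a) // 2
        aL, aR, gL, gR = a[:half], a[half:], g[:half], g[half:]
        CLR, CRL = ip(aL, gR), ip(aR, gL)   # prover -> verifier
        x = rng.randrange(1, q)             # verifier's challenge
        xi = pow(x, -1, q)                  # x^{-1} mod q
        a = [(x * l + xi * r) % q for l, r in zip(aL, aR)]
        g = [(xi * l + x * r) % q for l, r in zip(gL, gR)]
        C = (C + x * x * CLR + xi * xi * CRL) % q
    # Base case: a single scalar, which the prover just reveals.
    return ip(a, g) == C

n = 8
g = [rng.randrange(1, q) for _ in range(n)]
a = [rng.randrange(q) for _ in range(n)]
assert prove_and_verify(a, g, ip(a, g))
```

For a vector of length $n$, this takes $\log_2 n$ rounds with two group elements sent per round.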

Inner product argument

Finally, let’s come back to the inner product argument. Suppose the prover wants to prove that some commitment (group element) is of the form

$$C = \langle \vec{a}, \vec{g} \rangle + \langle \vec{b}, \vec{h} \rangle + \langle \vec{a}, \vec{b} \rangle u.$$

(Here $\vec{a}$ and $\vec{b}$ are vectors of length $n$. The symbol $\vec{g}$ is short for $(g_1, \ldots, g_n)$, an $n$-tuple of group elements, and $\langle \vec{a}, \vec{g} \rangle = a_1 g_1 + \cdots + a_n g_n$. Similarly for $\vec{h}$. The element $u$ is a single group element, and $\langle \vec{a}, \vec{b} \rangle = a_1 b_1 + \cdots + a_n b_n$ is an inner product of scalar vectors.)

The prover is going to replace both the vectors $\vec{a}$ and $\vec{b}$ and the basis group elements $\vec{g}$ and $\vec{h}$ with random linear combinations. As above, break the vector $\vec{a}$ into two vectors $\vec{a}_L$ and $\vec{a}_R$, each of half the length, so $\vec{a} = (\vec{a}_L, \vec{a}_R)$. Similarly for $\vec{b}$, $\vec{g}$, $\vec{h}$.

The naive generalization of the above protocol would be for the verifier to choose four random challenges $x, y, x', y'$, and then to replace $\vec{a}$, $\vec{b}$, $\vec{g}$, $\vec{h}$ with the random linear combinations

$$\vec{a}' = x \vec{a}_L + y \vec{a}_R, \quad \vec{b}' = x' \vec{b}_L + y' \vec{b}_R, \quad \vec{g}' = y \vec{g}_L + x \vec{g}_R, \quad \vec{h}' = y' \vec{h}_L + x' \vec{h}_R.$$

Before the verifier sends the challenges, the prover would have to send enough data to enable the verifier to compute the new claimed commitment

$$C' = \langle \vec{a}', \vec{g}' \rangle + \langle \vec{b}', \vec{h}' \rangle + \langle \vec{a}', \vec{b}' \rangle u.$$

Since the expression for $C'$ contains product terms in the challenges, the prover would have to commit to all the cross-terms:

$$\langle \vec{a}_L, \vec{g}_L \rangle, \quad \langle \vec{a}_L, \vec{g}_R \rangle, \quad \langle \vec{a}_R, \vec{g}_L \rangle, \quad \langle \vec{a}_R, \vec{g}_R \rangle,$$
$$\langle \vec{b}_L, \vec{h}_L \rangle, \quad \langle \vec{b}_L, \vec{h}_R \rangle, \quad \langle \vec{b}_R, \vec{h}_L \rangle, \quad \langle \vec{b}_R, \vec{h}_R \rangle,$$
$$\langle \vec{a}_L, \vec{b}_L \rangle u, \quad \langle \vec{a}_L, \vec{b}_R \rangle u, \quad \langle \vec{a}_R, \vec{b}_L \rangle u, \quad \langle \vec{a}_R, \vec{b}_R \rangle u.$$

Then the verifier would send the four challenges $x$, $y$, $x'$, $y'$, and they would repeat the protocol on the vectors $\vec{a}'$ and $\vec{b}'$ of half the length.

It turns out we can make the protocol much more efficient by taking

$$\vec{a}' = x \vec{a}_L + x^{-1} \vec{a}_R, \qquad \vec{b}' = x^{-1} \vec{b}_L + x \vec{b}_R,$$

and

$$\vec{g}' = x^{-1} \vec{g}_L + x \vec{g}_R, \qquad \vec{h}' = x \vec{h}_L + x^{-1} \vec{h}_R.$$

Again, this leads to a ton of cancellations. It turns out that instead of the 12 cross-terms shown, the prover can get away with sending just two group elements (the $x^2$- and $x^{-2}$-terms when $C'$ is expanded in powers of $x$).
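As a sanity check on these cancellations, here is one folding round of the inner product argument in the same toy additive model (the names `L` and `R` for the two cross-term commitments are illustrative, and the toy group carries no security):

```python
import random

q = 2**61 - 1  # prime; group modeled additively as Z_q (toy only)
rng = random.Random(4)
n = 8
g = [rng.randrange(1, q) for _ in range(n)]
h = [rng.randrange(1, q) for _ in range(n)]
u = rng.randrange(1, q)
a = [rng.randrange(q) for _ in range(n)]
b = [rng.randrange(q) for _ in range(n)]

def ip(xs, ys):
    return sum(xi * yi for xi, yi in zip(xs, ys)) % q

m = n // 2
aL, aR, bL, bR = a[:m], a[m:], b[:m], b[m:]
gL, gR, hL, hR = g[:m], g[m:], h[:m], h[m:]

C = (ip(a, g) + ip(b, h) + ip(a, b) * u) % q

# The only two group elements the prover actually has to send.
L = (ip(aL, gR) + ip(bR, hL) + ip(aL, bR) * u) % q
R = (ip(aR, gL) + ip(bL, hR) + ip(aR, bL) * u) % q

x = rng.randrange(1, q)
xi = pow(x, -1, q)  # x^{-1} mod q

# Fold: a' = x*aL + x^{-1}*aR, b' = x^{-1}*bL + x*bR,
#       g' = x^{-1}*gL + x*gR, h' = x*hL + x^{-1}*hR.
a2 = [(x * l + xi * r) % q for l, r in zip(aL, aR)]
b2 = [(xi * l + x * r) % q for l, r in zip(bL, bR)]
g2 = [(xi * l + x * r) % q for l, r in zip(gL, gR)]
h2 = [(x * l + xi * r) % q for l, r in zip(hL, hR)]

C2 = (ip(a2, g2) + ip(b2, h2) + ip(a2, b2) * u) % q
assert C2 == (C + x * x * L + xi * xi * R) % q  # the cancellations in action
```

All 12 cross-terms collapse into the two elements $L$ and $R$, and iterating halves the problem each round, exactly as in the one-vector case.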

Of course, one still has to show that this protocol is secure against cheating provers. This is done by the same sort of calculation we did above – if you want to see the details, check out Dankrad’s post.