The sum-check protocol
(Ingredient 1 of 5 in the PCP overview.)
Before worrying about systems of equations, let’s imagine we just have a single sum
for some values and , and assume further that is not too small.
Imagine a prover Peggy knows all the values , and is asserting to the verifier Victor that the ’s sum to . Victor wants to know that Peggy computed the sum correctly, but Victor doesn’t want to actually read all the values of .
Well, at face value, this is an obviously impossible task. Even if Victor knew all but one of Peggy’s ’s, that wouldn’t be good enough.
So to get anywhere, we need to give Victor at least one magic power.
An oracle to a multilinear polynomial
Assume for convenience that the number of ’s happens to be and change notation to a function , so our equation becomes
In other words, we have changed notation so that our variables are indexed over a hypercube: from to .
Now here’s the magic power we’re granting. By polynomial interpolation, no matter what function we had initially, we can view it as multilinear polynomial . For example, suppose and the eight (arbitrary) variable values were given
(So .) Then we’d be trying to fill in the blanks in the equation
so that agrees with on the cube. This comes down to solving a system of linear equations; in this case it turns out that works, and I’ve cherry-picked the numbers so a lot of the coefficients work out to for convenience, but math majors should be able to verify that exists and is unique no matter what eight initial numbers I would have picked (by induction on ).
Now here’s the magic power: we are going to let Victor make one call to a magic oracle that can tell Victor the value of , for his choice of . Note importantly that the ’s do not have to be /, in fact we will say Victor just chooses them randomly from the much larger . But he can only ask the oracle for that single value of , and otherwise has no idea what any of the ’s are. The punch line of the protocol is that this single oracle call is good enough. If Victor has this oracle, he only needs to read one value for Peggy to convince him that was computed correctly.
A playthrough of the sum-check protocol
Let’s use the example above with : Peggy has chosen those eight values with , and wants to convince Victor without actually sending all eight values. Peggy has done her homework and computed the coefficients of as well (after all, she chose the values of ), so Peggy can evaluate anywhere she wants. But Victor can only ask the oracle about a single value of the polynomial on a point (probably) outside the hypercube.
Here’s how they do it. (All the information sent by Peggy to Victor is boxed.)
-
Peggy announces her claim .
-
They now discuss the first coordinate:
-
Victor asks Peggy to evaluate the linear one-variable polynomial
and send the result. In our example, it equals
- Victor then checks that this is consistent with the claim ; it should satisfy by definition. Indeed, .
- Finally, Victor commits to a random choice of ; let’s say . From now on, he’ll always use for the first argument to .
-
-
With the first coordinate fixed at , they talk about the second coordinate:
-
Victor asks Peggy to evaluate the linear polynomial
and send the result. In our example, it equals
- Victor makes sure the claimed is consistent with ; it should satisfy . Indeed, it does: .
- Finally, Victor commits to a random choice of ; let’s say . From now on, he’ll always use for the second argument to .
-
-
They now settle the last coordinate:
-
Victor asks Peggy to evaluate the linear polynomial
and send the result. In our example, it equals
- Victor makes sure the claimed is consistent with ; it should satisfy . Indeed, it does .
- Finally, Victor commits to a random choice of ; let’s say .
-
-
Victor has picked all three coordinates, and is ready to consult the oracle. He gets . This matches , and the protocol ends.
General procedure
The previous transcript should generalize obviously to any , but we spell it out anyways. Peggy has already announced and pre-computed . Now for ,
- Victor asks Peggy to compute the univariate polynomial corresponding to partial sum, where the th parameter is a free parameter while all the , \dots, have been fixed already.
- Victor sanity-checks each of Peggy’s answers by making sure is consistent with (that is, , or for the edge case that ).
- Then Victor chooses a random and moves on to the next coordinate.
Once Victor has decided on every , he asks the oracle for and makes sure that it matches the value of . If so, Victor believes Peggy.
Up until now, we wrote the sum-check protocol as a sum of a multilinear polynomial over .
There is nothing special about multilinear polynomials. It would work equally well to work with a polynomial of some small degree in each variable. The only change is that Peggy’s claimed polynomials would have degree rather than degree . Everything else stays the same. This will be important for some applications of the sum check protocol.
In fact, there is also nothing special about and it would work equally well with for any small finite set ; then instead of the multilinear extension, you would work with polynomials that have degree in each variable.
Soundness
Why is the sum-check protocol sound? In other words, what goes wrong if Peggy tries to convince Victor of a false claim?
Let’s suppose Peggy claims a false value for the whole sum . When it comes to the first step , if she reveals the true sum (as a polynomial in ), the consistency check won’t pass – and Victor won’t accept. So Peggy is forced to give a false polynomial as well.
Now and are two different linear polynomials, so there is at most one value of where . If Peggy is very lucky, Victor’s random choice of will happen to be that one value – and Peggy can continue honestly from here on out.
Most likely, that won’t happen. The function will have the wrong value at Victor’s random , so Peggy is still stuck trying to prove a false claim, now about a sum over variables.
Now we come to , and Peggy has the same dilemma. She can’t give the true , because the check wouldn’t pass. So she has to give a false . Again, unless Victor’s choice of is very lucky (from Peggy’s point of view), the value of will again be wrong.
After steps, Victor will consult the multilinear oracle, and discover that , and the check will fail – unless, by some very bad luck, he happened to randomly choose the one bad value for some .
There are possible values for each , so at any step, Victor’s chance of choosing the bad value is at most . Summing it up, since Victor makes choices, the chance that a cheating Peggy can sneak a false proof past Victor is at most .
Assuming is very large, this probability is very small. So if the protocol passes, Victor can be confident that Peggy’s calculation was honest.