Coding for Ethereum block propagation (proposed by Arantxa)
Can you improve the efficiency of this sharding protocol that was proposed for the Ethereum blockchain? Specifically, can you decrease the amount of data that must be sent for coefficients and commitments?
Here is a summary of the importart parts of the protocol.
A “proposer” has a message (an Ethereum block) of size ~100 KB; it wants to send the message in distributed fashion over a network to ~50,000 nodes. To do this:
- The message is split into ten equal-sized “chunks” , each of which is interpreted as a vector over . (Here , so each chunk is a vector of length ~300.)
The proposer sends messages (“shards” of the block) to some nodes as follows:
- For each node, the proposer chooses ten random “coefficients” in : .
- The proposer sends:
- The linear combination
- The coefficients
- Ten Pedersen commitments to the vectors , and
- A signature on the concatenation of the ten Pedersen commitments.
Note that the full block can be recovered (with very high probability) from any 10 such shards.
Each node waits until it has received some number of shards, (from the proposer or other nodes). (Maybe .) Then:
- The node chooses coefficients .
- The node computes the “linear combination of shards” , and sends it to other nodes. This “linear combination” is a package of:
- The combined vector – here
- The combined coefficients
- The same ten Pedersen commitments to the vectors , and
- A signature on the concatenation of the ten Pedersen commitments.
Why this protocol?
Two questions for Arantxa:
Why shard at all, rather then sending the full block?
Why not make the node wait for messages to come in?
Why use random linear combinations, rather than just sending each node some randomly chosen chunk ? With linear combinations, any node that receives ten random “shards” will be able to recover the full block (with high probability).
Why send commitments to the vectors ? This is an integrity mechanism: since Pedersen commitments are linear, a node can verify the correctness of any purported linear combination of the ’s. If a node does not verify correctness, errors will propagate across the network.
Things we want: decrease the communication cost.
-
The coefficients require 10 field elements, or about 320 bytes. Can you decrease this?
-
The commitments to the vectors require 10 Pedersen commitments, or again about 320 bytes. Can you decrease this?