Proposal: Attributable Consensus Solution for DV Clusters – Appendix

Albert Garreta, Ignacio Manzur, and Aikaterini Panagiota-Stouka on behalf of Nethermind Research.

Introduction and overview

This is the last post in our series on Attributable Consensus (see Post 1, Post 2, and Post 3). This write-up should be taken as a collection of appendices to the solutions presented previously. In particular, we will:

  1. Go into full detail on verifying the realness of some duties claimed by operators.
  2. Provide recommendations as to which consensus protocols and signature schemes to use.
  3. Explain an extension to our whistleblower mechanism that allows more than one whistleblower to submit complaints for the same cluster at the end of a c-epoch. This extension addresses some edge-case scenarios in the method presented in Post 3.

Verifying the reality of duties

Here, we propose an approach that combines verifying proofs that duties were performed (for block proposal and sync committee duties) with accepting the theoretical maximum number of attestations (#validators x #epochs) without verification. A whistleblower may complain about the reality of these duties, and the cluster is then expected to provide a proof.

Getting the beacon block root to the L2 SC

The first step to prove statements about specific duties is to get the relevant block header or beacon block root to the L2 SC. Assuming EIP-4788, the L1 SC could obtain the appropriate beacon block root and send it to L2 via L1-L2 messaging.
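For illustration, here is a minimal off-chain sketch of this first step (assuming web3.py and an L1 RPC endpoint; the function name is ours). The EIP-4788 contract takes a 32-byte big-endian timestamp as calldata and returns the parent beacon block root for the execution block with that timestamp, reverting if the timestamp falls outside its ring buffer; the L1 SC would perform the same call on-chain.

from web3 import Web3

BEACON_ROOTS_ADDRESS = "0x000F3df6D732807Ef1319fB7B8bB8522d0Beac02"  # EIP-4788

def get_beacon_root(w3: Web3, block_timestamp: int) -> bytes:
    # The contract's "get" path: calldata is the 32-byte big-endian timestamp.
    calldata = block_timestamp.to_bytes(32, "big")
    return bytes(w3.eth.call({
        "to": BEACON_ROOTS_ADDRESS,
        "data": Web3.to_hex(calldata),
    }))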

There is a subtlety: the EIP only exposes the roots of blocks in the last 8191 slots (~27h) to the EVM. So, if we want the beacon root of an older block, we need to perform an additional step. Every beacon block contains the state root, which is the hash tree root of the beacon chain state container. The state container is as follows:

class BeaconState(Container):
    # Versioning
    genesis_time: uint64
    genesis_validators_root: Root
    slot: Slot
    fork: Fork
    # History
    latest_block_header: BeaconBlockHeader
    block_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    state_roots: Vector[Root, SLOTS_PER_HISTORICAL_ROOT]
    historical_roots: List[Root, HISTORICAL_ROOTS_LIMIT]
    # Eth1
    eth1_data: Eth1Data
    eth1_data_votes: List[Eth1Data, EPOCHS_PER_ETH1_VOTING_PERIOD * SLOTS_PER_EPOCH]
    eth1_deposit_index: uint64
    # Registry
    validators: List[Validator, VALIDATOR_REGISTRY_LIMIT]
    balances: List[Gwei, VALIDATOR_REGISTRY_LIMIT]
    # Randomness
    randao_mixes: Vector[Bytes32, EPOCHS_PER_HISTORICAL_VECTOR]
    # Slashings
    slashings: Vector[Gwei, EPOCHS_PER_SLASHINGS_VECTOR]  # Per-epoch sums of slashed effective balances
    # Attestations
    previous_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    current_epoch_attestations: List[PendingAttestation, MAX_ATTESTATIONS * SLOTS_PER_EPOCH]
    # Finality
    justification_bits: Bitvector[JUSTIFICATION_BITS_LENGTH]  # Bit set for every recent justified epoch
    previous_justified_checkpoint: Checkpoint  # Previous epoch snapshot
    current_justified_checkpoint: Checkpoint
    finalized_checkpoint: Checkpoint

It is the historical_roots field that is relevant to this problem. As mentioned in the eth2 book, this field “allows Merkle proofs to be constructed about anything in the beacon chain’s history if required.” This is because each time the block_roots and state_roots vectors are full (which happens every 8192 slots, i.e., roughly every 27h), they are Merkleized together into a HistoricalBatch container, whose root is appended to the historical_roots field. So, to get a beacon block root r in the EVM, we have three possibilities:

  1. The block happened in the last 8191 slots. In this case, we can use the EIP directly.
  2. The block happened more than 8191 slots ago, but at most roughly 16383 slots ago. In this case, r is still in the block_roots vector (which covers 8192 slots) of a recent state. We pull the oldest block root r’ accessible via the EIP (one that happened ~27 hours ago). Then, we prove that there is a Merkle path from r inside block_roots up to r’. We must also provide the SC with the remaining data needed to verify the Merkle proof, namely the sibling hashes of the containers in BeaconState and BeaconBlock along the path.
  3. The block happened more than roughly 16383 slots ago. We pull the oldest block root r’ accessible via the EIP (one that occurred ~27 hours ago). Then we use a Merkle membership proof showing that r is in a Merkle tree whose root is an entry of historical_roots, and (continuing the same membership proof) that this historical_roots entry is part of the Merkle tree with root r’. Again, we must provide the SC with the sibling hashes of the containers in BeaconState and BeaconBlock needed to verify the proof (a sketch of the proof verification follows this list).
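All three cases reduce to verifying Merkle branches against a known root. As an illustration, here is a minimal Python sketch of SSZ-style Merkle proof verification using generalized indices (the helper names are ours):

from hashlib import sha256

def hash_pair(left: bytes, right: bytes) -> bytes:
    # SSZ Merkleization hashes the concatenation of the two children with SHA-256.
    return sha256(left + right).digest()

def verify_merkle_proof(leaf: bytes, branch: list, gindex: int, root: bytes) -> bool:
    """Walk from `leaf` up to `root`; the bits of the generalized index
    `gindex` (least significant first) say whether the current node is a
    right child, which fixes the order of each hash."""
    node = leaf
    for sibling in branch:
        if gindex & 1:  # current node is a right child
            node = hash_pair(sibling, node)
        else:           # current node is a left child
            node = hash_pair(node, sibling)
        gindex >>= 1
    return gindex == 1 and node == root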

With the relevant beacon block root in the L2 smart contract, it remains to use Merkle-type proofs to show that the duty data is committed to by the beacon block.

Block proposal

The slot where the block proposal was supposed to be performed is in the optimistic participation lists. Once we know the slot, we can get the relevant block root to the L2 SC.

The block proposer information can be found in the proposer_index: ValidatorIndex field of BeaconBlock.

Note that this is the validator’s index, not its public key. If the L2 SC keeps a record of the indices of the cluster’s validators, as they appear in the beacon state validator registry, we are done. Otherwise, we need to fetch the validators’ public keys from the validator registry. Obol could, when deploying the L2 SC associated with a cluster, hardcode the indices of all validators controlled by the cluster and update this information when necessary.

Suppose that the L2 SC has a record of the indices of all the validators of the cluster. A whistleblower who wants to claim that there is a problem with the block proposal at a specific slot j_{2,\ell} of a list L starting at block s should proceed as follows:

  • The whistleblower gets the block root r for slot j_{2,\ell} into the L2 SC.
  • The cluster submits the relevant list L such that L.start=s.
  • The L2 SC parses L and identifies the slot j_{2, \ell}. If it cannot be found, the whistleblower loses the claim.
  • The cluster has time to submit a Merkle proof (against the root r) certifying that proposer_index is idx. If it fails to do so, the whistleblower wins the claim.
  • The L2 SC checks the Merkle proof and checks that idx is the index of some validator controlled by the cluster. If any of the checks fail, the whistleblower wins the claim; a sketch of these checks follows the list.
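For illustration, a minimal Python sketch of the last two checks. The validator indices and the generalized index of proposer_index are hypothetical placeholders (the latter depends on the fork’s container layout), and verify_merkle_proof is the helper sketched earlier.

# Hypothetical set of validator indices hardcoded at deployment (example values).
CLUSTER_VALIDATOR_INDICES = {123456, 123457}

def check_proposal_claim(block_root: bytes, idx: int,
                         branch: list, gindex_proposer_index: int) -> bool:
    # SSZ leaf for a uint64: little-endian bytes, right-padded to 32 bytes.
    leaf = idx.to_bytes(8, "little") + b"\x00" * 24
    if not verify_merkle_proof(leaf, branch, gindex_proposer_index, block_root):
        return False  # invalid proof: the whistleblower wins the claim
    # idx must belong to a validator controlled by the cluster.
    return idx in CLUSTER_VALIDATOR_INDICES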

Sync committee duties

The BeaconBlockBody has the container SyncAggregate, which consists of the following:

class SyncAggregate(Container):
    sync_committee_bits: Bitvector[SYNC_COMMITTEE_SIZE]
    sync_committee_signature: BLSSignature

We need to know the current sync committee to determine which index (or indices) correspond to the cluster’s validator(s) in the committee. We can use methods similar to the sync committee rotation proofs in zk light clients.

This is done by first agreeing on a trusted sync committee (hardcoded in the zk light client contract), which starts its duties at some block. To determine the next sync committee, the current sync committee signs a message containing the public keys of the validators in the next committee. The operator should prove, successively, that every sync rotation message up to the relevant block was signed by a supermajority of the corresponding sync committee. This establishes what the current sync committee is. Once the rotation to a new sync committee has been proved, the contract can update the last proven committee. The next prover can use this committee as the first committee in its rotation proof, and so on.
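A high-level sketch of this rotation argument (all helper names and message fields are ours; a real zk light client would enforce these checks inside a circuit rather than in plain Python):

def committee_at_target(trusted_committee, rotation_msgs):
    """Starting from a trusted committee, replay one handover message per
    sync committee period until reaching the committee for the target block."""
    committee = trusted_committee
    for msg in rotation_msgs:
        signers = [pk for i, pk in enumerate(committee) if msg.bits[i]]
        # A supermajority (> 2/3) of the current committee must have signed.
        assert 3 * len(signers) >= 2 * len(committee)
        # Hypothetical BLS check of the aggregate signature of the signers.
        assert bls_fast_aggregate_verify(signers, msg.signing_root, msg.signature)
        committee = msg.next_committee_pubkeys
    return committee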

Once we know the sync committee for the relevant block, we can check, with a Merkle proof, that the bits corresponding to the cluster’s validator indices are set to 1. Say the indices are i_1, \dots, i_s; then the cluster should provide a proof that:

\mathsf{sync\_committee\_bits}=[w_1,\underset{(i_1)}{1}, w_2, \underset{(i_2)}{1}, w_3, \dots, w_s,\underset{(i_s)}{1}, w_{s+1}]

where the w_i are bit vectors of the correct lengths for the indices (potentially empty), using a Merkle proof against the beacon block root.
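Concretely, once sync_committee_bits is available as a Merkle leaf, the bit check itself is simple. A sketch (the function name is ours; SSZ packs Bitvector bits least-significant-bit first within each byte):

def cluster_bits_set(sync_committee_bits: bytes, indices: list) -> bool:
    # Bit i of an SSZ Bitvector lives in byte i // 8 at bit position i % 8.
    return all((sync_committee_bits[i // 8] >> (i % 8)) & 1 for i in indices)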

The verification procedure is almost the same as for the block proposal.

Attestations

As explained above, we do not verify attestation duties individually. Instead, we accept the theoretical maximum number of attestations:

\#\text{validators}\times \#\text{epochs}

The number of epochs is computed as:

\lceil\frac{t_{L2}}{t_e}\rceil

where t_{L2} is the time that elapsed during the optimistic participation proofs’ period of activity for the cluster, measured in L2 blocks (if the L2 block time is variable, we should use an upper bound; notably, Starknet is moving towards a constant block time, and Arbitrum’s block time is consistently around 0.26±0.03 seconds), and t_e is the length of an Ethereum epoch (384 seconds).
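For concreteness, a small Python sketch combining the two formulas above (the names are ours):

import math

ETH_EPOCH_SECONDS = 384  # 32 slots x 12 seconds

def attestation_cap(num_validators: int, t_l2_seconds: float) -> int:
    """Maximum number of attestations credited without verification:
    #validators x ceil(t_L2 / t_e)."""
    return num_validators * math.ceil(t_l2_seconds / ETH_EPOCH_SECONDS)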

Consensus protocols and threshold signatures

Consensus protocol

We recommend using HotStuff as the consensus protocol both for the fast track and for creating optimistic Participation Proofs. Using HotStuff has several benefits:

  1. It is a well-studied BFT state machine replication protocol; see, e.g., https://dl.acm.org/doi/pdf/10.1145/3584684.3597268 for a recent overview of HotStuff.
  2. It can attain strong forensic support, as explained in BFT Protocol Forensics (forensic support refers to identifying operators whose signatures caused a fork in the consensus).
  3. While executing the protocol, the leader in a view is tasked with aggregating other operators’ signatures into a threshold signature. This fits nicely with our optimistic Participation Proof solution, where clusters need to produce a threshold signature on (the hash of) their optimistic Participation Proofs.

Remark: An alternative to HotStuff is the recent HotStuff-2 protocol, which is essentially HotStuff with one less round of interaction and is consequently more efficient. A downside is that, due to its novelty, HotStuff-2 is less battle-tested, and there are few reliable implementations of it, if any.

Usage specifics

The cluster uses HotStuff for two different purposes:

  • Fast path: The cluster wants to reach a DV consensus on what the cluster validators should sign, i.e., HotStuff plays the same role as the QBFT protocol in the current Obol design.
      • The HotStuff instance used for this purpose runs continuously. It should be as light as possible to favor low latency.
  • Creation of optimistic Participation Proofs: A second instance of HotStuff is run at the end of a c-slot to agree on the hash of an optimistic participation proof Hash_{L2}(L,\mathcal{S}) (cf. here).
      • There is no need for high speed in this case. The cluster has \Delta_{update} time (measured in L2 blocks) to agree on such a proof and submit it to the L2 SC. The parameter \Delta_{update} does not need to be particularly small; setting it to last about 0.5-1h is more than reasonable.
      • Once an operator has confirmation that Hash_{L2}(L,\mathcal{S}) has been agreed upon by a majority of operators, it engages with the rest of the cluster in a Threshold Signature Scheme. The goal of this step is to produce a threshold signature TS on Hash_{L2}(L,\mathcal{S}); a sketch of this flow follows the list.
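The following Python sketch illustrates an operator’s end-of-c-slot flow. All interfaces (hotstuff_pp, tss, l2_sc, and the helper names) are hypothetical placeholders for the consensus, threshold-signature, and L2 SC components, not a definitive design.

def end_of_c_slot(node, L, S):
    # Hash_{L2}(L, S): the optimistic participation proof hash to agree on.
    h = node.hash_l2(L, S)
    # Second HotStuff instance: agree on the proof hash within Delta_update.
    agreed_hash = node.hotstuff_pp.commit(h)
    # Produce and exchange threshold-signature shares on the agreed hash.
    share = node.tss.sign_share(agreed_hash)
    shares = node.broadcast_and_collect(share)
    # Any operator holding enough shares can aggregate and submit to the L2 SC.
    if len(shares) >= node.tss.threshold:
        ts = node.tss.aggregate(shares)
        node.l2_sc.submit(agreed_hash, ts)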

Threshold signatures

Threshold signing of Hash_{L2}(L,\mathcal{S})

We recommend using a Schnorr-based threshold signature scheme for signing Hash_{L2}(L,\mathcal{S}) in the last step above. This choice is motivated by the fact that the scheme does not require computing pairings on the L2, which is currently prohibitively expensive on zk-rollups.

Threshold signing of blob data B[C]

To threshold-sign the data B[C], the cluster uses an aggregate BLS signature together with a bitfield describing whose signatures were aggregated. This choice is motivated by the fact that the signature is verified on Ethereum (only if a whistleblower complains). Using a BLS-based signature allows the use of precompiles, making the cost of verifying the threshold signature around 150k-200k gas.

As with Hash_{L2}(L,\mathcal{S}), the cluster first runs the HotStuff protocol to agree on the exact blob data B[C] to be signed.
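For illustration, here is an off-chain analogue of the bitfield-plus-aggregate check using py_ecc (the function name and calling convention are ours; on Ethereum the same verification would rely on the pairing precompiles):

from py_ecc.bls import G2ProofOfPossession as bls

def verify_blob_signature(operator_pubkeys: list, bitfield: int,
                          message: bytes, aggregate_signature: bytes) -> bool:
    # Select the public keys of the operators whose bit is set in the bitfield.
    signers = [pk for i, pk in enumerate(operator_pubkeys) if (bitfield >> i) & 1]
    # Verify the aggregate signature of exactly those signers on the message.
    return bls.FastAggregateVerify(signers, message, aggregate_signature)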

Multiple whistleblower claims

In Post 3, we described a penalty mechanism for cluster nodes caught by a whistleblower. The mechanism is carefully designed so that the cluster nodes have incentives to report their tasks/points truthfully. To keep the construction as simple as possible, we allowed only one whistleblower per accusation period to make a report (the first one to raise a correct alarm). Here, we discuss what advantages (if any) we could gain by allowing multiple whistleblowers to raise an alarm.

Assume (i) that a cluster node who submitted incorrect tasks/points and later decides to also pose as the whistleblower cannot be sure that its transaction will be included in the blockchain before any other whistleblower’s transaction; and (ii) that a whistleblower who decides to raise a truthful alarm reports all the inconsistencies. Then, allowing multiple whistleblower claims does not seem to offer any advantage. Hence, we can ignore any whistleblower who raises an alarm after another whistleblower has already correctly done so.

An advantage could appear if we remove the above assumptions. In more detail, if a whistleblower submits a correct alarm but does not report all the inconsistencies, a second whistleblower could report the remaining ones. In that case, the cluster points would also be updated to account for the data submitted by the last whistleblower. This means that even if a malicious cluster node submits incorrect tasks/points and later becomes the first whistleblower, reporting only a subset of the inconsistencies, a subsequent whistleblower can still submit a new alarm mentioning the unreported inconsistencies.

Note that, although allowing for more whistleblowers can help capture the above corner cases, these cases can also be prevented simply by making f_{Obol} (the fraction of the cluster rewards that is removed from the cluster and kept by Obol when a whistleblower raises a correct alarm) very high, while still leaving enough rewards for the whistleblower. This works because even if the first whistleblower does not report all the inconsistencies (so the cluster points will not be completely accurate), if f_{Obol} is high enough, the cluster’s reward loss will exceed the potential gain the cluster nodes obtain from the unreported inconsistencies in points.
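In symbols (R and G are our notation, not from the earlier posts): if R denotes the cluster rewards at stake in the c-epoch and G is an upper bound on the extra rewards obtainable through unreported inconsistencies, it suffices to choose f_{Obol} so that

f_{Obol}\cdot R > G

while reserving enough of the removed fraction to reward the whistleblower.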
