Keccak Channel & Proof Serialization

Custom Fiat-Shamir channel design and deterministic proof encoding between Rust and Solidity

Overview

The Fiat-Shamir channel transforms an interactive proof protocol into a non-interactive one by deriving all verifier challenges from the proof transcript. This page covers:

The design rationale for replacing Stwo's Blake2s with Keccak-256
The Rust-side KeccakMerkleChannel implementation
The Solidity-side KeccakChannel library
The deterministic proof serialization format
Transcript replay and consistency verification

1. Why Keccak-256?

The Blake2s Problem

Stwo uses Blake2s-256 for both Merkle hashing and Fiat-Shamir. Ethereum's precompile situation:

Hash	EVM Support	Gas (approx)
Keccak-256	Native opcode	~30 per 32 bytes
SHA-256	Precompile (0x02)	~60 + 12/word
Blake2b	Precompile (0x09, EIP-152)	~12 per round
Blake2s	None	~3,000–5,000 pure Solidity

Blake2s and Blake2b are different algorithms with different round functions and word sizes (32-bit vs 64-bit). The EIP-152 precompile gives Blake2b rounds, which cannot be used for Blake2s.

A pure-Solidity Blake2s would process 64 bytes per compression at ~3,000+ gas. Our verifier performs hundreds of Merkle hash operations across FRI layers, so this would add millions of gas.

The Solution

Stwo's Channel and MerkleChannel traits are generic over the hash function. We implemented a custom KeccakMerkleChannel in Rust that uses Keccak-256 (SHA-3 family) for all hash operations.

The Solidity verifier then uses the native keccak256 opcode — the same hash function at ~100× lower cost.

The channel is a Fiat-Shamir transform: it converts interactive oracle proofs to non-interactive proofs by replacing verifier randomness with hash outputs. The security reduction requires only that the hash function is collision-resistant and behaves as a random oracle.

Keccak-256 provides 128-bit collision resistance (same as Blake2s-256's 128-bit collision resistance). Replacing Blake2s with Keccak-256 does not weaken the STARK's soundness.

2. Rust Implementation: KeccakMerkleChannel

The custom channel (keccak_channel.rs, ~470 lines) implements Stwo's traits:

Core Types

/// 32-byte Keccak hash output
#[repr(C, align(32))]
pub struct KeccakHash(pub [u8; 32]);

/// Keccak-256 Merkle hasher
pub struct KeccakHasher;

/// Keccak-256 Fiat-Shamir channel
pub struct KeccakChannel {
    digest: KeccakHash,
    channel_time: ChannelTime,
}

/// MerkleChannel adapter
pub struct KeccakMerkleChannel;

Channel Trait Implementation

impl Channel for KeccakChannel {
    type Digest = KeccakHash;

    fn mix_digest(&mut self, digest: Self::Digest) {
        let mut hasher = sha3::Keccak256::new();
        hasher.update(&self.digest.0);
        hasher.update(&digest.0);
        self.digest = KeccakHash(hasher.finalize().into());
        self.channel_time.inc_sent();
    }

    fn mix_u32s(&mut self, data: &[u32]) {
        let mut hasher = sha3::Keccak256::new();
        hasher.update(&self.digest.0);
        for &val in data {
            hasher.update(&val.to_le_bytes());
        }
        self.digest = KeccakHash(hasher.finalize().into());
        self.channel_time.inc_sent();
    }

    fn draw_random_bytes(&mut self) -> [u8; 32] {
        let mut hasher = sha3::Keccak256::new();
        hasher.update(&self.digest.0);
        hasher.update(&(self.channel_time.n_draws as u32).to_le_bytes());
        hasher.update(&[0u8]);
        self.channel_time.inc_draw();
        hasher.finalize().into()
    }
}

Merkle Hasher Implementation

The Merkle hasher follows Stwo's column-major leaf hashing:

impl MerkleHasherLifted<CpuBackend> for KeccakHasher {
    fn hash_node(children_hashes: (&KeccakHash, &KeccakHash)) -> KeccakHash {
        let mut hasher = sha3::Keccak256::new();
        hasher.update(&children_hashes.0 .0);
        hasher.update(&children_hashes.1 .0);
        KeccakHash(hasher.finalize().into())
    }
}

Leaf hashing encodes each M31 value as 4 little-endian bytes, concatenated across columns:

leaf = keccak256(col0_LE4 ‖ col1_LE4 ‖ ... ‖ colN_LE4)

Proof of Work

The PoW implementation matches Stwo's GrindOps:

fn grind(&self, channel: &KeccakChannel, pow_bits: u32) -> u64 {
    let seed = Self::get_grinding_seed(channel, pow_bits);
    (0u64..)
        .find(|&nonce| {
            let hash = Self::hash_with_nonce(&seed, nonce);
            Self::trailing_zeros(&hash) >= pow_bits
        })
        .unwrap()
}

The seed is keccak256(POW_PREFIX_LE ‖ [0; 12] ‖ digest ‖ pow_bits_LE) where POW_PREFIX = 0x12345678.

3. Solidity Implementation: KeccakChannel

The Solidity channel (KeccakChannel.sol, 213 lines) must reproduce the Rust behavior exactly.

State

struct State {
    bytes32 digest;   // 32-byte Keccak state
    uint32 nDraws;    // draw counter (reset on mix)
}

Little-Endian Encoding

All integers are encoded as little-endian bytes. Solidity is big-endian natively, so we need explicit byte swapping:

function _toLEBytes4(uint32 val) private pure returns (bytes4) {
    return bytes4(uint32(
        ((val & 0xFF) << 24) |
        (((val >> 8) & 0xFF) << 16) |
        (((val >> 16) & 0xFF) << 8) |
        ((val >> 24) & 0xFF)
    ));
}

Mix Operations

mixRoot: Concatenates digest and root, hashes with keccak256.

function mixRoot(State memory state, bytes32 root) internal pure {
    state.digest = keccak256(abi.encodePacked(state.digest, root));
    state.nDraws = 0;
}

mixFeltsFlat: Used for sampled OOD values. Each M31 is encoded as 4 LE bytes:

function mixFeltsFlat(State memory state, uint32[] memory m31Values) internal pure {
    bytes memory payload = abi.encodePacked(state.digest);
    for (uint256 i = 0; i < m31Values.length; i++) {
        payload = abi.encodePacked(payload, _toLEBytes4(m31Values[i]));
    }
    state.digest = keccak256(payload);
    state.nDraws = 0;
}

Draw Operations

drawU32s: Produces 8 pseudorandom uint32 values per call:

function drawU32s(State memory state) internal pure returns (uint32[8] memory result) {
    bytes32 hash = keccak256(abi.encodePacked(
        state.digest,
        _toLEBytes4(state.nDraws),
        bytes1(0x00)
    ));
    state.nDraws += 1;

    // Parse 32 bytes as 8 little-endian u32s
    for (uint256 i = 0; i < 8; i++) {
        uint256 offset = i * 4;
        result[i] = uint32(
            uint256(uint8(hash[offset])) |
            (uint256(uint8(hash[offset + 1])) << 8) |
            (uint256(uint8(hash[offset + 2])) << 16) |
            (uint256(uint8(hash[offset + 3])) << 24)
        );
    }
}

drawSecureFelt: Rejection sampling for QM31 challenges:

function drawSecureFelt(State memory state) internal pure returns (uint128) {
    while (true) {
        uint32[8] memory u32s = drawU32s(state);
        bool valid = true;
        for (uint256 i = 0; i < 8; i++) {
            if (uint256(u32s[i]) >= 2 * P) {
                valid = false;
                break;
            }
        }
        if (!valid) continue;

        return qm31FromM31(
            _reduceToM31(u32s[0]),
            _reduceToM31(u32s[1]),
            _reduceToM31(u32s[2]),
            _reduceToM31(u32s[3])
        );
    }
}

The rejection threshold is $2p = 2^{32} - 2$ . Values in $[0, 2p)$ are reduced by subtracting $p$ if $\geq p$ . The retry probability per draw is $\sim 2/2^{32}$ per element, or $\sim 8 \times 2^{-31} \approx 2^{-28}$ per batch of 8. In practice, retries almost never occur.

Proof of Work Verification

function verifyPowNonce(State memory state, uint32 nBits, uint64 nonce)
    internal pure returns (bool)
{
    // Compute seed: H(POW_PREFIX_LE, [0;12], digest, nBits_LE)
    bytes32 prefixedDigest = keccak256(abi.encodePacked(
        _toLEBytes4(0x12345678),
        bytes12(0),
        state.digest,
        _toLEBytes4(nBits)
    ));

    // Compute H(seed, nonce_LE)
    bytes32 res = keccak256(abi.encodePacked(
        prefixedDigest,
        _toLEBytes8(nonce)
    ));

    // Count trailing zeros in LE u128 interpretation
    uint128 val;
    for (uint256 i = 0; i < 16; i++) {
        val |= uint128(uint8(res[i])) << uint128(i * 8);
    }
    // ... count trailing zeros of val, check >= nBits
}

The trailing zeros check interprets the first 16 bytes of the hash as a little-endian u128 and counts trailing zero bits.

4. Proof Serialization Format

Design Principles

Deterministic: Same proof always produces identical bytes
Sequential: Read from start to end with no seeking
Little-endian: All integers encoded LE (matching Rust conventions)
No padding: Values packed tightly
Self-describing: Length prefixes for variable-length sections

Format Specification

┌─────────────────────────────────────────────────┐
│ COMMITMENTS                                      │
│   n_trees: u32                                   │
│   per tree: hash (32 bytes)                      │
├─────────────────────────────────────────────────┤
│ SAMPLED VALUES                                   │
│   n_trees: u32                                   │
│   per tree:                                      │
│     n_cols: u32                                  │
│     per col:                                     │
│       n_samples: u32                             │
│       per sample: qm31 (4 × m31_LE4 = 16 bytes) │
├─────────────────────────────────────────────────┤
│ DECOMMITMENTS                                    │
│   n_trees: u32                                   │
│   per tree:                                      │
│     n_hashes: u32                                │
│     per hash: 32 bytes                           │
├─────────────────────────────────────────────────┤
│ QUERIED VALUES                                   │
│   n_trees: u32                                   │
│   per tree:                                      │
│     n_cols: u32                                  │
│     per col:                                     │
│       n_vals: u32                                │
│       per val: m31 (4 bytes LE)                  │
├─────────────────────────────────────────────────┤
│ PROOF OF WORK                                    │
│   nonce: u64 (8 bytes LE)                        │
├─────────────────────────────────────────────────┤
│ FRI PROOF                                        │
│   First layer:                                   │
│     commitment: 32 bytes                         │
│     n_witness: u32                               │
│     per witness: qm31 (16 bytes)                 │
│     n_decomm: u32                                │
│     per decomm: 32 bytes                         │
│   Inner layers:                                  │
│     n_layers: u32                                │
│     per layer: same format as first layer        │
│   Last layer:                                    │
│     n_coeffs: u32                                │
│     per coeff: qm31 (16 bytes)                   │
└─────────────────────────────────────────────────┘

Serialization Code (Rust)

Key serialization functions:

fn write_qm31(buf: &mut Vec<u8>, val: SecureField) {
    let arr = val.to_m31_array();
    for m31 in arr {
        buf.extend_from_slice(&m31.0.to_le_bytes());
    }
}

fn write_hash(buf: &mut Vec<u8>, hash: &KeccakHash) {
    buf.extend_from_slice(&hash.0);
}

fn write_u32(buf: &mut Vec<u8>, val: u32) {
    buf.extend_from_slice(&val.to_le_bytes());
}

Parsing Code (Solidity)

The verifier reads the proof as a sequential byte stream:

struct ProofReader {
    bytes data;
    uint256 offset;
}

function _readU32(ProofReader memory r) internal pure returns (uint32) {
    uint32 val = uint32(uint8(r.data[r.offset]))
        | (uint32(uint8(r.data[r.offset + 1])) << 8)
        | (uint32(uint8(r.data[r.offset + 2])) << 16)
        | (uint32(uint8(r.data[r.offset + 3])) << 24);
    r.offset += 4;
    return val;
}

function _readQM31(ProofReader memory r) internal pure returns (uint128) {
    uint32 m0 = _readU32(r);
    uint32 m1 = _readU32(r);
    uint32 m2 = _readU32(r);
    uint32 m3 = _readU32(r);
    return M31Lib.qm31FromM31(m0, m1, m2, m3);
}

Proof Size Breakdown

For the withdrawal circuit (46 trace + 8 comp columns, 3 queries, 6 FRI layers). The transfer circuit uses 57 trace columns, increasing sampled values and queried values accordingly:

Section	Components	Bytes
Commitments	3 × 32 + headers	~100
Sampled values	48 trace × 16 + 8 comp × 16 + headers	~900
Decommitments	~25 hashes × 32 + headers	~820
Queried values	(46 + 8) × 3 × 4 + headers	~660
PoW nonce	8	8
FRI proof	7 commitments + witnesses + decomms	~2,500
Total		~4,808 bytes

5. Transcript Replay Correctness

The Solidity verifier must reproduce the exact same channel state as the Rust prover at every step. Any divergence means challenges (OOD point, FRI alphas, query positions) will differ, causing verification failure.

Verification Order

1. Initialize channel (digest = 0, nDraws = 0)
2. mixRoot(commit0)          // Preprocessed tree
3. mixRoot(commit1)          // Trace tree
4. drawSecureFelt()          // composition_random_coeff (consumed but unused)
5. mixRoot(commit2)          // Composition tree
6. drawSecureFelt()          // → oodsT (OOD point parameter)
7. mixFeltsFlat(sampled_values) // All OOD values as flat M31s
8. drawSecureFelt()          // → friRandomCoeff (α for quotients)
9. ... (FRI commitments: mixRoot + drawSecureFelt per layer) ...
10. verifyPowNonce(nonce)
11. mixU64(nonce)
12. drawU32s() × ⌈N_QUERIES / 8⌉  // → query positions

Testing Transcript Consistency

The Rust prover includes a debug test that outputs the channel digest at each step. These digest values are hardcoded in Solidity tests:

function test_channelAfterCommitments() public {
    // After mixing all 3 commitments + drawing composition coeff
    // Rust digest: 0xabcd...
    assertEq(channel.digest, 0xabcd...);
}

Any encoding mismatch (byte order, padding, field element order) causes immediate divergence visible in these tests.

6. Common Pitfalls

Endianness

Every integer must be little-endian. This is natural in Rust (to_le_bytes()) but requires explicit conversion in Solidity (which is big-endian). The _toLEBytes4 helper swaps byte order for every u32.

abi.encodePacked Ordering

Solidity's abi.encodePacked concatenates arguments in order. Ensure the concatenation order matches the Rust hasher.update() call order exactly.

nDraws Reset

Every mix operation resets nDraws to 0. The draw operations increment nDraws but do not modify the digest. This means consecutive draws produce different outputs (via the nDraws counter), while a mix followed by a draw always starts from nDraws = 0.

QM31 Element Order

A QM31 value $(m_0, m_1, m_2, m_3)$ is serialized as:

m0_LE4 ‖ m1_LE4 ‖ m2_LE4 ‖ m3_LE4

This matches Stwo's to_m31_array() order, which is [first.real, first.imag, second.real, second.imag].

Sampled Values Flattening

The sampled values are mixed as a flat array of M31 elements, NOT as QM31 elements. This means each QM31 sample contributes 4 consecutive M31 values to the flat array. The order is: tree 0 columns, then tree 1 columns, then tree 2 columns — with each column's samples in mask-offset order.