Verifying Merkle Proofs on Algorand

Merkle trees are an extremely useful data structure for tamper-proof collections.

They are ubiquitous in blockchains.

There are many potential use cases for smart contracts that can verify an inclusion proof, so let's look at how we can write one with PuyaPy.

Background

The Ethereum website provides a great explanation of Merkle proofs:

To prove a certain value, you provide all the hashes that need to be combined with it to obtain the root. For example, to prove C you provide D, H(A-B), and H(E-H).

Proof of the value of C

Author: @nhsz

H(A + B) != H(B + A) , so a proof would normally include an indicator of whether each node is on the left or right side, as well as the node's hash value.

For example, if we start with leaf C and a proof [("right", D), ...]) , then we know that the first step is to hash C + D.

Adding the side indicator of course increases the size of the serialised proof.

But there are techniques that eliminate the need for it altogether.

A common approach used in Amazon QLDB, OpenZeppelin, and merkletreejs is to always append the greater hash to the smaller hash.

In other words, if A < B, hash A + B. Otherwise, hash B + A.

This is neat, and since it's so widely used in other blockchains, we'll follow the same method in this smart contract.

The Smart Contract

Interface

The OpenZeppelin v5 MerkleProof contract has a verify function with the following signature:

verify(bytes32[] proof, bytes32 root, bytes32 leaf) → bool

I don't think there's an equivalent for bytes32[] in Puya yet, so the contract method will have this signature:

def verify(proof: Bytes, root: Bytes, leaf: Bytes) -> bool:

Functions

OpenZeppelin defines this function in their Solidity contract:

/**
    * @dev Sorts the pair (a, b) and hashes the result.
    */
function _hashPair(bytes32 a, bytes32 b) private pure returns (bytes32) {
    return a < b ? _efficientHash(a, b) : _efficientHash(b, a);
}

So let's create an equivalent in Puya:

@subroutine
def hash_pair(a: Bytes, b: Bytes) -> Bytes:
    """Sorts the pair (a, b) and hashes the result.

    Args:
        a (Bytes): The first value.
        b (Bytes): The second value.

    Returns:
        Bytes: The hash of the sorted pair.
    """
    return op.sha256(a + b if BigUInt.from_bytes(a) < BigUInt.from_bytes(b) else b + a)

The Bytes type in Puya doesn't support the < operator, so we convert the Bytes to BigUInt before comparing the values.

Methods

The contract's verify method can be called from off chain code:

HASH_LENGTH = 32

class Merkle(ARC4Contract):
    """A contract for verifying Merkle proofs."""

    @arc4.abimethod
    def verify(self, proof: Bytes, root: Bytes, leaf: Bytes) -> bool:
        """Verify a Merkle proof.

        Args:
            proof (Bytes): Array of 32 byte hashes.
            root (Bytes): The root hash.
            leaf (Bytes): The leaf hash.

        Returns:
            bool: True if the proof is valid, else False.
        """
        computed_hash: Bytes = leaf
        for i in urange(0, proof.length, HASH_LENGTH):
            computed_hash = hash_pair(computed_hash, op.extract(proof, i, HASH_LENGTH))
        return computed_hash == root

Checking It Works

I'll reuse the example from my previous post, 'Provable NFT Collections on Algorand':

const { MerkleTree } = require("merkletreejs");
const CryptoJS = require("crypto-js");

const assetIds = [1, 2, 3];
const leaves = assetIds.map((x) => CryptoJS.SHA256(x.toString(16)));
const tree = new MerkleTree(leaves, CryptoJS.SHA256, { sortPairs: true });
const root = tree.getRoot().toString("hex");

Here we define a collection with 3 assets and get the Merkle root:

0932f1d2e98219f7d7452801e2b64ebd9e5c005539db12d9b1ddabe7834d9044

To obtain a proof that asset 1 is included in the Merkle tree, we run:

const leaf = CryptoJS.SHA256("1");
const proof = tree.getProof(leaf);
console.log(tree.getHexProof(leaf));

Which prints:

['0x6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b', '0x4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce']

Now we can convert these values to bytes and pass them to the app client:

>>> proof = [
...     "0x6b86b273ff34fce19d6b804eff5a3f5747ada4eaa22f1d49c01e52ddb7875b4b",
...     "0x4e07408562bedb8b60ce05c1decfe3ad16b72230967de01f640b7e4729b49fce",
... ]
>>> leaf = "d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35"
>>> root = "0932f1d2e98219f7d7452801e2b64ebd9e5c005539db12d9b1ddabe7834d9044"

>>> xtob = lambda x: bytes.fromhex(x.removeprefix("0x"))

>>> is_valid_proof = app_client.verify(
...     proof=b"".join(xtob(x) for x in proof),
...     root=xtob(root),
...     leaf=xtob(leaf),
... ).return_value

>>> print(f"{is_valid_proof=}")
is_valid_proof=True

And the contract has successfully verified the proof.

Complete Code Example

from puyapy import ARC4Contract, BigUInt, Bytes, arc4, op, subroutine, urange

HASH_LENGTH = 32


@subroutine
def hash_pair(a: Bytes, b: Bytes) -> Bytes:
    """Sorts the pair (a, b) and hashes the result.

    Args:
        a (Bytes): The first value.
        b (Bytes): The second value.

    Returns:
        Bytes: The hash of the sorted pair.
    """
    return op.sha256(a + b if BigUInt.from_bytes(a) < BigUInt.from_bytes(b) else b + a)


class Merkle(ARC4Contract):
    """A contract for verifying Merkle proofs."""

    @arc4.abimethod
    def verify(self, proof: Bytes, root: Bytes, leaf: Bytes) -> bool:
        """Verify a Merkle proof.

        Args:
            proof (Bytes): Array of 32 byte hashes.
            root (Bytes): The root hash.
            leaf (Bytes): The leaf hash.

        Returns:
            bool: True if the proof is valid, else False.
        """
        computed_hash: Bytes = leaf
        for i in urange(0, proof.length, HASH_LENGTH):
            computed_hash = hash_pair(computed_hash, op.extract(proof, i, HASH_LENGTH))
        return computed_hash == root