Full Report
The Horton principle is "mean what you sign and sign what you mean". The reference comes from a Dr. Seuss book but does have profound impact. When signing/hashing data, no changes should be possible to it. Although this sounds simple, even a single item is difficult to do correctly. With multiple items, this becomes very complex. For multiple items with hashing, it's common to turn the data into a string and hash. But, this just moves the problem to the serialization. By using something like ASN.1 or JSON or Protobuf, this mostly solves the problem though. These make the representation unambiguous, which isn't so simple to do. When doing this for a Merkle tree, a load of problems comes into affect. The depth, the data in the nodes and levels all matter for the tree. In a binary merkle tree, the representation of how we hash the leaves matters. What happens if it's not a perfect element of 2? Do we go from the left or the right? All of these matter for the Horton Principle. Domain separation is important - add byte separators for specific bytes. For instance, adding a 0x00 for a single entry element. In CometBFT, the hash byte slices are ordered and order-preserving. In Ethereum, they do something different - all unpopulated leaves are just empty. This means that the empty hash is used a lot, making it perfectly balanced in theory. Both of these satisfy the Horton principle. The Open Zeppelin library takes in an input of leaves of length L. It starts pairing from the tail and takes a double hash concatenation to disambiguate between internal nodes and single leaves. A major deviation is that the leaves are sorted prior to hashing. This means that the tree does not preserve the order of the input sequence. In the documentation, they say it's assumed that the inputs are ordered. A cross-chain protocol called Omni Network ported the Open Zeppelin version for their own use in Golang. Naturally, they forgot about the optionally of the leaves pair sorting and try to always sort. This is done by the blockchain for transmission but isn't actually required by the verifier. Great. The Omnichain Network merkle tree data contains a LogIndex that is a monotonically increasing value. However, this value is NOT in the data used for hashing. Combining the optionally of leaf sorting with the neglect of this value in the hash, leads to the ordering of messages not being preserved. Practically speaking, this means that differently ordered xchain.Msg sequences lead to the same Merkle Root. Even more crazy is that you can change the LogIndex to be different as well. {"Ping", 1} and {"Pong", 2} is just as valid as {"Pong", 1} and {"Ping", 2}. The ordering is not kept, as we can see. The author includes a PoC. How did this sneak through the cracks of the project? The author has a great quote on this: As seen too often with cryptographic constructions, too many moving parts and options become the ingredients for the proverbial "hazardous material".. With the copying of the porting from the Open Zeppelin with a weird assumption being made in the library, it was bound to lead to an issue. Great write up!
Analysis Summary
# Vulnerability: Merkle Root Collision due to Unpreserved Message Ordering in Omni Network
## CVE Details
- CVE ID: N/A (No official CVE assigned in the provided text)
- CVSS Score: N/A
- CWE: CWE-348 (Use of Externally-Controlled Format String) or CWE-352 (Cross-Site Request Forgery) depending on explicit impact, though fundamentally it relates to **CWE-327: Inappropriate Use of Cryptography** or **CWE-755: Improper Initialization of Resource** regarding construction context.
## Affected Systems
- Products: Omni Network (specifically its cross-chain protocol components leveraging Merkle trees, such as message sequencing/attestations).
- Versions: Versions of Omni Network ported from the OpenZeppelin Merkle tree implementation that erroneously assumed fixed leaf sorting and failed to incorporate the `LogIndex` into the hash. (Specific version numbers are not provided.)
- Configurations: Any configuration using the affected Merkle tree implementation for hashing sequences of cross-chain messages (xchain.Msg) where message order verification is critical.
## Vulnerability Description
The Omni Network ported a Merkle tree implementation derived from OpenZeppelin. The original OpenZeppelin implementation optionally sorts leaves prior to hashing, which breaks input sequence ordering preservation—a deviation documented in its usage assumptions.
Omni Network **always** sorts the leaves and, critically, **fails to include the `LogIndex`** (a monotonically increasing value meant to enforce order) in the data used for hashing the leaf nodes.
Because the leaf ordering is decided externally (by sorting enforced by the transmission layer) while the data used in the hash computation omits the order-defining `LogIndex`, differently ordered sequences of cross-chain messages (`xchain.Msg`) result in the same Merkle Root. For example, `{"Ping", 1}` and `{"Pong", 2}` can generate the same root as `{"Pong", 1}` and `{"Ping", 2}` (if the ordering rules cause the sorted leaves to match). This violates the Horton principle by failing to preserve the representation of the input structure.
## Exploitation
- Status: PoC available (Author provided a Proof of Concept demonstrating root collision).
- Complexity: Low (Exploiting the collision requires knowing the sorting assumption and serialization details, but demonstrating the collision is trivial based on the PoC).
- Attack Vector: Network (An attacker who can submit sequences of cross-chain messages could potentially control the final root or substitute one message sequence for another valid but different sequence).
## Impact
- Confidentiality: None (The vulnerability primarily concerns data integrity and sequence verification).
- Integrity: High (An attacker can substitute one sequence of cross-chain messages for another with the same Merkle Root, potentially leading to unauthorized actions or invalid state transitions if the root is used for attestation verification).
- Availability: Low (Minimal impact on system uptime itself, but integrity failure could lead to service disruption/forks).
## Remediation
### Patches
- Patches are not specified in the text, but the required fix involves updating how leaf data is computed before hashing.
- **Required Fix:** Modify the leaf encoding function (`encodeMsg` or equivalent) to *include* the `LogIndex` as part of the data being hashed for each leaf, ensuring that variation in the index yields a different hash, thus preserving message sequence integrity.
### Workarounds
- **Temporary Mitigation:** If possible, ensure that all components (transmission and verification) strictly agree on the exact ordering convention *and* ensure that the data being hashed for verification unequivocally captures the intended sequence structure, ideally by reverting to a standard that includes domain separation or explicitly hashes the sequence index alongside the data content.
## Detection
- Indicators of Compromise: Observation of different, structurally distinct sequences of `xchain.Msg` inputs generating identical Merkle Roots across the system.
- Detection Methods and Tools: Manual inspection of Merkle tree construction paths in Omni Network components (e.g., `omni/lib/xchain/merkle.go` and related serialization logic) to confirm that the `LogIndex` is participating in the leaf hash computation.
## References
- Vulnerability write-up: hxxps://sbudella.altervista.org/blog/20250507-hash-what-you-mean.html
- PoC Example: hxxps://sbudella.altervista.org/blog/20250507-omni-merkle.patch