Drop BlockTreeEntry from the Bitcoin Kernel API
Published 08 Jun 2026There are two fundamental issues with the BlockTreeEntry type in the Bitcoin
Kernel API.
First, the name exposes implementation details that should remain internal. A
Bitcoin node naturally maintains a tree-like structure of competing block (or
block-header) candidates while determining the best chain. However, this
internal representation is abstracted away from Kernel API clients. The API
exposes only a linear view of a chain, allowing traversal back to genesis
through repeated calls to btck_block_tree_entry_get_previous. Consequently,
the term BlockTreeEntry implies an implementation detail that is neither
visible nor relevant to API consumers.
Second, the API leaks details of the current implementation through its traversal interface:
const btck_BlockTreeEntry*
btck_block_tree_entry_get_previous(
const btck_BlockTreeEntry* self);
This function exposes the predecessor relationship as a property of an
individual BlockTreeEntry, effectively encoding Bitcoin Core's current
implementation into the public API. That coupling limits future refactoring,
which is precisely one of the motivations for introducing a stable kernel
interface. Ideally, relationships between blocks would be managed by an
abstraction responsible for maintaining the relevant invariants, allowing the
underlying representation to evolve without affecting API consumers.
The Bitcoin Kernel API already provides such an abstraction: Chain. If we view
btck_chain_get_height as equivalent to Chain::size() - 1 and
btck_chain_get_by_height as Chain::operator[], then Chain begins to
resemble a sized,
random-access
view.
Furthermore, if we interpret btck_block_tree_entry_get_block_header as the
equivalent of iterator dereference (operator*), then BlockHeader naturally
becomes Chain::value_type and BlockTreeEntry becomes Chain::iterator.
This leads to a rather elegant C++ representation in which container-like
observers such as front(), back(), size(), empty(), and operator[]
are deduced by the base implementation.
This raises an interesting question: should every occurrence of BlockTreeEntry
in the C interface correspond to Chain::iterator in C++?
class Chain : public std::ranges::view_interface<Chain>
{
public:
using value_type = BlockHeader;
using iterator = /* ... */;
iterator begin() const; // points to genesis
iterator end() const; // points past the tip
bool Contains(iterator it) const; // ???
};
class ChainMan
{
public:
Chain ActiveChain() const;
std::optional<Chain::iterator> // ???
Find(BlockHash const& hash) const;
};
The problem is that an iterator alone is not a sufficient replacement for
BlockTreeEntry. An iterator is meaningful only relative to a particular chain.
Given a Chain::iterator, we cannot meaningfully traverse in either direction
without knowing the chain to which it belongs. The referenced block may be part
of a side branch rather than the active chain, and the iterator itself carries
no information about the corresponding begin and end positions.
This suggests that the appropriate replacement for BlockTreeEntry is not an
iterator, but a (Chain, iterator) pair. The Chain provides the context
required for iteration, while the iterator identifies a particular position
within that chain. Such a representation preserves the semantics of a block
reference without exposing predecessor links or block-tree nodes.
However, this immediately raises another question: what information does the iterator contribute that is not already contained in the chain?
Given a block, there is exactly one path from genesis to that block. In other words, every block uniquely determines a chain whose tip is that block. A function such as
Chain ChainMan::Find(BlockHash const& hash) const;
can therefore return a Chain ending at the requested block. The tip of the
returned chain identifies the block itself, while the remainder of the chain
provides all traversal context. The empty state observer (operator bool())
inherited from std::ranges::view_interface naturally represents "not found",
making std::optional unnecessary.
From this perspective, the (Chain, iterator) pair collapses into a single
Chain object. Clients can obtain the block header via back(), ancestors via
operator[], and height via size() - 1, all without exposing implementation
details of the underlying block index.
The real question is not whether BlockTreeEntry should become an iterator, but
whether BlockTreeEntry should exist in the public API at all, or whether
Chain is already the more fundamental abstraction.
If we replace BlockTreeEntry with Chain throughout the interface, that also
includes the function Chain::Contains. But since all chains begin at genesis,
the operation actually answers whether one chain is a prefix of another, even if
the implementation merely compares the block hash at height other.size() - 1.
This suggests that starts_with is the more appropriate name. The relationship
being tested is not one of generic containment, but of one chain forming an
initial segment of another. The name also has precedent in the standard library,
where std::ranges::starts_with expresses the same operation for arbitrary
ranges. Reusing established terminology makes the semantics immediately
recognizable to users familiar with the standard ranges vocabulary.
At the same time, there is precedent for providing a member function whose
observable behavior matches that of a generic algorithm while offering better
complexity. For example, std::set::find and std::unordered_set::find
express the same operation as std::find, but can exploit knowledge of the
underlying representation to avoid a linear search. Likewise,
Chain::starts_with could have the same semantics as
std::ranges::starts_with while leveraging chain-specific invariants to answer
the query in constant time. The member function therefore serves not to define
a new operation, but to provide a more efficient implementation of an existing
one.
A parallel opportunity exists for another operation. Bitcoin Core calls it
FindFork internally, while the corresponding standard library algorithm is
std::ranges::mismatch. Both identify the first position at which two sequences
diverge, which in the context of chains corresponds to the last common ancestor
before a fork.
Expressed in terms of chains, this operation determines the longest common prefix of two chains beginning at genesis. Once again, the semantics are purely range-based, and the result is independent of any particular internal representation.
This suggests introducing a member function Chain::mismatch. As with
starts_with, the function would have the same observable behavior as
std::ranges::mismatch, but could be implemented more efficiently by exploiting
chain-specific structure (such as cached ancestry or indexed heights) rather
than performing a linear comparison from genesis.
In this sense, both starts_with and mismatch illustrate a consistent design
pattern: whenever a Kernel API operation corresponds directly to a standard
library algorithm, the API should adopt the standard name while exposing it as a
member function when the underlying domain allows for asymptotically better
implementations.
Adopting Chain as the primary abstraction in place of BlockTreeEntry yields
a cleaner, more stable, and more idiomatic Kernel API. It hides Bitcoin Core's
internal block-tree implementation details, reduces coupling, and aligns the
interface with modern C++ range concepts that developers already understand. By
embracing vocabulary and patterns from the standard library (starts_with,
mismatch, random-access views, etc.), the API becomes more intuitive,
future-proof, and easier to maintain as the underlying data structures evolve.