ensindexer-rs: an experiment in building a faster ENS indexer in Rust

ensindexer-rs is an experimental Rust implementation of an ENS subgraph-compatible indexer. It is not a production service today. The point of the project is to explore how much room there is to improve the performance of current ENS indexing approaches while keeping the user-facing GraphQL shape close to the official ENS subgraph.

This started as an AI-assisted engineering experiment: keep the ENS data model familiar, rebuild the indexing/runtime stack from scratch in Rust, and measure where practical performance wins are possible. The useful question is not “can this replace existing infrastructure tomorrow?” It is: “how much faster can the same kind of ENS data become if the pipeline is designed around replayability, batching, and query-specific indexes?”

You can try the public GraphQL playground here: ensindexer-rs.namespace.ninja/graphql.
GitHub source: thenamespace/ensindexer-rs

High-level overview

At a high level, the indexer reads ENS logs, decodes them into typed events, applies projection logic, stores current and historical state in Postgres, and exposes a GraphQL API through async-graphql.

The architecture is split into a few clear layers:

  • contracts decodes ENS events using Alloy.
  • ingest handles RPC, HyperSync, raw archive replay, and live indexing.
  • projection turns ENS events into subgraph-style entities and event history.
  • storage owns Postgres tables, filters, indexes, snapshots, and batch writes.
  • api exposes the GraphQL schema, relationships, filters, ordering, _meta, and historical block reads.
  • server runs Axum, GraphQL, Apollo Sandbox, health routes, and optional indexing workers.

All historical sources go through the same projection path:

RPC / HyperSync / raw archive
  -> decoded ENS events
  -> projection handlers
  -> Postgres current rows, snapshots, events, checkpoints
  -> GraphQL

That is important because raw replay, HyperSync backfill, RPC backfill, and live indexing should not create different data models. They only change where logs come from.
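
One way to picture that guarantee is a single source trait feeding a single projection function. The sketch below is illustrative only: `LogSource`, `VecSource`, `Projection`, and `index_range` are hypothetical names, not the project's actual types, and the "projection" here just records which logs were applied.

```rust
/// Simplified raw log (hypothetical shape; real logs also carry topics, address, etc.).
#[derive(Debug, Clone, PartialEq)]
pub struct RawLog {
    pub block_number: u64,
    pub log_index: u32,
    pub data: Vec<u8>,
}

/// Anything that can return logs for an inclusive block range.
pub trait LogSource {
    fn fetch(&mut self, from_block: u64, to_block: u64) -> Vec<RawLog>;
}

/// In-memory stand-in for RPC, HyperSync, or a raw archive.
pub struct VecSource {
    pub logs: Vec<RawLog>,
}

impl LogSource for VecSource {
    fn fetch(&mut self, from_block: u64, to_block: u64) -> Vec<RawLog> {
        self.logs
            .iter()
            .filter(|log| log.block_number >= from_block && log.block_number <= to_block)
            .cloned()
            .collect()
    }
}

/// Stand-in for projection state: applied log positions plus a checkpoint.
#[derive(Debug, Default, PartialEq)]
pub struct Projection {
    pub applied: Vec<(u64, u32)>,
    pub checkpoint: Option<u64>,
}

impl Projection {
    fn apply(&mut self, log: &RawLog) {
        self.applied.push((log.block_number, log.log_index));
    }
}

/// The one projection path: the source only decides where logs come from.
pub fn index_range(
    source: &mut dyn LogSource,
    projection: &mut Projection,
    from_block: u64,
    to_block: u64,
) {
    let mut logs = source.fetch(from_block, to_block);
    // Apply in chain order regardless of how the source delivered the logs.
    logs.sort_by_key(|log| (log.block_number, log.log_index));
    for log in &logs {
        projection.apply(log);
    }
    projection.checkpoint = Some(to_block);
}
```

Two sources that return the same logs in a different order end up with identical projection state, which is the property the shared path is meant to protect.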

Runtime features

The server is one binary. Starting it always starts the API, GraphQL endpoint, Apollo Sandbox, health checks, and readiness checks. Historical backfill and live indexing are explicit toggles.

Historical backfill can use:

  • BACKFILL_SOURCE=rpc
  • BACKFILL_SOURCE=hypersync
  • BACKFILL_SOURCE=raw

Live indexing can use:

  • LIVE_INDEXING_SOURCE=rpc
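
As a rough illustration of how such a toggle could be parsed, here is a small sketch. Only the variable name and the three values come from the list above; the `BackfillSource` enum, the `parse_backfill_source` helper, and the choice to treat an unset variable as "backfill disabled" are assumptions, not the project's actual configuration code.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the documented BACKFILL_SOURCE values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackfillSource {
    Rpc,
    HyperSync,
    Raw,
}

impl FromStr for BackfillSource {
    type Err = String;

    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value {
            "rpc" => Ok(Self::Rpc),
            "hypersync" => Ok(Self::HyperSync),
            "raw" => Ok(Self::Raw),
            other => Err(format!("unsupported BACKFILL_SOURCE: {}", other)),
        }
    }
}

/// Assumption: an unset variable means historical backfill stays off.
pub fn parse_backfill_source(raw: Option<&str>) -> Result<Option<BackfillSource>, String> {
    raw.map(str::parse::<BackfillSource>).transpose()
}
```

At startup this could be fed from the environment with `parse_backfill_source(std::env::var("BACKFILL_SOURCE").ok().as_deref())`, so an unknown value fails fast instead of silently picking a source.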

The most useful development feature is the raw archive path. Logs can be fetched once from HyperSync or RPC, written to local binary .bin files with checksums, and replayed later without spending network/API credits again. That makes projection and storage work much easier to iterate on.
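
To make the archive idea concrete, here is a minimal sketch of a checksummed binary range file. The layout (record count, length-prefixed records, trailing checksum) and the FNV-1a checksum are assumptions chosen to keep the example dependency-free; the actual .bin format and checksum algorithm used by the project are not described here.

```rust
use std::io::{self, Read, Write};

/// Simplified raw log (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
pub struct RawLog {
    pub block_number: u64,
    pub log_index: u32,
    pub data: Vec<u8>,
}

/// FNV-1a 64-bit, used only as a stand-in checksum for this sketch.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for byte in bytes {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

fn invalid(msg: &str) -> io::Error {
    io::Error::new(io::ErrorKind::InvalidData, msg.to_string())
}

/// Returns the next `len` bytes and advances the cursor, or fails on truncation.
fn take<'a>(body: &'a [u8], pos: &mut usize, len: usize) -> io::Result<&'a [u8]> {
    let end = match (*pos).checked_add(len) {
        Some(end) if end <= body.len() => end,
        _ => return Err(invalid("truncated record")),
    };
    let slice = &body[*pos..end];
    *pos = end;
    Ok(slice)
}

fn read_u32(body: &[u8], pos: &mut usize) -> io::Result<u32> {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(take(body, pos, 4)?);
    Ok(u32::from_le_bytes(buf))
}

fn read_u64(body: &[u8], pos: &mut usize) -> io::Result<u64> {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(take(body, pos, 8)?);
    Ok(u64::from_le_bytes(buf))
}

/// Layout: u32 count, then (u64 block, u32 index, u32 len, data)*, then u64 checksum.
pub fn write_archive<W: Write>(out: &mut W, logs: &[RawLog]) -> io::Result<()> {
    let mut body = Vec::new();
    body.extend_from_slice(&(logs.len() as u32).to_le_bytes());
    for log in logs {
        body.extend_from_slice(&log.block_number.to_le_bytes());
        body.extend_from_slice(&log.log_index.to_le_bytes());
        body.extend_from_slice(&(log.data.len() as u32).to_le_bytes());
        body.extend_from_slice(&log.data);
    }
    out.write_all(&body)?;
    out.write_all(&fnv1a64(&body).to_le_bytes())
}

/// Verifies the trailing checksum before decoding any record.
pub fn read_archive<R: Read>(input: &mut R) -> io::Result<Vec<RawLog>> {
    let mut bytes = Vec::new();
    input.read_to_end(&mut bytes)?;
    if bytes.len() < 12 {
        return Err(invalid("archive too short"));
    }
    let (body, tail) = bytes.split_at(bytes.len() - 8);
    let mut tail_pos = 0;
    if fnv1a64(body) != read_u64(tail, &mut tail_pos)? {
        return Err(invalid("checksum mismatch"));
    }
    let mut pos = 0;
    let count = read_u32(body, &mut pos)?;
    let mut logs = Vec::new();
    for _ in 0..count {
        let block_number = read_u64(body, &mut pos)?;
        let log_index = read_u32(body, &mut pos)?;
        let len = read_u32(body, &mut pos)? as usize;
        let data = take(body, &mut pos, len)?.to_vec();
        logs.push(RawLog { block_number, log_index, data });
    }
    Ok(logs)
}
```

Checking the checksum before decoding means a corrupted or partially written range file is rejected up front, so a replay never projects half of a bad file.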

Supported scope

The current scope is the ENS v1-style mainnet subgraph model: registry state, .eth registrar state, Name Wrapper state, resolver records, current entity reads, event history, _meta, filters, ordering, relationship fields, _change_block, and snapshot-backed historical block reads.

Benchmark results

The benchmark comparison uses the existing query fixtures against a full local mainnet database and hosted ENSNode/The Graph endpoints.

| endpoint | URL | network baseline median | network baseline p95 |
| --- | --- | --- | --- |
| ensindexer-rs (Hosted) | https://ensindexer-rs.namespace.ninja/subgraph | 155.502ms | 165.064ms |
| ensnode | https://api.alpha.ensnode.io/subgraph | 216.618ms | 227.199ms |
| the graph indexer | https://gateway.thegraph.com/api/subgraphs/id/5XqPmWe6gjyrJtFn9cLy237i4cWw2j9HcUJEXsP5qGtH | 309.976ms | 360.953ms |

5 warmups, 25 measured iterations

| operation | ensindexer-rs (Hosted) | ensnode | the graph indexer |
| --- | --- | --- | --- |
| 01-domain-batch | 161.760ms (2.2x) | 236.639ms (1.5x) | 349.134ms (slowest) |
| 02-names-for-address | 165.385ms (2.1x) | 230.997ms (1.5x) | 341.654ms (slowest) |
| 03-eth-subnames | 172.312ms (8.9x) | 1529.751ms (slowest) | 394.077ms (3.9x) |
| 04a-subnames-search-3-letter | 478.278ms (3.6x) | 1734.080ms (slowest) | 370.343ms (4.7x) |
| 04b-subnames-search-4-letter | 289.240ms (6.2x) | 1804.081ms (slowest) | 386.593ms (4.7x) |
| 04c-subnames-search-5-letter | 231.535ms (7.4x) | 1722.582ms (slowest) | 597.656ms (2.9x) |
| 05-decoded-label | 154.759ms (2.3x) | 219.236ms (1.6x) | 359.036ms (slowest) |
| 06-resolver-records | 194.695ms (2.2x) | 232.694ms (1.9x) | 430.787ms (slowest) |
| 07-registrations | 188.012ms (4.3x) | 802.861ms (slowest) | 352.961ms (2.3x) |
| 07a-subgraph-registrant | 156.093ms (2.1x) | 218.612ms (1.5x) | 331.832ms (slowest) |
| 08-name-history | 170.191ms (2.0x) | 229.845ms (1.5x) | 343.990ms (slowest) |
| 09-event-scan | 243.978ms (24.4x) | unsupported | 5957.348ms (slowest) |
| 10-relationship-filter | 163.389ms (2.7x) | unsupported | 444.856ms (slowest) |
| 11-text-search | 282.678ms (1.8x) | 352.325ms (1.5x) | 511.326ms (slowest) |

Relative speed is calculated against the slowest supported numeric result in each row. Unsupported, timeout, and error cells are excluded from the numeric baseline.
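
For anyone who wants to check the multipliers, the rule above can be written out as a short sketch. The `Cell` and `Relative` types and the `relative_speeds` function are hypothetical names for illustration; only the rule itself (divide the slowest supported latency by each latency, skip non-numeric cells) comes from the description above.

```rust
/// One benchmark cell: a measured latency or a non-numeric result.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Cell {
    Millis(f64),
    Unsupported,
}

/// How a cell compares to the slowest supported cell in its row.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Relative {
    Faster(f64),
    Slowest,
    Excluded,
}

impl Relative {
    /// Formats the value the way the table prints it, e.g. "24.4x".
    pub fn label(&self) -> String {
        match self {
            Relative::Faster(factor) => format!("{:.1}x", factor),
            Relative::Slowest => "slowest".to_string(),
            Relative::Excluded => "excluded".to_string(),
        }
    }
}

/// Divides the slowest supported latency in the row by each cell's latency.
pub fn relative_speeds(row: &[Cell]) -> Vec<Relative> {
    let slowest = row
        .iter()
        .filter_map(|cell| match cell {
            Cell::Millis(ms) => Some(*ms),
            Cell::Unsupported => None,
        })
        .fold(f64::NEG_INFINITY, f64::max);

    row.iter()
        .map(|cell| match cell {
            Cell::Millis(ms) if *ms == slowest => Relative::Slowest,
            Cell::Millis(ms) => Relative::Faster(slowest / *ms),
            Cell::Unsupported => Relative::Excluded,
        })
        .collect()
}
```

For the 09-event-scan row, the slowest supported result is 5957.348ms, so the ensindexer-rs cell works out to 5957.348 / 243.978, which prints as 24.4x, while the unsupported ensnode cell is left out of the baseline.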

The current shape is pretty clear: most structured ENS queries are fast locally, especially address lookups, .eth subnames, decoded-label lookup, event scans, and relationship filters. Broad text search remains the hardest category. Trigram indexes help, but if a search term matches a large number of names, Postgres still has to fetch and sort a large candidate set.

Backfill performance

Backfill performance is harder to summarize because ENS history is uneven. Sparse early ranges, dense resolver ranges, and subdomain-heavy ranges behave very differently.

The raw replay path reached roughly 8.6k logs/sec overall in release mode, and around 9.3k logs/sec on ranges with 50k+ logs after range-wide snapshot batching (tested on an M3 MacBook Pro). That is not the final ceiling, but it is enough to show that avoiding repeated RPC fetches and one-write-per-event behavior matters a lot.

The improvement came from making historical indexing behave less like “handle one event, write one row” and more like “understand the whole range, then write the range efficiently.”
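
A minimal sketch of that shift, assuming a hypothetical `WriteBuffer` that accumulates rows and hands them to storage in chunks. `Row`, `RecordingSink`, and the flush threshold are all illustrative; a real sink would presumably issue one multi-row statement per batch rather than recording it in memory.

```rust
/// Hypothetical row standing in for entity, event, and snapshot rows.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub id: String,
    pub block_number: u64,
}

/// Stand-in for storage: remembers every batch it was asked to write.
#[derive(Debug, Default)]
pub struct RecordingSink {
    pub batches: Vec<Vec<Row>>,
}

impl RecordingSink {
    fn write_batch(&mut self, rows: Vec<Row>) {
        self.batches.push(rows);
    }
}

/// Collects rows during projection and writes them in chunks.
pub struct WriteBuffer {
    pending: Vec<Row>,
    max_rows: usize,
}

impl WriteBuffer {
    pub fn new(max_rows: usize) -> Self {
        assert!(max_rows > 0, "max_rows must be positive");
        Self { pending: Vec::with_capacity(max_rows), max_rows }
    }

    /// Queues a row and flushes once the chunk size is reached.
    pub fn push(&mut self, row: Row, sink: &mut RecordingSink) {
        self.pending.push(row);
        if self.pending.len() >= self.max_rows {
            self.flush(sink);
        }
    }

    /// Writes whatever is pending; call this at the end of each range.
    pub fn flush(&mut self, sink: &mut RecordingSink) {
        if !self.pending.is_empty() {
            sink.write_batch(std::mem::take(&mut self.pending));
        }
    }
}
```

With a chunk size of 3, seven projected rows turn into three storage calls instead of seven, and the gap widens quickly as chunk sizes move into the thousands.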

The main pieces are:

  • Raw archives: fetch logs once from HyperSync or RPC, store them as binary range files with checksums, and replay them locally. This makes projection changes cheap to test because the expensive chain-history fetch is no longer part of every iteration.
  • Range preloading: scan the log range before applying projection and load likely-touched domains, registrations, resolvers, accounts, and wrapped-domain rows in batches. This cuts down on repeated point reads during dense ranges.
  • Replay-level cache: keep hot current-state rows in memory across archive files. .eth, popular parent domains, resolver contracts, and common accounts show up again and again; treating them as hot state avoids a lot of database roundtrips.
  • Batched writes: flush current entities, event rows, block metadata, entity changes, and snapshots in chunks instead of writing after every event. This is one of the biggest differences between an indexer that is easy to reason about and one that is fast enough to replay full history.
  • Parent-first domain flushing: write domains in parent-before-child order so domains.parent_id can stay as a real foreign key without fighting the insert order.
  • Bulk replay index windows: for large raw or HyperSync replays, temporarily drop secondary query indexes and recreate them after the write-heavy phase. Maintaining every read index row-by-row across tens of millions of writes is often more expensive than rebuilding those indexes once.
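
Parent-first flushing is the least obvious item in that list, so here is a small sketch of the ordering step. `DomainRow` and `parent_first` are hypothetical, readable names are used as IDs instead of namehashes, and the sketch assumes the batch contains no parent cycles, which holds for a name tree.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical domain row; real IDs would be namehashes, not readable names.
#[derive(Debug, Clone, PartialEq)]
pub struct DomainRow {
    pub id: String,
    pub parent_id: Option<String>,
}

/// Reorders a batch so each row comes after its parent when both are in the batch.
/// Rows whose parent is absent from the batch are assumed to have it stored already.
pub fn parent_first(batch: Vec<DomainRow>) -> Vec<DomainRow> {
    let in_batch: HashSet<String> = batch.iter().map(|row| row.id.clone()).collect();
    let mut children: HashMap<String, Vec<DomainRow>> = HashMap::new();
    let mut queue: VecDeque<DomainRow> = VecDeque::new();

    for row in batch {
        let pending_parent = row
            .parent_id
            .as_ref()
            .filter(|parent| in_batch.contains(*parent) && **parent != row.id)
            .cloned();
        match pending_parent {
            // Hold the row back until its parent has been emitted.
            Some(parent) => children.entry(parent).or_default().push(row),
            // No parent in this batch: safe to write first.
            None => queue.push_back(row),
        }
    }

    let mut ordered = Vec::new();
    while let Some(row) = queue.pop_front() {
        if let Some(kids) = children.remove(&row.id) {
            queue.extend(kids);
        }
        ordered.push(row);
    }
    // Anything left would mean a parent cycle, which a name tree cannot contain.
    debug_assert!(children.is_empty());
    ordered
}
```

Writing rows in this order lets a foreign key on parent_id stay enforced during the flush, because no child row is inserted before the row it points to.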

There is still room to push this further. The prototype is already fast enough to show that raw replay and batch-oriented storage matter, but full-history ENS backfill is still dominated by the densest resolver and subdomain eras.

Query performance improvements

Most query improvements came from matching indexes to real ENS query patterns.

A generic GraphQL-to-SQL layer is useful, but ENS has a few access patterns that are common enough to deserve special treatment. The fastest paths came from accepting that and indexing around those shapes directly.

The main query-side optimizations are:

  • Exact name and label lookup: use hash-backed indexes such as md5(name) and md5(label_name) with exact text rechecks. ENS names can be too long or weirdly shaped for naive btree text indexing, so this keeps exact lookup fast without depending on unsafe assumptions about string length.
  • Parent-scoped subname traversal: use parent-aware indexes such as domains_parent_idx, domains_parent_label_name_sort_idx, and domains_parent_name_sort_idx. This matters a lot for .eth, because a subname query should seek inside one parent and return an ordered page, not scan and sort the domain table.
  • Substring search: use Postgres trigram indexes on name, lower(name), label_name, and lower(label_name), plus parent-scoped variants for subname search. This makes normal search terms practical. Very broad terms can still produce huge candidate sets, so search remains the category with the most room for specialized work.
  • ENSJS names-for-address fast path: detect the common owner/registrant/wrapped-owner/resolved-address query shape and emit a direct indexed SQL plan instead of treating it as arbitrary nested GraphQL filters. Supporting indexes are ordered by expiry_date and created_at, because those are the sorts ENS apps tend to ask for.
  • Event history indexes: index domain, registration, and resolver events by parent entity and block number. That keeps Domain.events, Registration.events, Resolver.events, recent event scans, and historical block clamping from becoming table scans.
  • GraphQL relationship batching: use async-graphql DataLoader batching for owner, resolver, registrant, wrapped owner, registration, and wrapped domain relationships. One page of domains should not turn into hundreds of tiny relationship queries.
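
The "hash index plus exact recheck" idea from the first bullet can be shown without a database. The sketch below is an in-memory analogy, not the project's SQL: `HashedNameIndex` is a hypothetical type, and the hash function is pluggable so collisions can be forced. In Postgres the equivalent query presumably filters on both the hash expression and the original text.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Fixed-width hash of a name; stands in for md5(name) in this analogy.
pub fn std_hash(name: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    name.hash(&mut hasher);
    hasher.finish()
}

/// In-memory analogy of a hash-backed index over a text column.
pub struct HashedNameIndex {
    hash: fn(&str) -> u64,
    names: Vec<String>,
    buckets: HashMap<u64, Vec<usize>>,
}

impl HashedNameIndex {
    pub fn new(hash: fn(&str) -> u64) -> Self {
        Self { hash, names: Vec::new(), buckets: HashMap::new() }
    }

    /// Stores the name and indexes its row position under the hash.
    pub fn insert(&mut self, name: &str) -> usize {
        let row = self.names.len();
        self.names.push(name.to_string());
        self.buckets.entry((self.hash)(name)).or_default().push(row);
        row
    }

    /// Number of rows the hash alone would return, before the recheck.
    pub fn candidates(&self, name: &str) -> usize {
        self.buckets.get(&(self.hash)(name)).map_or(0, |rows| rows.len())
    }

    /// Seeks by hash, then rechecks the exact text to drop collisions.
    pub fn lookup(&self, name: &str) -> Vec<usize> {
        self.buckets
            .get(&(self.hash)(name))
            .into_iter()
            .flatten()
            .copied()
            .filter(|row| self.names[*row] == name)
            .collect()
    }
}
```

The index entry has a fixed size no matter how long the name is, and the recheck keeps results exact even when two names share a hash.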

The next obvious query-performance layer is caching. ENS is unusually cache-friendly because the indexer already sees the events that make cached data stale.

  • A domain detail cache can be invalidated by registry, registrar, wrapper, and resolver events touching that node.
  • Resolver record caches can be invalidated directly from resolver events such as address, text, contenthash, ABI, pubkey, interface, and version changes.
  • Address/name list caches can be invalidated when ownership, registrant, wrapped owner, or resolved address relationships change.

That kind of cache would not need vague TTL-only invalidation. It can be event-driven: apply a block, collect the affected domain IDs, resolver IDs, account IDs, and parent IDs, then evict the exact GraphQL/read-model keys that depend on them.
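
Since this cache is a proposal rather than something described as built, the following is only a sketch of how event-driven eviction could look. `ReadCache`, `EntityRef`, and the key strings are hypothetical; the one idea taken from the text is that each cached read records which entities it depends on, and applying a block evicts exactly those keys.

```rust
use std::collections::{HashMap, HashSet};

/// Entities a cached read can depend on (hypothetical, simplified).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum EntityRef {
    Domain(String),
    Resolver(String),
    Account(String),
}

/// Cache that remembers which keys must go when an entity changes.
#[derive(Debug, Default)]
pub struct ReadCache {
    values: HashMap<String, String>,
    dependents: HashMap<EntityRef, HashSet<String>>,
}

impl ReadCache {
    /// Stores a response together with the entities it was built from.
    pub fn put(&mut self, key: &str, value: &str, deps: &[EntityRef]) {
        self.values.insert(key.to_string(), value.to_string());
        for dep in deps {
            self.dependents.entry(dep.clone()).or_default().insert(key.to_string());
        }
    }

    pub fn get(&self, key: &str) -> Option<&str> {
        self.values.get(key).map(String::as_str)
    }

    /// Evicts every key that depends on an entity touched by the applied block.
    /// Returns how many cached values were removed.
    pub fn apply_block(&mut self, touched: &[EntityRef]) -> usize {
        let mut evicted = 0;
        for entity in touched {
            // Stale dependency entries for other entities may remain; they can
            // only cause an extra eviction later, never a stale read.
            if let Some(keys) = self.dependents.remove(entity) {
                for key in keys {
                    if self.values.remove(&key).is_some() {
                        evicted += 1;
                    }
                }
            }
        }
        evicted
    }
}
```

A resolver event then removes only the reads that expanded that resolver, while unrelated entries such as another address's name list stay warm.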

The expected impact is highest on repeated reads. Hot domain detail, resolver profile, reverse-resolution, and ENSJS-style profile queries should mostly become memory or Redis reads instead of indexed Postgres reads. As a rough estimate rather than a measurement, server-side compute time for cache-hit queries should drop by roughly 50-90%, depending on how many relationships and resolver records the query expands. Repeated subname pages and names-for-address pages should see a smaller but still meaningful drop, roughly 30-80%, because pagination, ordering, and larger result sets still need more work.

Technical stack

The core stack is Rust, Axum, Tower, async-graphql, Alloy, SQLx, Postgres 17, HyperSync, Docker Compose, and cargo-make. The project intentionally avoids The Graph runtime and indexing frameworks; the pipeline is custom Rust from ingestion through GraphQL.

Takeaways

The main takeaway is that ENS indexing has meaningful optimization headroom without changing the user-facing schema. A lot of the gains came from practical engineering rather than exotic ideas: replay logs locally, batch database work, cache current state during projection, use indexes that match real query shapes, and special-case the few GraphQL filters that are common enough to deserve custom SQL.

This does not mean the prototype is production-ready, or that existing ENS indexing infrastructure should be replaced as-is. It does suggest that a purpose-built Rust/Postgres indexer can stay close to the official subgraph model while being much faster on many common ENS workloads.

Happy to answer any questions here or hear any feedback if anyone’s testing it.