[Draft-2] EIP-X: Off-Chain Data Write Protocol

NameSys · February 6, 2024, 4:14am

Hello ENS & ETH Developers

I attach the second draft incorporating some comments from Nick on Draft-1. This proposal is a work in progress, and I’ll soon append a section on an interface-based strategy (instead of revert-based) with an example.

GitHub: Draft-2

Off-Chain Data Write Protocol

Update to Cross-Chain Write Deferral Protocol (EIP-5559) incorporating secure write deferrals to centralised databases and decentralised & mutable storages

Abstract

The following proposal is an update to EIP-5559: Cross-Chain Write Deferral Protocol, targeting a wider set of storage types and introducing security measures to consider for secure off-chain write deferral and retrieval. While EIP-5559 is limited to deferring write operations to L2 EVM chains and centralised databases, methods in this document enable secure write deferral to generic decentralised storages - mutable or immutable - such as IPFS, Arweave, Swarm etc. This draft alongside EIP-3668 and EIP-5559 is a significant step toward a complete and secure infrastructure for off-chain data retrieval and write deferral.

Motivation

EIP-3668, or ‘CCIP-Read’ for short, has been key to retrieving off-chain data for a variety of contracts on the Ethereum blockchain, ranging from price feeds for DeFi contracts to, more recently, records for ENS users. The latter case is more interesting since it deliberately uses off-chain storage to bypass the usually high gas fees associated with on-chain storage; this aspect has use cases well beyond ENS records and the potential for significant impact on the universal affordability and accessibility of Ethereum.

Off-chain data retrieval through EIP-3668 is a relatively simple task since it assumes that all relevant data originating from off-chain storages is translated by CCIP-Read-compliant HTTP gateways; this includes L2 chains, centralised databases or decentralised storages. On the flip side, however, each service leveraging CCIP-Read has so far had to handle two main tasks externally:

writing this data securely to these storage types on their own, and
incorporating reasonable security measures in their CCIP-Read compatible contracts for verifying this data before performing on-chain read or write operations.

Writing to a variety of centralised and decentralised storages is a broader objective compared to CCIP-Read largely due to two reasons:

Each storage provider typically has its own architecture that the write operation must comply with, e.g. they may require additional credentials and configuration to be able to write data to them, and
Each storage must incorporate some form of security measures during write operations so that off-chain data’s integrity can be verified by CCIP-Read contracts during data retrieval stage.

EIP-5559 was the first step toward such a tolerant ‘CCIP-Write’ protocol, outlining how write deferrals could be made to L2 chains and centralised databases. The cases of L2 and database are similar; deferral to an L2 involves routing the eth_call to L2, while deferral to a database can be made by extracting eth_sign from eth_call and posting the resulting signature along with the data for later verification. In both cases, no pre-flight information needs to be processed by the client and arguments of eth_call and eth_sign as specified in EIP-5559 are sufficient. This proposal extends the previous attempt by including secure write deferrals to decentralised storages, especially those which - beyond the arguments of eth_call and eth_sign - require additional pre-flight metadata from clients to successfully host users’ data on their favourite storage. This document also enables more complex and generic use-cases of databases such as those which do not store the signers’ addresses on chain as presumed in EIP-5559.

Curious Case of Decentralised Storages

Decentralised storages powered by cryptographic protocols are unique in their diversity of architectures compared to centralised databases or L2 chains, both of which have canonical architectures in place. For instance, write calls to L2 chains can be generalised through the use of ChainId for any given callData; write deferral in this case is as simple as routing the eth_call to another contract on an L2 chain. There is no need to incorporate any additional security requirement(s) since the L2 chain ensures data integrity locally, while the global integrity can be proven by employing a state verifier scheme (e.g. EVM-Gateway) during CCIP-Read calls. Centralised databases have a very similar architecture where instead of invoking eth_call, the result of eth_sign needs to be posted on the database along with the callData for integrity verification by CCIP-Read.

Decentralised storages on the other hand, do not typically have EVM- or database-like environments and may have their own unique content addressing requirements. For example, IPFS, Arweave, Swarm etc all have unique content identification schemes as well as their own specific fine-tunings and/or choices of cryptographic primitives, besides supporting their own cryptographically secured namespaces. This significant and diverse deviation from EVM-like architecture results in an equally diverse set of requirements during both the write deferral operation as well as the subsequent state verifying stage.

For example, consider a scenario where the choice of storage is IPNS or ArNS. In precise terms, IPNS storage refers to immutable IPFS content wrapped in a mutable IPNS namespace, which eventually serves as the reference coordinate for off-chain data. The case of ArNS is similar; ArNS is immutable Arweave content wrapped in a mutable ArNS namespace. To write to IPNS or ArNS storage, the client requires more information than only the gateway URL responsible for write operations and the arguments of eth_sign. More precisely, the client must at least prompt the user for their IPNS or ArNS signature, which is necessary for updating the namespaced storage. The client may also require additional information from the user, such as specific arguments required by the IPNS or ArNS signature. One such example is the integer sequence number of an IPNS update, which goes into the construction of the signature message payload. These additional user-centric requirements are not accommodated by EIP-5559, and the resolution of these issues - among others such as batch writing - is detailed in the following attempt towards a suitable CCIP-Write specification.

Specification

Overview

The following specification revolves around the structure and description of an arbitrary off-chain storage handler tasked with the responsibility of writing to an arbitrary storage. First introduced in EIP-5559, the protocol outlined herein expands the capabilities of the StorageHandledBy*() revert to accept decentralised and namespaced storages. In addition, this draft proposes that besides StorageHandledByL2() and StorageHandledByOffChainDatabase(), new StorageHandledBy*() reverts be allowed through a publicly curated list where each new StorageHandledBy*() storage handler must be accompanied by complete documentation of its interface and design. Some foreseen examples of new storage handlers include StorageHandledByIPFS() for IPFS, StorageHandledByIPNS() for IPNS, StorageHandledByArweave() for Arweave, StorageHandledByArNS() for ArNS, StorageHandledBySwarm() for Swarm etc.

Similar to EIP-5559, a CCIP-Write deferral call to an arbitrary function setValue(bytes32 key, bytes32 value) can be described in pseudo-code as follows:

// Define custom revert error
error StorageHandledBy*(...);

// Generic function in a contract
function setValue(
    bytes32 key,
    bytes32 value
) external {
    // Get all necessary metadata from contract
    // Should typically contain coordinates to user's data
    config onChainConfig = getInfoFromContract(...);
    // Defer write call with relevant on-chain information
    // (config is the last argument of the revert)
    revert StorageHandledBy*(..., onChainConfig);
}

where the following structure for StorageHandledBy*() must be followed:

// Details of revert error
error StorageHandledBy*(
    address sender, // Sender of call (msg.sender)
    bytes callData, // Payload to store
    config onChainConfig // Send all necessary data from contract
);
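On the client side, handling such a revert might look roughly like the following minimal Python sketch. It is not part of the draft: `DeferredWrite`, `dispatch` and the `handlers` table are hypothetical names, and the client is assumed to have already decoded the revert into its three fields (sender, call data, config).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DeferredWrite:
    """Decoded fields of a StorageHandledBy*() revert (hypothetical container)."""
    handler: str             # e.g. "StorageHandledByL2"
    sender: str              # sender of the original call
    call_data: bytes         # payload to store
    config: List[List[str]]  # coordinates, authorities, accessories, ...


Handler = Callable[[DeferredWrite], str]


def dispatch(write: DeferredWrite, handlers: Dict[str, Handler]) -> str:
    """Route a decoded revert to the client routine registered for its handler."""
    if write.handler not in handlers:
        raise ValueError(f"unsupported storage handler: {write.handler}")
    return handlers[write.handler](write)


# Placeholder routines; a real client would submit a transaction or call a gateway.
handlers: Dict[str, Handler] = {
    "StorageHandledByL2": lambda w: f"route call to chain {w.config[0][0]}",
    "StorageHandledByOffChainDatabase": lambda w: f"post signed data to {w.config[0][0]}",
}

write = DeferredWrite(
    handler="StorageHandledByL2",
    sender="0xc0ffee254729296a45a3885639AC7E10F9d54979",
    call_data=b"\x00",
    config=[["11"], ["0x75b6B7CEE3719850d344f65b24Db4B7433Ca6ee4"]],
)
print(dispatch(write, handlers))
```

Keeping the handler table separate from the dispatch logic mirrors the idea of a curated list of StorageHandledBy*() identifiers: a client rejects anything it has no documented routine for.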

Config

The type config captures all the relevant information that the client may require from the contract to update a user’s data on their favourite storage. For instance, config should contain the public coordinates to the user’s data if such data exists on chain. In the case of StorageHandledByL2() for example, config must contain a chain identifier such as ChainId and additionally the contract address. In case of StorageHandledByOffChainDatabase(), config must contain the custom gateway URL serving a user’s data or some form of unique identifier required by the client to locate the user’s data. In case of StorageHandledByIPNS(), config may contain the public key of a user’s IPNS container that has been stored on chain; the case of ArNS is similar. In addition to data’s location, the contract may further contain security-driven information such as a delegated signer’s address who is tasked with signing the off-chain data; such authorities must also accompany the revert for verification tasks to be performed by the client. It follows that each storage handler StorageHandledBy*() must define the precise construction of their chosen config in their documentation. One generic construction of config which supports L2, databases, IPFS, Arweave, IPNS, ArNS and Swarm[?] is given below.

// Config Type
type config = [
        bytes[] | string[],
        bytes[] | string[] | address[],
        bytes[] | string[],
        ...
    ];
// Data inside config
config onChainConfig = [
        coordinates | [], // List of coordinates (must exist on chain), e.g. ChainId for L2, URL or identifier for off-chain storage, public key for off-chain namespaced & decentralised storages etc
        authorities | [], // List of addresses of authorities (must exist on chain for L2; optional otherwise), e.g. contract address for L2, custom on-chain signer for other storages (if they exist on chain) etc
        accessories | [], // List of extra information that the client must evaluate; typically empty except for decentralised namespaced storages, e.g. user's IPNS or ArNS public key
        ...
    ];
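A client might model this generic config as follows. This is a minimal Python sketch under stated assumptions: the `Config` class and its methods are hypothetical, only the first three lists are modelled, and the lists are assumed to be index-aligned (entry i of authorities and accessories belongs to entry i of coordinates), which the examples in this draft suggest but the text does not mandate.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Config:
    """Hypothetical client-side view of the generic config."""
    coordinates: List[str] = field(default_factory=list)
    authorities: List[str] = field(default_factory=list)
    accessories: List[str] = field(default_factory=list)

    @classmethod
    def from_lists(cls, raw: List[List[str]]) -> "Config":
        """Build a Config from its positional form; missing trailing lists become empty."""
        parts = [list(part) for part in raw[:3]]
        parts += [[] for _ in range(3 - len(parts))]
        return cls(*parts)

    def entry(self, index: int) -> Dict[str, str]:
        """Return coordinate `index` with its authority and accessory ("" if absent)."""
        def pick(items: List[str]) -> str:
            return items[index] if index < len(items) else ""
        return {
            "coordinate": self.coordinates[index],
            "authority": pick(self.authorities),
            "accessory": pick(self.accessories),
        }


cfg = Config.from_lists([
    ["https://api.service.net", "wss://service.write.com"],
    ["0xc0cac0254729296a45a3885639AC7E10F9d54979", ""],
])
print(cfg.entry(1))
```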

L2 Handler

A minimal L2 handler only requires the list of ChainId values and the corresponding contract addresses, and StorageHandledByL2() as defined in EIP-5559 is sufficient. In the context of this proposal, ChainId and contract must be part of the config. There may however arise a situation where a service first stores some data on L2 and then writes - asynchronously or otherwise - to another off-chain storage type; in such cases, config may additionally contain the necessary metadata to write to off-chain storage.

EXAMPLE

config onChainConfig = [
        [
            "11",
            "23",
            ...
        ], // ChainId values identifying the chains
        [
            "0xc0ffee254729296a45a3885639AC7E10F9d54979",
            "0x75b6B7CEE3719850d344f65b24Db4B7433Ca6ee4",
            ...
        ], // Contract addresses on chains
        ...
    ];

The deferral in this case will prompt the client to submit the transaction to the relevant L2 as prescribed by the incoming config.
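As a rough illustration, a client could pair each ChainId with its contract address before submitting anything. The sketch below is in Python; `l2_targets` is a hypothetical helper, and it assumes that ChainId values are decimal strings and that the two lists are index-aligned.

```python
from typing import List, Tuple


def l2_targets(config: List[List[str]]) -> List[Tuple[int, str]]:
    """Pair every ChainId in config[0] with the contract address in config[1]."""
    chain_ids, contracts = config[0], config[1]
    if len(chain_ids) != len(contracts):
        raise ValueError("each ChainId needs exactly one contract address")
    return [(int(chain_id), address) for chain_id, address in zip(chain_ids, contracts)]


targets = l2_targets([
    ["11", "23"],
    [
        "0xc0ffee254729296a45a3885639AC7E10F9d54979",
        "0x75b6B7CEE3719850d344f65b24Db4B7433Ca6ee4",
    ],
])
for chain_id, contract in targets:
    print(f"submit transaction to {contract} on chain {chain_id}")
```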

Database Handler

A minimal database handler is similar to an L2 in the sense that:

a) it requires the coordinates in the form of gateway URLs responsible for handling off-chain write operations (similar to ChainId), and

b) it should require eth_sign output to secure the data and the client must prompt the users for these signatures (similar to eth_call).

In this case, the config consists of the bespoke URLs as coordinates, and the addresses of signers (of eth_sign) take the place of authorities. The client must make sure that the signatures forwarded to the gateways match the addresses in authorities. If some gateways don’t implement signatures, then clients could choose not to support those service providers, since reading off-chain data without cryptographic verification is an unsafe practice; there may however be exceptions to this, which is why authorities are allowed to be empty in this proposal.

EXAMPLE

config onChainConfig = [
        [
            "https://api.service.net",
            "wss://service.write.com",
            ...
        ], // URLs or other identifiers
        [
            "0xc0cac0254729296a45a3885639AC7E10F9d54979",
            "",
            "0xcafec0laE3719850d344f65b24Db4B7433Ca6ee4",
            ...
        ], // Custom signers (if they exist on chain)
        ...
    ];

In the above example, the client must prompt the user for a signature corresponding to each non-empty value in authorities, verify that the signature matches the value in authorities and pass the resulting signature to the respective gateway URL.
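The flow just described can be sketched in Python as below. This is illustrative only: `collect_signatures` is a hypothetical helper, and `sign` and `recover` are injected callables standing in for the wallet's eth_sign prompt and for signer recovery, neither of which is implemented here. The stubs in the usage part are fakes that exist only to make the sketch runnable.

```python
from typing import Callable, Dict, List


def collect_signatures(
    coordinates: List[str],
    authorities: List[str],
    payload: bytes,
    sign: Callable[[str, bytes], str],
    recover: Callable[[bytes, str], str],
) -> Dict[str, str]:
    """Return {gateway URL: signature} for every gateway with a non-empty authority."""
    signatures: Dict[str, str] = {}
    for index, url in enumerate(coordinates):
        expected = authorities[index] if index < len(authorities) else ""
        if not expected:
            continue  # no on-chain signer listed for this gateway
        signature = sign(expected, payload)  # prompt the user
        if recover(payload, signature).lower() != expected.lower():
            raise ValueError(f"signature does not match authority for {url}")
        signatures[url] = signature
    return signatures


# Fake wallet and recovery routines, for demonstration only.
def fake_sign(signer: str, payload: bytes) -> str:
    return f"sig:{signer}:{payload.hex()}"


def fake_recover(payload: bytes, signature: str) -> str:
    return signature.split(":")[1]


signatures = collect_signatures(
    ["https://api.service.net", "wss://service.write.com"],
    ["0xc0cac0254729296a45a3885639AC7E10F9d54979", ""],
    b"\x01\x02",
    fake_sign,
    fake_recover,
)
print(signatures)
```

Gateways with an empty authority are skipped rather than rejected here; whether a client should refuse such gateways altogether is a policy choice this proposal leaves open.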

Decentralised Storage Handler

Decentralised storages are the most extreme case in the sense that they come in both immutable and mutable forms; the immutable forms locate the data through immutable content identifiers (CIDs) while mutable forms utilise some sort of namespace which can statically reference any dynamic content. Examples of the former include raw content hosted on IPFS and Arweave while the latter forms use IPNS and ArNS namespaces respectively to reference the raw and dynamic content.

The case of immutable forms is similar to a database, although these forms have not been as useful in practice so far. This is due to the difficulty associated with posting the unique CID on chain each time a storage update is made. One way to bypass this difficulty is by storing the CID cheaply in an L2 contract; this method requires the client to update the data on both the decentralised storage as well as the L2 contract through two independent deferrals. CCIP-Read in this case is also expected to read from two storages to be able to fully handle a read call. Contrary to this tedious flow, namespaces can be used to statically fetch immutable CIDs. For example, instead of a direct reference to immutable CIDs, IPNS and ArNS public keys can be used to refer to IPFS and Arweave content respectively; this method doesn’t require dual deferrals by CCIP-Write (or CCIP-Read), and the IPNS or ArNS public key needs to be stored on chain only once. However, accessing the IPNS and ArNS content now requires that the client prompt the user for additional information via accessories, e.g. IPNS and ArNS signatures, in order to update the data.

Decentralised storage handlers are therefore bestowed with the ability to revert with additional accessories which the clients must interpret and evaluate before calling the gateway with the results. This feature is not supported by EIP-5559 and services using EIP-5559 are thus incapable of storing data on decentralised namespaced & mutable storages.

EXAMPLE

config onChainConfig = [
        [
            "https://ipns.namesys.xyz", // Gateway 0
            "wss://api.ipns.public.io", // Gateway 1
            "https://api.arns.ens.com", // Gateway 2
            ...
        ], // coordinates; URLs or other identifiers
        [
            "0xc0cac0254729296a45a3885639AC7E10F9d54979", // Signer 0
            "", // Signer 1
            "0xcafec0laE3719850d344f65b24Db4B7433Ca6ee4", // Signer 2
            ...
        ], // authorities; custom signers (if they exist on chain)
        [
            "0xe50101720024080112203fd7e338b2de90159832ffcc434927da8bbfc3a000fa58ea0548aa8e08f7e10a", // Multicodec-encoded IPNS public key; Pubkey 0
            "0x55fb762f2744b86e98bb05d7816e2eafa26054642725b709f6430f9102bb0b27", // Multicodec-encoded shortened IPNS public key; Pubkey 1
            "0x89d50a253a427f5060d1c2c6b512e308e822cbac37d8e82bc32e597c853856d4f60", // Hex-encoded ArNS public key; Pubkey 2
            ...
        ], // accessories; public keys of IPNS/ArNS or other namespaces (if they exist on chain)
        ...
    ];

In this example, the client must process each non-empty item in the accessories list according to the specifications of the native StorageHandledBy*() identifier. For instance, in the particular example shown above, the client must request from the user at least:

a sequence counter and an IPNS signature for accessories[0],
a sequence counter and an IPNS signature for accessories[1], and
an ArNS signature for accessories[2].

These procedures must be described in the documentation of StorageHandledBy*() and they must explicitly detail that clients should evaluate the accessories by feeding the sequence counters to the message payloads and then obtaining the resulting IPNS signatures for accessories[0] and accessories[1]. These signatures must then be passed to the gateway among other arguments. The documentation must also define the precise formatting of message payloads and any custom cryptographic techniques implemented for additional security, accessibility or privacy.
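The per-accessory requirements listed above could be planned on the client roughly as in the Python sketch below. Everything named here is hypothetical: `REQUIRED`, `required_prompts` and, in particular, the `kinds` argument, since this draft does not specify how a client learns which namespace an accessory belongs to. Payload formats and the signing itself are deliberately left out because they belong to each handler's documentation.

```python
from typing import Dict, List, Tuple

# What the client must request from the user, per namespace type (as listed above).
REQUIRED: Dict[str, Tuple[str, ...]] = {
    "ipns": ("sequence counter", "IPNS signature"),
    "arns": ("ArNS signature",),
}


def required_prompts(
    accessories: List[str], kinds: List[str]
) -> List[Tuple[int, Tuple[str, ...]]]:
    """List (index, items to request) for every non-empty accessory."""
    prompts: List[Tuple[int, Tuple[str, ...]]] = []
    for index, (public_key, kind) in enumerate(zip(accessories, kinds)):
        if not public_key:
            continue  # nothing to evaluate for this gateway
        if kind not in REQUIRED:
            raise ValueError(f"unknown namespace type: {kind}")
        prompts.append((index, REQUIRED[kind]))
    return prompts


# Placeholder strings stand in for the public keys of the example above.
prompts = required_prompts(
    ["<IPNS pubkey 0>", "<IPNS pubkey 1>", "<ArNS pubkey 2>"],
    ["ipns", "ipns", "arns"],
)
for index, items in prompts:
    print(f"accessories[{index}]: ask user for {', '.join(items)}")
```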

Events

A public library must be maintained where each new storage handler must register their StorageHandledBy*() identifier. This library must exist in the public domain and it should be the sole accepted standard for CCIP-Write infrastructure providers, similar to the multiformats & multicodec tables.
Each StorageHandledBy*() provider must be supported with detailed documentation of their infrastructure along with a Protocol Improvement Proposal.

estmcmxci · February 6, 2024, 7:25pm

Since this proposal intends to update EIP-5559 in order to enhance compatibility with decentralized and mutable storage solutions such as IPFS, Arweave, Swarm, wouldn’t this be more relevant to the broader Ethereum community and its core developers, rather than being specifically targeted at ENS? Or, are you looking for ENS to adopt this proposal and then propose it to Ethereum core devs?

NameSys · February 7, 2024, 1:19am

This draft has nothing to do with ENS, and it doesn’t mention ENS anywhere in the protocol’s description. Having said that, this is a very early stage proposal and still undergoing pre-pre-print discussions. Since Nick has (co-)authored both EIP-3668 and EIP-5559, it makes sense to pass the proposal through him and take his invaluable input. Once it is in a mature state, it can be posted to Ethereum Magicians for a wider round of comments from random/core Ethereum devs.

estmcmxci · February 7, 2024, 1:28am

That makes sense, thank you for clarifying.