[Draft-3] EIP-X: Off-Chain Data Write Protocol

Hello ENS & ETH Developers :wave:

I attach the third draft incorporating offline comments from Nick, Makoto and others on Draft-2. In particular, this draft incorporates ENSIP-16-like metadata() interface in the EIP and includes an example of a minimal off-chain resolver capable of making CCIP-Write calls.

GitHub: Draft-2


Off-Chain Data Write Protocol

Update to Cross-Chain Write Deferral Protocol (EIP-5559) incorporating secure write deferrals to centralised databases and decentralised & mutable storages

Abstract

The following proposal is an update to EIP-5559: Off-Chain Write Deferral Protocol, targeting a wider set of storage types and introducing security measures to consider for secure off-chain write deferral and retrieval. While EIP-5559 is limited to deferring write operations to L2 EVM chains and centralised databases, methods in this document enable secure write deferral to generic decentralised storages - mutable or immutable - such as IPFS, Arweave, Swarm etc. This draft alongside EIP-3668 and EIP-5559 is a significant step toward a complete and secure infrastructure for off-chain data retrieval and write deferral.

Motivation

EIP-3668, or ‘CCIP-Read’ for short, has been key to retrieving off-chain data for a variety of contracts on the Ethereum blockchain, ranging from price feeds for DeFi contracts to, more recently, records for ENS users. The latter case is more interesting since it deliberately uses off-chain storage to bypass the usually high gas fees associated with on-chain storage; this approach has many use cases well beyond ENS records and the potential for significant impact on the universal affordability and accessibility of Ethereum.

Off-chain data retrieval through EIP-3668 is a relatively simple task since it assumes that all relevant data originating from off-chain storages is translated by CCIP-Read-compliant HTTP gateways; this includes L2 chains, centralised databases and decentralised storages. On the flip side, however, each service leveraging CCIP-Read has so far had to handle two main tasks externally:

  • writing this data securely to these storage types on their own, and

  • incorporating reasonable security measures in their CCIP-Read compatible contracts for verifying this data before performing on-chain read or write operations.

Writing to a variety of centralised and decentralised storages is a broader objective compared to CCIP-Read largely due to two reasons:

  1. Each storage provider typically has its own architecture that the write operation must comply with, e.g. they may require additional credentials and configuration to be able to write data to them, and

  2. Each storage must incorporate some form of security measures during write operations so that off-chain data’s integrity can be verified by CCIP-Read contracts during data retrieval stage.

EIP-5559 was the first step toward such a tolerant ‘CCIP-Write’ protocol, which outlined how write deferrals could be made to L2 chains and centralised databases. The cases of L2 and database are similar; deferral to an L2 involves routing the eth_call to L2, while deferral to a database can be made by extracting eth_sign from eth_call and posting the resulting signature along with the data for later verification. In both cases, no pre-flight information needs to be processed by the client, and the arguments of eth_call and eth_sign as specified in EIP-5559 are sufficient. This proposal extends the previous attempt by including secure write deferrals to decentralised storages, especially those which - beyond the arguments of eth_call and eth_sign - require additional pre-flight metadata from clients to successfully host users’ data on their favourite storage. This document also enables more complex and generic use-cases of databases, such as those which do not store the signers’ addresses on chain as presumed in EIP-5559.

Curious Case of Decentralised Storages

Decentralised storages powered by cryptographic protocols are unique in their diversity of architectures compared to centralised databases or L2 chains, both of which have canonical architectures in place. For instance, write calls to L2 chains can be generalised through the use of ChainId for any given callData; write deferral in this case is as simple as routing the eth_call to another contract on an L2 chain. There is no need to incorporate any additional security requirement(s) since the L2 chain ensures data integrity locally, while the global integrity can be proven by employing a state verifier scheme (e.g. EVM-Gateway) during CCIP-Read calls. Centralised databases have a very similar architecture where instead of invoking eth_call, the result of eth_sign needs to be posted on the database along with the callData for integrity verification by CCIP-Read.

Decentralised storages on the other hand, do not typically have EVM- or database-like environments and may have their own unique content addressing requirements. For example, IPFS, Arweave, Swarm etc all have unique content identification schemes as well as their own specific fine-tunings and/or choices of cryptographic primitives, besides supporting their own cryptographically secured namespaces. This significant and diverse deviation from EVM-like architecture results in an equally diverse set of requirements during both the write deferral operation as well as the subsequent state verifying stage.

For example, consider a scenario where the choice of storage is IPNS or ArNS. In precise terms, IPNS storage refers to immutable IPFS content wrapped in a mutable IPNS namespace, which eventually serves as the reference coordinate for off-chain data. The case of ArNS is similar; ArNS is immutable Arweave content wrapped in a mutable ArNS namespace. To write to IPNS or ArNS storage, the client requires more information than only the gateway URL responsible for write operations and the arguments of eth_sign. More precisely, the client must at least prompt the user for their IPNS or ArNS signature, which is necessary for updating the namespaced storage. The client may also require additional information from the user, such as specific arguments required by the IPNS or ArNS signature. One such example is the integer sequence number of an IPNS update, which goes into the construction of the signature message payload. These additional user-centric requirements are not accommodated by EIP-5559, and the resolution of these issues - among others such as batch writing - is detailed in the following attempt towards a suitable CCIP-Write specification.

Specification

Overview

The following specification revolves around the structure and description of an arbitrary off-chain storage handler tasked with the responsibility of writing to an arbitrary storage. First introduced in EIP-5559, the protocol outlined herein expands the capabilities of the StorageHandledBy<>() revert to accept decentralised and namespaced storages. In addition, this draft proposes that StorageHandledByL2() and StorageHandledByOffChainDatabase() introduced in EIP-5559 be updated, and new StorageHandledBy<>() reverts be allowed through a publicly curated list where each new StorageHandledBy<>() storage handler must be accompanied by a complete documentation of its interface and design. Some foreseen examples of new storage handlers include StorageHandledByIPFS() for IPFS, StorageHandledByIPNS() for IPNS, StorageHandledByArweave() for Arweave, StorageHandledByArNS() for ArNS, StorageHandledBySwarm() for Swarm etc.

Similar to EIP-5559, a CCIP-Write deferral call to an arbitrary function setValue(bytes32 key, bytes32 value) can be described in pseudo-code as follows:

// Define revert event
error StorageHandledByBOB(address sender, bytes callData, bytes4 funcSelector);

// Define metadata API interface
function metadata(
    bytes calldata key // Reference a key
)
    external
    view
    returns (metadata)
{
    // Return on-chain metadata for a node OR, read metadata from off-chain source via
    // CCIP-Read aka 'OffchainLookup()'. If partial metadata exists off-chain, return 
    // may include URL for that data's off-chain API (e.g. ENS off-chain resolvers rely
    // on GraphQL endpoints to fetch complete off-chain state for a node per ENSIP-16)
    return metadata | revert OffchainLookup(key);
}

// Generic function in a contract
function setValue(
    bytes32 key,
    bytes32 value
) external {
    // Defer write call to off-chain handler
    revert StorageHandledByBOB(
        msg.sender, 
        abi.encode(key, value), 
        contract.metadata.selector
    );
}

where the following structure for StorageHandledByBOB() must be followed:

// Details of revert event
error StorageHandledByBOB(
    address sender, // Sender of call (msg.sender)
    bytes callData, // Payload to store
    bytes4 funcSelector // Function selector of metadata() required by off-chain clients (contract.metadata.selector)
);
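
On the client side, the revert above can be read as a routing instruction. The following is a minimal Python sketch of that routing, assuming the revert has already been ABI-decoded by whatever library the client uses; the `StorageRevert` class, `defer_write()` and the two handler stubs are hypothetical names for illustration and are not part of this draft.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class StorageRevert:
    """An already-decoded StorageHandledBy<>() revert (decoding is out of scope)."""
    handler: str              # suffix of the revert name, e.g. "L2", "Database", "IPNS"
    sender: str               # msg.sender of the original call
    call_data: bytes          # payload to store
    metadata_selector: bytes  # 4-byte selector of metadata()


def defer_write(revert: StorageRevert,
                handlers: Dict[str, Callable[[StorageRevert], str]]) -> str:
    """Route a decoded revert to the client routine for its storage type."""
    if len(revert.metadata_selector) != 4:
        raise ValueError("metadata selector must be exactly 4 bytes")
    try:
        handler = handlers[revert.handler]
    except KeyError:
        raise ValueError(f"unsupported storage handler: {revert.handler}") from None
    return handler(revert)


# Hypothetical handler stubs that only describe what a real client would do
HANDLERS = {
    "L2": lambda r: f"submit {len(r.call_data)} bytes as a transaction on L2",
    "Database": lambda r: f"sign and POST {len(r.call_data)} bytes to the gateway",
}
```

A client that does not recognise the handler name fails loudly instead of guessing, which matches the requirement that every new StorageHandledBy<>() handler be documented before clients support it.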

Metadata

The metadata() function captures all the relevant information that the client may require to update a user’s data on their favourite storage. For instance, metadata() must return a pointer to a user’s data on their desired storage. In the case of StorageHandledByL2() for example, metadata() must return a chain identifier such as ChainId and additionally the contract address. In case of StorageHandledByOffChainDatabase(), metadata() must return the custom gateway URL serving a user’s data. In case of StorageHandledByIPNS(), metadata() may return the public key of a user’s IPNS container; the case of ArNS is similar. In addition, metadata() may further return security-driven information such as a delegated signer’s address who is tasked with signing the off-chain data; such signers and their approvals must also be returned for verification tasks to be performed by the client. It follows that each storage handler StorageHandledBy<>() must define the precise construction of metadata() function in their documentation. Note that the metadata() function doesn’t necessarily read any or all of the aforementioned metadata from the contract; it is possible that this metadata is in fact stored off-chain, in which case metadata() function may instead revert with OffchainLookup() that the client must process.

// Generic metadata function's construction
function metadata(
    bytes calldata node
)
    external
    view
    returns (metadata, string)
{
    (metadata onchainMetadata, string metaEndpoint) = getMetadata(node);
    return (
        metadata onchainMetadata, // Relevant on-chain metadata (optional)
        string metaEndpoint // Endpoint URL for metadata API (optional)
    ) | revert OffchainLookup(node); // If entire metadata exists off-chain
}
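
From the client's perspective, the two outcomes of metadata() (a direct return, or an OffchainLookup() revert) could be handled roughly as follows. This is a sketch under stated assumptions: `call_metadata` and `fetch_offchain` are hypothetical callables standing in for the contract call and the HTTP gateway request, the `OffchainLookup` exception is a simplified stand-in for the EIP-3668 revert, and the EIP-3668 callback step that verifies the gateway response on chain is omitted for brevity.

```python
from typing import Callable, List, Optional, Tuple


class OffchainLookup(Exception):
    """Simplified stand-in for the EIP-3668 OffchainLookup revert."""

    def __init__(self, urls: List[str], call_data: bytes) -> None:
        super().__init__("OffchainLookup")
        self.urls = urls
        self.call_data = call_data


# (on-chain metadata, metaEndpoint); either part may be absent
Metadata = Tuple[Optional[dict], Optional[str]]


def resolve_metadata(call_metadata: Callable[[bytes], Metadata],
                     fetch_offchain: Callable[[str, bytes], dict],
                     node: bytes) -> Metadata:
    """Return metadata for a node, following an OffchainLookup if one is raised."""
    try:
        return call_metadata(node)
    except OffchainLookup as lookup:
        # Try each gateway in order; the first one that answers wins
        for url in lookup.urls:
            try:
                return fetch_offchain(url, lookup.call_data), None
            except OSError:
                continue
        raise RuntimeError("no gateway answered the metadata lookup")
```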

Some example constructions of metadata() functions which support L2, databases, IPFS, Arweave, IPNS, ArNS and Swarm[?] are given below.

L2 Handler: StorageHandledByL2()

A minimal L2 handler only requires the list of ChainId values and the corresponding contract addresses, and StorageHandledByL2() as defined in EIP-5559 is sufficient. In the context of this proposal, ChainId and contract must be returned by the metadata() function. The deferral in this case will prompt the client to submit the transaction to the relevant L2 as returned by the metadata() function. One example of an L2 handler’s metadata() function is given below.

EXAMPLE

error StorageHandledByL2(..., contract.metadata.selector);

function metadata(bytes calldata node)
    external
    view
    returns (address, string memory, string memory)
{   
    (address contractL2, string chainId) = getMetadata(node);
    // contractL2 = "0x32f94e75cde5fa48b6469323742e6004d701409b"
    // chainId = "21"
    return (
        contractL2, // Contract address on L2
        chainId, // L2 ChainID
        metaEndpoint // Metadata API endpoint (optional)
    );
}
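
A client receiving this metadata would then assemble a transaction request for the L2. The Python sketch below shows one way to do that; `build_l2_request()` and the shape of the returned dictionary are assumptions made for illustration, and actual submission through a wallet or RPC provider is left out.

```python
from typing import TypedDict


class TxRequest(TypedDict):
    chainId: int
    to: str
    data: str


def build_l2_request(contract_l2: str, chain_id: str, call_data: bytes) -> TxRequest:
    """Turn metadata() output plus the deferred callData into an L2 transaction request."""
    if not (contract_l2.startswith("0x") and len(contract_l2) == 42):
        raise ValueError("contract address must be 20 bytes, 0x-prefixed")
    int(contract_l2, 16)  # raises ValueError if the address is not valid hex
    return {
        "chainId": int(chain_id),          # metadata() returns the chain ID as a string
        "to": contract_l2,                 # contract on L2 that receives the write
        "data": "0x" + call_data.hex(),    # payload carried by the revert
    }
```

Since the target chain and contract are chosen by the deferring contract, a client would reasonably show both to the user before asking for a signature.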

There may however arise a situation where a service first stores some data on L2 and then writes - asynchronously or otherwise - to another off-chain storage type. In such cases, the L2 contract should implement a second off-chain write deferral after making desired local state changes. This in principle allows creation of chained storage handlers without explicitly introducing a callback function in this proposal.

Database Handler: StorageHandledByDatabase()

A minimal database handler is similar to an L2 in the sense that:

a) it requires the gatewayUrl responsible for handling off-chain write operations (similar to ChainId), and

b) it should require eth_sign output to secure the data and the client must prompt the users for these signatures (similar to eth_call).

In this case, metadata() must return the bespoke gatewayUrl and may additionally return the address of the dataSigner expected to produce the eth_sign signature. If a dataSigner is returned by the metadata, then the client must make sure that the signature forwarded to the gateway is signed by that dataSigner. One example of a database handler’s metadata() function is given below.

EXAMPLE

error StorageHandledByDatabase(..., contract.metadata.selector);

function metadata(bytes calldata node)
    external
    view
    returns (string memory, address, string memory)
{   
    (string gatewayUrl, address dataSigner) = getMetadata(node);
    // gatewayUrl = "https://api.namesys.xyz"
    // dataSigner = "0xc0ffee254729296a45a3885639AC7E10F9d54979"
    return (
        gatewayUrl, // Gateway URL
        dataSigner, // Ethereum signer's address
        metaEndpoint // Metadata API endpoint (optional)
    );
}

In the above example, the client must prompt the user for a signature, verify that the signature was produced by the matching dataSigner, and finally pass it to the respective gateway URL. The message payload for the signature in this case may be formatted as per EIP-712, as detailed in EIP-5559. Some storage handlers may however choose simple string formatting, provided it is properly documented. This proposal leaves this aspect of off-chain metadata construction to storage handlers and individual ecosystems.
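
The database flow (obtain a signature, check it against dataSigner, forward it to the gateway) can be sketched in Python as below. The `sign` and `recover` callables are hypothetical stand-ins for the wallet prompt and for signer recovery, and the JSON field names are assumptions; a real handler would fix the payload format in its documentation.

```python
import json
from typing import Callable, Optional


def prepare_gateway_post(call_data: bytes,
                         sign: Callable[[bytes], str],
                         recover: Callable[[bytes, str], str],
                         data_signer: Optional[str]) -> str:
    """Build the JSON body a client would POST to gatewayUrl."""
    signature = sign(call_data)  # in a real client this prompts the user's wallet
    if data_signer is not None:
        # Only enforce the signer check when metadata() returned a dataSigner
        recovered = recover(call_data, signature)
        if recovered.lower() != data_signer.lower():
            raise PermissionError("signature was not produced by the expected dataSigner")
    # Field names are illustrative, not defined by this draft
    return json.dumps({"data": "0x" + call_data.hex(), "signature": signature})
```

Comparing addresses case-insensitively avoids false mismatches between checksummed and lower-case hex encodings of the same address.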

Decentralised Storage Handlers

Decentralised storages are the most extreme case in the sense that they come in both immutable and mutable forms; the immutable forms locate data through immutable content identifiers (CIDs), while the mutable forms utilise some sort of namespace which can statically reference any dynamic content. Examples of the former include raw content hosted on IPFS and Arweave, while the latter use IPNS and ArNS namespaces respectively to reference the raw and dynamic content.

The case of immutable forms is similar to a database, although these forms have not been as useful in practice so far. This is due to the difficulty associated with posting the unique CID on chain each time a storage update is made. One way to bypass this difficulty is by storing the CID cheaply in an L2 contract; this method requires the client to update the data on both the decentralised storage and the L2 contract through two chained deferrals. CCIP-Read in this case is also expected to read from two storages to be able to fully handle a read call. Contrary to this tedious flow, namespaces can instead be used to statically fetch immutable CIDs. For example, instead of a direct reference to immutable CIDs, IPNS and ArNS public keys can be used to refer to IPFS and Arweave content respectively; this method doesn’t require dual deferrals by CCIP-Write (or CCIP-Read), and the IPNS or ArNS public key needs to be stored on chain only once. However, accessing the IPNS and ArNS content now requires that the client prompt the user for additional information via context, e.g. IPNS and ArNS signatures, in order to update the data.
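
The difference between the two forms can be illustrated with a small Python sketch. It is only an analogy: a SHA-256 hex digest stands in for a real CID, and the `Namespace` class and its key are hypothetical stand-ins for an IPNS or ArNS name.

```python
import hashlib


def content_id(content: bytes) -> str:
    """Stand-in for a CID: the identifier is derived from the content itself."""
    return hashlib.sha256(content).hexdigest()


class Namespace:
    """Stand-in for an IPNS/ArNS name: a fixed key pointing at the latest content."""

    def __init__(self, public_key: str) -> None:
        self.public_key = public_key  # stored on chain once
        self.target = ""              # updated off chain on every write

    def publish(self, content: bytes) -> str:
        self.target = content_id(content)
        return self.target


name = Namespace("k51-example-key")  # hypothetical key, not a real IPNS name
first = name.publish(b'{"addr": "0x01"}')
second = name.publish(b'{"addr": "0x02"}')
# first != second: every update yields a new identifier that would need an on-chain write,
# while name.public_key stays the same across both updates
```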

Decentralised storage handlers’ metadata() interface is therefore expected to return additional context which the clients must interpret and evaluate before calling the gateway with the results. This feature is not supported by EIP-5559 and services using EIP-5559 are thus incapable of storing data on decentralised namespaced & mutable storages. One example of a decentralised storage handler’s metadata() function for IPNS is given below.

EXAMPLE: StorageHandledByIPNS()

error StorageHandledByIPNS(..., contract.metadata.selector);

function metadata(bytes calldata node)
    external
    view
    returns (string memory, address, bytes memory, string memory)
{   
    (string gatewayUrl, address dataSigner, bytes ipnsSigner) = getMetadata(node);
    // gatewayUrl = "https://ipns.namesys.xyz"
    // dataSigner = "0xc0ffee254729296a45a3885639AC7E10F9d54979"
    // ipnsSigner = "0xe50101720024080112203fd7e338b2de90159832ffcc434927da8bbfc3a000fa58ea0548aa8e08f7e10a"
    return (
        gatewayUrl, // Gateway URL
        dataSigner, // Ethereum signer's address
        ipnsSigner, // Context for namespace (IPNS signer's hex-encoded CID)
        metaEndpoint // Metadata API endpoint (optional)
    );
}

In this example, the client must process the context according to the specifications of the native StorageHandledBy<>() identifier. For instance, in the particular example shown above, the client must request the user for at least a sequence counter and an IPNS signature matching the signer’s CID returned in context. The clients should evaluate the context by feeding the sequence counter to the message payload and then obtaining the resulting IPNS signature. This signature must then be passed to the gateway among other arguments.
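
As a rough illustration of how a client might evaluate this context, the Python sketch below bumps the sequence counter, asks for both signatures and bundles the result for the gateway. Everything here is an assumption made for illustration: the payload layout (8-byte big-endian sequence followed by callData), the field names and the `sign_ipns`/`sign_data` callables are hypothetical, and the layout is not the actual IPNS record format, which a StorageHandledByIPNS() specification would have to define.

```python
from typing import Callable, Dict


def build_ipns_update(call_data: bytes,
                      previous_sequence: int,
                      ipns_signer: str,
                      sign_ipns: Callable[[bytes], str],
                      sign_data: Callable[[bytes], str]) -> Dict[str, object]:
    """Bundle the arguments a gateway would need for one IPNS update."""
    if previous_sequence < -1:
        raise ValueError("sequence counter cannot be below -1 (no prior record)")
    sequence = previous_sequence + 1  # each update must use a higher sequence
    # Hypothetical payload layout; the sequence is fed into the signed message
    payload = sequence.to_bytes(8, "big") + call_data
    return {
        "ipns": ipns_signer,                    # context returned by metadata()
        "sequence": sequence,
        "ipnsSignature": sign_ipns(payload),    # signature by the IPNS key
        "dataSignature": sign_data(call_data),  # signature by the Ethereum dataSigner
        "data": "0x" + call_data.hex(),
    }
```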

New Revert Events

  1. Each new storage handler must submit their StorageHandledBy<>() identifier through an ERC track proposal referencing the current draft and EIP-5559.

  2. Each StorageHandledBy<>() provider must be supported with detailed documentation of its structure and the necessary metadata() that its implementers must return.

  3. Each StorageHandledBy<>() proposal must define the precise formatting of any message payloads that require signatures and complete descriptions of custom cryptographic techniques implemented for additional security, accessibility or privacy.

Implementation featuring ENS

ENS off-chain resolvers capable of reading from and writing to decentralised storages are perhaps the most complex use-case for CCIP-Read and CCIP-Write. One example of such a (minimal) resolver is given below:

interface iResolver {
    // Defined in EIP-X
    error StorageHandledByIPNS(
        address sender,
        bytes callData,
        bytes4 funcSelector // Function selector of metadata()
    );
    // Defined in EIP-137
    function setAddr(bytes32 node, address addr) external;
}

// Defined in EIP-X
string public gatewayUrl = "https://api.namesys.xyz";
string public metaEndpoint = "https://gql.namesys.xyz";

/** 
* Metadata interface required by off-chain clients as defined in EIP-X & ENSIP-16
* @param node Namehash of ENS domain to fetch metadata for
* @return metadata Metadata required by off-chain clients. Clients must refer to
* ENSIP-Y for directions to process the returned metadata
*/
function metadata(bytes calldata node)
    external
    view
    returns (string memory, address, bytes memory, string memory)
{   
    // Get ethereum signer & IPNS CID stored on-chain with arbitrary logic/code
    address dataSigner = metadata[node].dataSigner; // Unique to each name
    bytes ipnsSigner = metadata[node].ipnsSigner; // Unique to each name or each owner address
    return (
        gatewayUrl, // Gateway URL tasked with writing to IPNS
        dataSigner, // Ethereum signer's address
        ipnsSigner, // IPNS signer's hex-encoded CID as context for namespace
        metaEndpoint // GraphQL metadata endpoint (required by ENSIP-16)
    );
}

/**
* Sets the ethereum address associated with an ENS node
* [!] May only be called by the owner or manager of that node in ENS registry
* @param node Namehash of ENS domain to update
* @param addr Ethereum address to set
*/
function setAddr(
    bytes32 node,
    address addr
) external authorised(node) {
    // Defer to IPNS storage
    revert StorageHandledByIPNS(
        msg.sender,
        abi.encode(node, addr),
        iResolver.metadata.selector
    );
}

NOTES

Preliminary draft of ENSIP-Y quoted in the text is attached for reference.


I’m still reading your post but this made me think: should we have a developer/ABI convention for “this is a CCIP-Readpoint”?


Thanks for continuing to work on this!

This should be formulated as a PR to EIP-5559, rather than a standalone EIP.

EIP editors generally frown on standards being dependent on outside docs; this will likely need to be rephrased as allowing subsequent EIPs/ERCs to add new methods.

Can’t we just have the revert include whatever metadata is required? I don’t think we need a separate method for this.

Yes, we can. That’s easier.

Quick recap, as there are a lot of words here:

  • EIP-3668 CCIP Read
    • OffchainLookup()
  • EIP-5559 Cross Chain Write
    • StorageHandledByL2()
    • StorageHandledByOffChainDatabase()
  • NameSys: Draft 3 (this thread) / Draft 2 / Draft 1
    • StorageHandledByL2()
    • StorageHandledByDatabase()
    • StorageHandledByIPNS()
  • JustaName: Draft
  • EIP-191 Signed Data
  • EIP-712 Typed Signed Data

My understanding:

  1. Contract A: send this signed to here, then tell me
    A.setX() → revert → sign → POST → A.callbackX(response)

  2. Contract A: hey client, call this on chain C for me, (then tell me?)
    A.setX() → revert → chain(C).sendTx(to: B, request) → A.callbackX()?

For case (1): I could see the need for a challenge, but that’s just doing it multiple times (if you need 2 signatures) or using OffchainLookup first (eg. to acquire a nonce):

Example: setX() → revert OffchainLookup → POST → callback1(response) → revert OffchainWrite → sign → POST → callback2(response)

OffchainWrite() (or whatever) is EIP-5559’s POST {sender, data, signature}, but I think the arguments are exactly the same as OffchainLookup(); just the client behavior is different: request a signature (+ cancel behavior) + include the signature in the fetch(). Technically, you could just reuse OffchainLookup() with a wrapper on the calldata that indicates “sign me”.

If we’re using EIP-712, we just need a standard for how to make the calldata, for example: setX(a,b,c) should expose typed (a, b, c) to the signer + any intermediates.

For case (2): this should be a separate proposal. There are security issues if the transaction invokes a prior-approved asset transfer. Probably the function that gets called on L2 should be a fixed receiver like offchainWrite() so stuff like transfer() and setApproval() cannot be called at top-level.
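
A client-side guard along these lines might look like the Python sketch below. The four denied selectors are the well-known ERC-20/ERC-721 ones; `check_deferred_call()` and the idea of passing in the single allowed selector are assumptions for illustration, since no fixed receiver selector is defined in this thread.

```python
# Well-known selectors that should never be reachable through a write deferral
DENIED_SELECTORS = {
    bytes.fromhex("a9059cbb"): "transfer(address,uint256)",
    bytes.fromhex("23b872dd"): "transferFrom(address,address,uint256)",
    bytes.fromhex("095ea7b3"): "approve(address,uint256)",
    bytes.fromhex("a22cb465"): "setApprovalForAll(address,bool)",
}


def check_deferred_call(call_data: bytes, allowed_selector: bytes) -> None:
    """Raise unless call_data targets the single fixed receiver function."""
    selector = call_data[:4]
    if len(selector) < 4:
        raise ValueError("calldata is too short to contain a function selector")
    if selector in DENIED_SELECTORS:
        # Redundant with the allowlist check below, but gives a clearer error
        raise PermissionError(f"refusing to call {DENIED_SELECTORS[selector]}")
    if selector != allowed_selector:
        raise PermissionError("calldata does not target the fixed write receiver")
```

The allowlist check does the real work; the denylist exists only so that the most dangerous cases produce an explicit message for the user.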

I don’t understand how specific storage mechanisms that are invoked using case (1) via a fetchable() gateway are involved. There must be something I’m missing?


Correct. Draft-1 was in fact completely inspired by CCIP-Read. CCIP-Read can be used to do CCIP-Write if the clients agree to accept that anomalistic/hacky use and bear with calldata getting fairly mutilated. Other than that, technically, you don’t need another protocol.

Draft-2 and Draft-3 however lack callback() and your two examples are something like:

  1. Contract A: send this signed to here, then tell me
    A.setX() → revert → sign → POST (no A.callbackX(response))
  2. Contract A: hey client, call this on chain C for me, (then tell me?)
    A.setX() → revert → chain(C).sendTx(to: B, request) (no A.callbackX())

Callbacks were purged to maintain compatibility with 5559 and with the consideration that they are not that helpful to warrant such complexity in the protocol. Your thoughts on this? Similar potential use cases have come to our minds as well…

Agreed. @alextnetto.eth is part of this discussion and we think Blockful will come up with airtight standardisation of L2 part. NameSys is handling the decentralised storages part.

All of that for case (1) is here in a draft ENSIP-Y where the storage mechanism for StorageHandledByIPFS() is described in detail. It is currently drafted with ENS in mind but discussions with Nick suggest that the contents of ENSIP-Y can be generalised and included herein. In the next iteration, I will attempt that and hopefully everything will be self-contained in a single document. In the meantime, if you don’t mind, refer to ENSIP-Y for gory details :sweat_smile:

P.S. The next iteration will also remove the metadata interface per Nick’s suggestion and pass the metadata directly in the revert. This will take away the ability to CCIP-Read before CCIP-Writing. There are ways to compensate for this though, e.g. with metaEndpoint in metadata replacing CCIP-Read.

I didn’t understand. Mind elaborating? bytes4 OffchainLookup() is the convention, no?


In EIP-3668, the calldata being used to call the gateway is generated by the first contract view call and returned in the revert as a parameter.

I suggest we follow the same approach on EIP-5559 since it gives more flexibility to different implementations and lets the L1 do any modification necessary to L2 writing.

That means changing StorageHandledByL2(chainId, contractAddress) to StorageHandledByL2(chainId, contractAddress, callData).

Would be great hearing your thoughts! @nick.eth @NameSys @raffy

I just meant that when reading an implementation, it might be useful for the developer/reader, to know that “this function should be called with CCIP-Read/Write enabled” and “this function normally CCIP-Read but I don’t revert OffchainLookup()”. Purely an annotation/comment thing, not specific to this ENSIP, just CCIP+Solidity in general.


I was assuming the callback would be a view operation.

I think null callback should indicate no confirmation is required. For example, ethers will still call the null selector.

Here is the example I was thinking of:

  • Sequence: setX(a) → revert OffchainLookup({a}) → POST → nonce: 1234 → callback1() → revert OffchainWrite({a, 1234}) → sign/sendTx → POST → ref: 0x123456 → callback2() → revert OffchainLookup({0x123456}) → POST → callback3()
  • Flow:
    • you call setX(a) with CCIP Read/Write enabled
    • throws for CCIP-Read to get a challenge for the write → like HTTP OPTIONS
    • on callback, throws for CCIP-Write with challenge + a → like HTTP POST
    • on callback, throws for CCIP-Read to provide the confirmation
    • on callback, does nothing (null, could be elided)
  • Scenario:
    • you call setX() from app A → “write some data to protocol B on chain C” which is actually a different app
    • you do that and provide receipt to A that you did that

Ideally, app A would be monitoring events for chain C, but this would explicitly transmit the receipt to A.

Whereas, setX() to an offchain server might simply be: setX() → revert OffchainWrite → sign → POST → stop (callback selector is null)

In general, you can do a lot with an initial OffchainLookup prior to the write.
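
The chained flow described above can be simulated with a short client loop in Python. This is only a sketch of the idea under discussion, not of any draft: the `Revert` class, the `run()` loop, and the `step`/`post`/`sign` callables are hypothetical, `None` plays the role of the null callback selector, and the hop limit is an arbitrary safeguard against endless chains.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Revert:
    kind: str                # "OffchainLookup" or "OffchainWrite"
    payload: dict            # data the client must forward to the gateway
    callback: Optional[str]  # None stands for the null callback selector


def run(step: Callable[[str, dict], Optional[Revert]],
        entry: str,
        args: dict,
        post: Callable[[dict], dict],
        sign: Callable[[dict], str],
        max_hops: int = 4) -> List[str]:
    """Follow reverts and callbacks until the contract stops reverting."""
    trace: List[str] = []
    fn, data = entry, args
    for _ in range(max_hops):
        revert = step(fn, data)  # simulate calling the contract function
        if revert is None:
            return trace         # no revert: nothing left to do
        body = dict(revert.payload)
        if revert.kind == "OffchainWrite":
            body["signature"] = sign(revert.payload)  # writes need a user signature
        trace.append(revert.kind)
        response = post(body)
        if revert.callback is None:
            return trace         # null callback: no confirmation required
        fn, data = revert.callback, response
    raise RuntimeError("too many chained reverts")
```

With a contract that first reverts with a lookup to obtain a nonce and then with a write carrying that nonce and a null callback, the loop makes exactly two gateway requests and stops.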


This has always been the case in this proposal:

revert StorageHandledByL2(msg.sender, callData, [chainId, contract])

Only disagreements so far had been about how to pass the [chainId, contract]

Yes, that would be helpful. I will make space for some standard natSpec.

This is very similar to the 3-level nested CCIP-Write example in Draft-1 that I had initially proposed. In your example, you do some pre-flight stuff with CCIP-Read, then do CCIP-Write and then do some post-processing with CCIP-Read (this can instead be another CCIP-Write!). The argument by Nick in that case was that

  • any arbitrary pre-flight stuff can all be done by the storage provider and it can pass the resulting pre-flight info to the clients via the metaEndpoint provided in metadata,
  • any post-processing can be done by the gateway handling the write operations internally.

This makes the protocol arguably less complex. I think these arguments still hold but I hope @nick.eth can cross-check what I just said. I think a very explicit and realistic use case might help, although I have failed to come up with one so far that cannot be achieved by taking the above two steps.

This is really dangerous, though - it means that the contract can prompt the user to make any transaction to any contract on any chain under the pretence of updating the existing contract’s storage!

Right; what NameSys is saying is that we removed the callback functionality in order to reduce the complexity of the standard (bearing in mind that callbacks could trigger more CCIP-reads and OCWDP writes). With a compelling use-case for reintroducing them, we could do that.