IPIP-0328: JSON and CBOR Response Formats on HTTP Gateways

Related Issues
ipfs/in-web-browsers/issues/182
ipfs/specs/pull/328
ipfs/kubo/issues/8823
ipfs/kubo/pull/9335
ipfs/kubo/issues/7552

1. Summary

Add support for the DAG-JSON, DAG-CBOR, JSON and CBOR response formats in the [path-gateway].

2. Motivation

Currently, the gateway supports requesting data in the DAG-PB, RAW, CAR and TAR formats. It also allows traversing links encoded with CBOR Tag 42, as long as they are intermediate links and not the final document. This works for both DAG-CBOR and its JSON representation, DAG-JSON. However, it should also be possible to download deserialized versions of the final JSON/CBOR document in raw format (not wrapped in UnixFS).

The main functional gap in the IPFS ecosystem is the lack of support for non-UnixFS DAGs on HTTP gateways. Users are able to create custom DAGs based on traversable DAG-CBOR and DAG-JSON documents, thanks to CBOR tag 42 being reserved for CIDs, but they are unable to load deserialized documents from a local gateway, which severely decreases the utility of non-UnixFS DAGs.

Adding JSON and CBOR response types will also benefit UnixFS. DAG-PB has a logical format which makes it possible to represent a DAG-PB directory as a DAG-JSON document. This means that, if we support DAG-JSON in the gateway, then we would support JSON responses for directory listings, which has been requested by our users in the past.

In addition, this functionality is already present in the current Kubo CLI. By bringing it to the gateways, we give users more power when it comes to storing and fetching CBOR and JSON in IPFS.

3. Detailed design

The solution is to allow the Gateway to serialize data as DAG-JSON, DAG-CBOR, JSON and CBOR when a client requests one of these formats using either the Accept HTTP header or the format URL query parameter. In addition, if the resolved CID uses one of the aforementioned codecs, the gateway should be able to resolve it instead of failing with a "node type unknown" error.
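The selection of a response format described above can be sketched as follows. This is a minimal illustration, not the gateway's implementation: the function name and the simplified Accept parsing (no q-value ranking) are assumptions of this sketch, as is the choice to check the format parameter before the Accept header. The media types are the ones we understand the [path-gateway] specification to use for these formats.

```python
from typing import Optional

# Mapping from ?format= values to response media types.
# The helper below and its precedence rule are illustrative assumptions.
FORMAT_TO_CONTENT_TYPE = {
    "dag-json": "application/vnd.ipld.dag-json",
    "dag-cbor": "application/vnd.ipld.dag-cbor",
    "json": "application/json",
    "cbor": "application/cbor",
}
SUPPORTED_TYPES = set(FORMAT_TO_CONTENT_TYPE.values())


def negotiate_content_type(format_param: Optional[str],
                           accept: Optional[str]) -> Optional[str]:
    """Return the media type to respond with, or None to use default behavior."""
    # Assumption of this sketch: an explicit ?format= value is checked first.
    if format_param is not None:
        return FORMAT_TO_CONTENT_TYPE.get(format_param.lower())
    # Otherwise pick the first listed Accept entry that names a supported type.
    # Parameters such as ";q=0.9" are ignored here for brevity.
    if accept:
        for item in accept.split(","):
            media_type = item.split(";", 1)[0].strip().lower()
            if media_type in SUPPORTED_TYPES:
                return media_type
    return None
```

For example, a request with ?format=dag-json and no Accept header would be answered as application/vnd.ipld.dag-json, while a request that only sends Accept: text/html would fall through to the gateway's existing behavior.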

4. Test fixtures

5. Design rationale

The current gateway already supports different response formats via the Accept HTTP header and the format URL query. This IPIP proposes adding JSON and CBOR formats to that list.

In addition, the current gateway already supports traversing through DAG-CBOR and DAG-JSON links if they are intermediary documents. With this IPIP, we aim to be able to download the DAG-CBOR, DAG-JSON, JSON and CBOR documents themselves, with correct Content-Type headers.

5.1 User benefit

The user benefits from this change as they will now be able to retrieve content encoded in the traversable DAG-JSON and DAG-CBOR formats. This is something that has been requested before.

In addition, both UX and DX are significantly improved, since every UnixFS directory can now be inspected in a regular web browser via ?format=json. This can remove the need to parse the HTML directory listing.
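To illustrate why a JSON directory listing is easier to consume than HTML, the sketch below reads entry names from such a document. The sample is hand-written, not real gateway output: it assumes the DAG-PB logical format (a Links list whose entries carry Name, Hash and Tsize) encoded as DAG-JSON, where a link is written as {"/": "<cid>"}. The CIDs are placeholders and the function name is hypothetical.

```python
import json
from typing import List, Tuple

# Hand-written sample for illustration only (placeholder CIDs, assumed layout).
SAMPLE_LISTING = """
{
  "Data": {"/": {"bytes": "CAE"}},
  "Links": [
    {"Hash": {"/": "bafy-placeholder-1"}, "Name": "index.html", "Tsize": 1234},
    {"Hash": {"/": "bafy-placeholder-2"}, "Name": "logo.png", "Tsize": 5678}
  ]
}
"""


def list_directory(document: str) -> List[Tuple[str, str, int]]:
    """Return (name, cid, size) for each entry of a JSON directory listing."""
    node = json.loads(document)
    return [
        (link["Name"], link["Hash"]["/"], link["Tsize"])
        for link in node.get("Links", [])
    ]


for name, cid, size in list_directory(SAMPLE_LISTING):
    print(f"{name}\t{size}\t{cid}")
```

No HTML parsing is involved: a standard JSON decoder is enough to enumerate the directory entries.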

5.2 Compatibility

This IPIP adds new response types and does not modify existing ones, making it a backwards-compatible change.

5.3 Security

Serializers and deserializers for JSON and CBOR must follow the security considerations of the original specifications, found in the Security Considerations sections of RFC 8259 (JSON) and RFC 8949 (CBOR).

DAG-JSON and DAG-CBOR follow the same security considerations as JSON and CBOR. Note that DAG-JSON and DAG-CBOR are stricter subsets of JSON and CBOR, respectively. Implementations must therefore follow the DAG-JSON and DAG-CBOR specifications and return an error if the payload is not strict enough.
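As an illustration of "erroring when the payload is not strict enough", the sketch below configures Python's standard json decoder to reject two things that plain json.loads accepts: duplicate map keys and the non-finite constants NaN, Infinity and -Infinity. These two checks are examples chosen for this sketch and do not cover the full set of DAG-JSON rules; the names StrictnessError and strict_loads are hypothetical, and the DAG-JSON specification remains the authority.

```python
import json


class StrictnessError(ValueError):
    """Raised when a payload violates one of the checks in this sketch."""


def _reject_duplicate_keys(pairs):
    # json.loads passes every object as a list of (key, value) pairs.
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise StrictnessError(f"duplicate map key: {key!r}")
        obj[key] = value
    return obj


def _reject_constant(name):
    # json.loads calls this hook for NaN, Infinity and -Infinity.
    raise StrictnessError(f"non-finite number not allowed: {name}")


def strict_loads(payload: str):
    """Decode JSON, failing on duplicate keys and non-finite numbers."""
    return json.loads(
        payload,
        object_pairs_hook=_reject_duplicate_keys,
        parse_constant=_reject_constant,
    )
```

A gateway built along these lines would return an error response for such payloads rather than silently accepting the lenient interpretation.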

5.4 Alternatives

5.4.1 Why four content types?

If we do not introduce DAG-JSON, DAG-CBOR, JSON and CBOR response formats in the gateway, the usage of IPFS is restricted to files and directories represented by the UnixFS (DAG-PB) codec. Therefore, if a user wants to store JSON and/or CBOR in IPFS, they have to wrap it as a UnixFS file in order to be able to fetch it through the gateway. That adds size and processing overhead.

Alternatively, we could introduce only DAG-JSON and DAG-CBOR. However, not supporting the generic variants, JSON and CBOR, would lead to poor UX. The ability to retrieve DAG-JSON as application/json is an important step for the interoperability of the HTTP Gateway with web browsers and other tools that expect specific content types. Namely, Content-Type: application/json with Content-Disposition: inline allows a JSON preview to be rendered in web browsers and webdev tools.

5.4.2 Why is JSON/CBOR pathing limited to full blocks?

Finally, we considered supporting pathing within both DAG and non-DAG variants of the JSON and CBOR codecs. Pathing within these documents could lead to responses with extracts from the document. For example, if we have the document:

{
  "link": {
    "to": {
      "some": {
        "cid2": <cbor tag 42 pointing at different CID>
      }
    }
  }
}

with CID bafy, and we navigate to /ipfs/bafy/link/to, we would be able to retrieve an extract from the document:

{
  "some": {
    "cid2": <cbor tag 42 pointing at different CID>
  }
}

However, supporting this raises questions whose answers are not yet clearly defined or agreed upon. Right now, pathing is only supported over CID-based links, such as Tag 42 in CBOR. In addition, some caching-related HTTP headers are derived from the CID, and it is unclear how they should behave for extracted fragments. Giving users the ability to retrieve JSON, CBOR, DAG-JSON and DAG-CBOR documents through the gateway is progress in itself and will open the door to new tools and explorations.
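To make the distinction concrete, the sketch below walks path segments inside a single decoded document. It is only an illustration of the scenario discussed in this section: the Link class is a hypothetical stand-in for a CBOR tag 42 link, and walk is not part of any gateway API. A path that ends on a plain value yields an extract of the document, which is the case this IPIP leaves unspecified. A path that reaches a Link stops there, because continuing would require fetching another block, which corresponds to the CID-based pathing supported today.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for a CID link (CBOR tag 42)."""
    cid: str


def walk(document: Any, segments: List[str]) -> Tuple[Any, List[str]]:
    """Walk path segments within one decoded document.

    Returns (value, remaining_segments). The walk stops early at a Link,
    leaving the unconsumed segments for resolution in the linked block.
    """
    current = document
    for index, segment in enumerate(segments):
        if isinstance(current, Link):
            return current, segments[index:]
        if not isinstance(current, dict) or segment not in current:
            raise KeyError(segment)
        current = current[segment]
    return current, []


# The example document from this section, with a placeholder CID.
doc = {"link": {"to": {"some": {"cid2": Link("bafy-other")}}}}

# /ipfs/bafy/link/to would yield an extract of the document.
extract, rest = walk(doc, ["link", "to"])

# /ipfs/bafy/link/to/some/cid2/x crosses a link; "x" is left for the next block.
target, remaining = walk(doc, ["link", "to", "some", "cid2", "x"])
```

In the first call the result is a fragment whose identity is not a CID of its own, which is where the open questions about caching headers arise.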

A. References

[path-gateway]
Path Gateway Specification. Marcin Rataj; Adrian Lanzafame; Vasco Santos; Oli Evans; Henrique Dias. 2024-04-17. URL: https://specs.ipfs.tech/http-gateways/path-gateway/

B. Acknowledgments

We gratefully acknowledge the following individuals for their valuable contributions, ranging from minor suggestions to major insights, which have shaped and improved this specification.

Editors
Henrique Dias
Marcin Rataj
Gus Eggert