November 22, 2019

XML versus JSON
key quote from a great paper

Reproducing here a key point from a good paper, Kleppmann and Beresford’s A Conflict-Free Replicated JSON Datatype, that I find myself mentioning to others from time to time:

… Besides the superficial syntactical differences, the tree structure of XML and JSON appears quite similar. However, there is an important difference that we should highlight.

JSON has two collection constructs that can be arbitrarily nested: maps for unordered key-value pairs, and lists for ordered sequences. In XML, the children of an element form an ordered sequence, while the attributes of an element are unordered key-value pairs. However, XML does not allow nested elements inside attributes—the value of an attribute can only be a primitive datatype. Thus, XML supports maps within lists, but not lists within maps. In this regard, XML is less expressive than JSON….

Some applications may attach map-like semantics to the children of an XML document, for example by interpreting the child element name as key. However, this key-value structure is not part of XML itself, and would not be enforced by existing collaborative editing algorithms for XML. If multiple children with the same key are concurrently created, existing algorithms would create duplicate children with the same key rather than merging them….
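To make the quoted distinction concrete, here is a minimal sketch using only Python's standard `json` and `xml.etree.ElementTree` modules. The document shapes and element names (`note`, `tag`, `title`) are made up for illustration, and the "replicas" are just two appends in a row, not a real collaborative editing algorithm.

```python
import json
import xml.etree.ElementTree as ET

# JSON: a list nested inside a map is directly expressible.
doc = json.loads('{"title": "notes", "tags": ["a", "b"]}')

# XML: attributes form a map, but every attribute value is a plain string.
# A list has to be flattened into one ("a b") or moved into child elements.
elem = ET.fromstring(
    '<note title="notes" tags="a b"><tag>a</tag><tag>b</tag></note>'
)

# Children are an ordered sequence, so nothing in XML itself prevents
# two children with the same name. Two replicas that each add a <title>
# leave two <title> elements behind.
root = ET.fromstring("<doc/>")
ET.SubElement(root, "title").text = "from replica A"
ET.SubElement(root, "title").text = "from replica B"
titles = [child.text for child in root.findall("title")]

# A JSON-style map has exactly one slot per key, so the same two writes
# must be reconciled somehow. Here the later write simply wins.
merged = {}
merged.update({"title": "from replica A"})
merged.update({"title": "from replica B"})

print(type(doc["tags"]).__name__, repr(elem.attrib["tags"]))
print(len(titles), len(merged))
```

Treating child element names as keys is a convention the application has to enforce itself; the XML data model is equally happy with both `<title>` children.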

This distinction is important to keep in mind when assessing, and especially when translating between, serialization formats such as TOML, which has taken Go and Rust by storm. It’s just as important when designing new formats, as I did for LAMOS.
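As a rough illustration of why translation needs care, the sketch below maps JSON-like data onto XML elements. The function name `to_element` and the `item` wrapper element are hypothetical conventions chosen for this example, not part of any standard, and the sketch assumes every map key happens to be a valid XML element name.

```python
import xml.etree.ElementTree as ET


def to_element(tag, value):
    """Convert JSON-like data to an XML element under a made-up convention.

    Maps become child elements named after their keys; lists become
    repeated <item> children; everything else becomes text.
    """
    elem = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            elem.append(to_element(key, child))
    elif isinstance(value, list):
        for child in value:
            elem.append(to_element("item", child))
    else:
        # Primitive types collapse to text: 80 and "80" look identical.
        elem.text = str(value)
    return elem


config = {"name": "demo", "ports": [80, 443]}
xml_text = ET.tostring(to_element("config", config), encoding="unicode")
print(xml_text)
```

The map-versus-list distinction survives only in the convention itself: a reader who does not know that `item` children mean "list" and other children mean "map entry" cannot recover the original structure from the XML alone.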

It also affects algorithms that operate on the encoded data. I ran into this while implementing crgmw-diff, which I’m still looking for time to turn into a full-featured JSON document-diff implementation.

Your thoughts and feedback are always welcome by e-mail.