JSON vs XML

David Priest, May 7, 2019

Pros and cons of JSON versus XML

When to use JSON?

One admittedly true criticism of XML is that it can contain a considerable amount of markup, making it less than friendly to the reader’s eye and adding unnecessary bulk to data transfers. JSON was developed in reaction to this clutter, with the goal of simplifying the parsing engine and reducing markup overhead. JSON has therefore held great appeal for programmers working in restrictive web and device-programming environments.

There are three commonly discussed benefits of JSON over XML:

  1. In most scenarios, JSON is undoubtedly easier to read in its expanded form than XML.
  2. JSON can have a substantially lower character count, reducing the overhead in data transfers (compare the snippets below).
  3. JSON is much easier to parse. But this is only relevant if one is writing a parser, which is not a common activity at this point.
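
To make the first two points concrete, here is the same hypothetical record in both forms (the field names are invented for illustration). Every XML value is bracketed by an opening and a closing tag, which is where most of the extra characters come from:

```xml
<customer>
  <id>42</id>
  <name>Ada Lovelace</name>
  <active>true</active>
</customer>
```

```json
{ "id": 42, "name": "Ada Lovelace", "active": true }
```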

A word of caution when someone talks about these benefits:

  1. Both XML and JSON, while designed to be human readable, are intended to be used by machines.
  2. While the character cost savings with JSON can be significant, it is worth noting that
    1. TCP/IP over Ethernet transfers data in packets (typical payload around 1,500 bytes, with a fixed minimum frame size below which payloads are null-padded), so for small messages the fixed per-packet overhead dominates, largely erasing the character cost savings; and
    2. data is often compressed, and the highly redundant XML tag content compresses so well that the byte difference between the two data structure forms is greatly reduced.
  3. JSON may be much easier to parse, but again — this is only relevant if one is writing a parser.

Disadvantages of JSON compared to XML:

1. JSON was developed by a web developer who was frustrated by web browser limitations.

Its development to date lacks the mature standards that have emerged around XML technologies. As a result, there are now multiple, incompatible JSON versions of schemas, path expressions, query languages, and, recently, transformation tools. All of these extensions to JSON complicate an otherwise uncomplicated data structure format, rendering it more difficult to read, more difficult to parse, and essentially obliterating the advantages JSON was purported to have. XML, by contrast, has product maturity: the XML specification has proven itself over the past two decades to be a robust and reliable standard, with robust and reliable toolsets and a rich history of enterprise-class use.

2. JSON isn’t as robust a data structure as XML is.

There is no ability to add comments or attributes in JSON, which limits your ability to annotate your data structures or attach useful metadata. The lack of standardized schemas limits your ability to programmatically verify that your data is correct. It is worth noting here that one of the leading causes of application defects is bad data.
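
As a sketch (the element and attribute names are invented), XML can carry an annotation and metadata inline:

```xml
<!-- Source: nightly export from the billing system -->
<price currency="CAD" effectiveDate="2019-05-01">19.99</price>
```

Standard JSON has no comment syntax and no attributes, so the same metadata must be promoted into ordinary fields, indistinguishable from the data itself:

```json
{ "price": { "value": 19.99, "currency": "CAD", "effectiveDate": "2019-05-01" } }
```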

3. JSON is not well-suited to combining information sets from different systems.

While the namespace declarations available in XML allow the co-mingling of information from different schemas, avoiding naming collisions is problematic in JSON (for example, when CustomerID exists in two different systems but means different things in each). The author of a JSON data structure must instead define a new data structure that can represent the information from each system, inventing new unique names for each type of underlying data to avoid conflicts. This leaves XML a much more suitable medium for enterprise data, where multiple back- and front-end systems are the norm.
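
A minimal sketch of how XML namespaces keep the two meanings of CustomerID distinct (the namespace URIs are invented for illustration):

```xml
<order xmlns:crm="http://example.com/schemas/crm"
       xmlns:billing="http://example.com/schemas/billing">
  <!-- Same local name, no collision: each is qualified by its own namespace -->
  <crm:CustomerID>C-1001</crm:CustomerID>
  <billing:CustomerID>7734</billing:CustomerID>
</order>
```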

4. JSON does not directly support the extension of base types like what is possible in XML.

XML can use various schema constructs to build new information sets out of existing ones: for example, adding restrictions to a core type, re-using a core type as part of a new type, and including whole schemas from external systems in new schemas that build on or enhance the underlying information.
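
For instance, here is a sketch of both techniques in XML Schema (the type names are invented, and the xs prefix is assumed bound to http://www.w3.org/2001/XMLSchema): a restriction that narrows a core type, and an extension that re-uses one:

```xml
<!-- Restriction: narrow xs:string down to identifiers like C-1001 -->
<xs:simpleType name="CustomerID">
  <xs:restriction base="xs:string">
    <xs:pattern value="C-[0-9]{4}"/>
  </xs:restriction>
</xs:simpleType>

<!-- Extension: build a new type on top of an existing (hypothetical) Customer type -->
<xs:complexType name="AuditedCustomer">
  <xs:complexContent>
    <xs:extension base="Customer">
      <xs:sequence>
        <xs:element name="lastModified" type="xs:dateTime"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
```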

5. JSON’s transformation tooling is not reviewed by a universal standards committee.

JSON data transformation tools are in the realm of individual open-source projects, not a universal standards committee like the W3C, with its robust review and suitability qualification process. In most cases, JSON transformation capabilities are in their infancy, supporting only the most basic of data conversions. XML, by contrast, can be transformed with XSLT, which has a decades-long history of success with every variation of data transformation imaginable on projects around the planet.

6. There is a fundamental mismatch between JSON and HTML, the language of virtually all web sites.

HTML is closely related to XML: XHTML, including the XML serialization of HTML5, is by definition “well-formed” XML. XSLT is the ideal language for describing the conversion of semantic data to presentation data; in this case, all artifacts are XML: the data, the transform, and the resulting HTML. In contrast, JSON is typically consumed by JavaScript and converted into HTML elements using DOM APIs, a process so non-intuitive that it has given rise to the mass adoption of helper libraries like jQuery and React. Further, JavaScript has no compile-time phase, so bugs are often not found until run-time (meaning after your system has shipped or gone live, and customers are the first to experience many types of bugs), and JavaScript can neither be proven correct the way XSLT can, nor transformed into alternate transformations, nor easily and systematically generated by machines in a data-driven fashion like XSLT can.
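
As a minimal sketch (the source element names are invented), an XSLT stylesheet that turns semantic customer data into presentation HTML; the data, the transform, and the output are all XML:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/customers">
    <html>
      <body>
        <h1>Customers</h1>
        <ul>
          <!-- One list item per customer element in the source document -->
          <xsl:for-each select="customer">
            <li><xsl:value-of select="name"/></li>
          </xsl:for-each>
        </ul>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```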

I believe it is quite likely that JSON evolved because the browser companies did not include a standardized XSLT engine in their offerings. While many browsers originally supported XSLT version 1.0, that support was later dropped by many, and XSLT is now at version 3.0, a much more usable language than the 1.0 standard developed before the turn of the century. It is worth mentioning, however, that Saxonica offers a JavaScript-based, XSLT 3.0-compliant implementation that can be deployed to all browsers as part of your web implementation, re-enabling the dream of client-side transformation of semantic data into presentation HTML using only XSLT.

It is also worth noting that, with XSLT 3.0, direct transformation of JSON data is now possible. Automatic conversion between XML and JSON, and indeed between RESTful services and SOAP and back again, is becoming a standard feature of integration platforms like the Avato Hybrid Integration Platform and Google Apigee.
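
For example, here is a sketch of XSLT 3.0’s standard json-to-xml() function (the payload and output element names are invented): it parses a JSON string into an XML tree that ordinary templates and XPath expressions can then work on.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical JSON payload held as a string -->
  <xsl:variable name="payload">{"id": 42, "name": "Ada Lovelace"}</xsl:variable>

  <!-- Entry point when the transform is run with no source document -->
  <xsl:template name="xsl:initial-template">
    <!-- json-to-xml() returns an XML tree; @key preserves the JSON member names -->
    <xsl:variable name="tree" select="json-to-xml($payload)"/>
    <customer>
      <name><xsl:value-of select="$tree/*/*[@key = 'name']"/></name>
    </customer>
  </xsl:template>
</xsl:stylesheet>
```

The inverse function, xml-to-json(), completes the round trip from that XML representation back to a JSON string.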

CONCLUSION

To be sure, JSON is not a short-lived fad: it is a great choice when transferring small amounts of data that is short-lived and not complex, and where verified correctness is not a concern. Its lack of type definitions is convenient when working with JavaScript, which shares JSON’s propensity for viewing everything as a string, a number, or a boolean. While eliminating data types may be convenient for new developers and quick-and-dirty projects, seasoned enterprise engineers appreciate the design-time type safety built into the XML family of markup, the ability to share information sets from different applications and authors, the ability to reject invalid data before it causes a defect further down the road, when the cause will be elusive and expensive to identify and correct, and the ability to use a single programming language for information transportation, transformation and presentation.
