JSON is the Fluid of the Internet

Standards always seem to move at glacial speed, so slowly that it is hard to see any change. Looking back over the past decade or so, though, we can see that JSON has won the Internet format battle, and XML is a thing of the past.

Standards and Protocols

It took 25 years, but Unicode is now the single dominant standard for character encoding. Back in the 1990s it was pretty clear this was going to be the case, but there was incredible momentum behind the other entrenched character sets. This was particularly apparent at the Japanese company where I was working, where Shift-JIS was the dominant encoding among several, and nobody dared hint that it was going to go away. “Customers should have what they want” was the excuse. “We can translate between character sets as needed.” But it was clear that you could not store Shift-JIS text in a database together with Hebrew or Arabic or Hindi or any language other than Japanese. Unicode solved this problem, but it took 25 years to finally make Shift-JIS disappear.

Other standards died similarly slowly: X.400 was an email standard that lost out to the easier-to-implement SMTP. I remember arguing for implementing a distributed product exclusively on TCP/IP while coworkers argued that we should also support LU6.2. Once it becomes clear to me that a particular standard is inferior to another that is gaining traction, the outdated standard simply doesn’t die fast enough for me.

JSON

JSON is JavaScript Object Notation, and if you don’t already know what it is, it probably does not matter. Anyone designing Internet protocols today is already familiar with it, so I won’t repeat the details here except to say that it has a couple of very clear technical advantages over XML. XML was a clear improvement over the earlier SGML, but both were defined as “markup languages,” which means essentially that tags are used within text to indicate different styles of text: for example, marking the text as bold or italic. The tags can be nested, allowing text to be both bold and italic.

The early web services were built by using XML to transfer data. I was a coauthor of the first paper about web services; it was a proposal to the IETF called SWAP. [1] I was also a coauthor of the first magazine article about web services, which was published in IEEE Internet Computing in 2000. [2] We used XML because it had just been invented, it was machine independent, language independent, and available. But it had some flaws when representing empty arrays and white space characters. XML by definition treats carriage return characters as exactly equivalent to space and tab characters. White space used in data is mixed freely with white space added simply for formatting the representation. That is fine for marking text with styles, but for simply carrying data from system to system it brought ambiguity and unreasonable overhead. Some discussion of these issues is available in [3].

JSON solves these problems and provides a machine independent, language independent format that reliably transmits data without any loss from system to system.
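
As a rough illustration of that claim, here is a minimal Python sketch of a round trip through JSON. The record and its field names are made up for the example; the only library used is Python's standard json module. The point it tries to show is that white space inside string values and an empty array both survive, even when the serialized text is pretty-printed with extra formatting white space.

```python
import json

# Hypothetical record, invented for this example: it contains leading and
# trailing spaces, a carriage return, a tab, an empty array, and a null.
record = {
    "name": "  padded  ",
    "note": "line one\r\nline two\tafter a tab",
    "tags": [],
    "owner": None,
}

# Pretty-print with indentation. The formatting white space sits outside
# the quoted strings, so it cannot be confused with white space in the data.
wire = json.dumps(record, indent=2)

# Parse the text back into native Python structures.
restored = json.loads(wire)

# The round trip is lossless for this record.
assert restored == record
```

In the serialized text the carriage return and tab travel as the escape sequences \r and \t inside the quoted string, which is why the indentation added for readability never leaks into the values.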

Dominance

It is worth reflecting on just how ubiquitous JSON has become:

  • Built into JavaScript. It is extremely natural to use JS to parse, manipulate, and construct JSON because of course the format was originally derived from JS.
  • JavaScript is the single dominant language for programming in a browser. All browsers support it. The document object model that defines a browser translates naturally to JSON.
  • Angular, React, and other popular UI frameworks work on JavaScript natively. The new Angular is programmed in TypeScript, which is translated to JavaScript for running.
  • Browsers also support XML. However, there is some ambiguity in how to handle white space, and arrays and maps don’t translate directly into a usable form, so the code is much harder to write.
  • Node.js is becoming an increasingly central part of modern technology stacks, and it natively consumes and produces JSON.
  • MongoDB, NosDB, Couchbase, and Elasticsearch store JSON documents directly and allow querying of the data within the documents.
  • GraphQL is a hot up-and-coming query language for APIs, an alternative to REST, that typically returns its results as JSON.
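
To make the point about arrays and maps a little more concrete, here is a small Python sketch that reads the same made-up order from JSON and from XML. The document shapes and the element names (order, items, item) are assumptions invented for the example; the libraries are Python's standard json and xml.etree.ElementTree modules.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical order, once as JSON and once as XML.
json_text = '{"id": 17, "items": ["apple"], "extras": []}'
xml_text = "<order id='17'><items><item>apple</item></items><extras/></order>"

# JSON: parsing yields native maps, lists, and numbers directly.
order = json.loads(json_text)
items_from_json = order["items"]    # a list, even with a single element
extras_from_json = order["extras"]  # an empty list, unambiguously

# XML: parsing yields a generic element tree. The code has to know from
# outside the document that "items" holds a repeated "item" element.
root = ET.fromstring(xml_text)
items_from_xml = [item.text for item in root.findall("./items/item")]

# Nothing in the document says whether <extras/> is an empty list,
# an empty string, or a missing value; the text is simply absent.
extras_element = root.find("extras")
extras_text = extras_element.text

# Attribute values always arrive as strings and must be converted by hand.
order_id_from_xml = int(root.get("id"))
```

Both versions end up with the same data, but the XML side only gets there because the reading code carries knowledge of the schema that the JSON text expresses on its own.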

Tipping Point

While XML still has to be supported for legacy applications, I would suggest that all new implementations just work with JSON. It is possible to support both, but that is a pointless attempt to forestall the inevitable. Every programming language that can handle XML can handle JSON. (Except maybe XSLT, but that is a special case.) Of course, XML will still be around 20 years from now, just as COBOL is still in use today, but if you want to stay up to date with current technology stacks, JSON is the format that will make mixing easy.

We really are at the tipping point:  Data flows through modern processing systems in JSON, and JSON is the fluid of the Internet.

References

[1] Keith D. Swenson, Simple Workflow Access Protocol, original submission to IETF, August 1998

[2] James G. Hayes, Effat Peyrovian, Sunil Sarin, Marc-Thomas Schmidt, Keith D. Swenson, Rainer Weber, Workflow Interoperability Standards for the Internet, IEEE Internet Computing, May/June 2000 (Vol. 4, No. 3), pp. 37-45

[3] Michael zur Muehlen, Jeffrey V. Nickerson, Keith D. Swenson, Developing web services choreography standards: the case of REST vs. SOAP,
