Data Formats

To follow up on the Wire Protocol post, the next logical step in our SOA exploration is Data Formats.  A web service message’s Data Format describes how the data is organized in the message.

The most common Data Formats are XML, SOAP, and JSON.

In fact, these three probably cover 80-90% of all the service messages that I’ve ever seen.  Note that SOAP is a particular XML format, but it is still XML.  You can also just pass an XML message of your own format via an HTTP POST request; perhaps we can consider this the original Web Service.  JSON is generally used with RESTful calls.
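To make the relationship between these formats concrete, here is a minimal sketch that renders one record as plain XML, as the same XML inside a SOAP 1.1 envelope, and as JSON. The `order` record and its field names are made up for illustration; only the Python standard library is used.

```python
import json
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def to_xml(order: dict) -> ET.Element:
    """Build a plain XML element from a flat dict (hypothetical 'order' record)."""
    root = ET.Element("order")
    for key, value in order.items():
        ET.SubElement(root, key).text = str(value)
    return root


def to_soap(payload: ET.Element) -> ET.Element:
    """Wrap an XML payload in a SOAP Envelope/Body. The payload itself is unchanged."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return envelope


# Illustrative data only.
order = {"id": 42, "symbol": "IBM", "quantity": 100}

plain_xml = ET.tostring(to_xml(order), encoding="unicode")
soap_xml = ET.tostring(to_soap(to_xml(order)), encoding="unicode")
as_json = json.dumps(order)
```

Printing `plain_xml`, `soap_xml`, and `as_json` side by side shows the point made above: the SOAP message is the plain XML payload with an envelope around it, while the JSON form carries the same fields without any markup.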

After these three, there are any number of proprietary or custom message formats that are in circulation.  For ThinkMiddleware.com purposes, I’m going to stick to spec-based approaches.

Even in the case of custom or proprietary message formats, the format is usually XML.  When it isn’t, you wander into the realm of binary message formats.  These are usually older technologies, but not always.  There are numerous industry specs that define binary message formats, especially in the financial industry.  For binary message formats, one will see tools such as IBM WTX (WebSphere Transformation Extender) or Host Bridge used.
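For readers who have not worked with a binary message format, the sketch below packs and unpacks a small fixed-layout message with Python's `struct` module. The layout (message type, sequence number, price in cents, 8-byte symbol) is entirely hypothetical and does not correspond to any real industry spec; it only illustrates that the meaning of each byte comes from an agreed layout rather than from markup.

```python
import struct

# Hypothetical layout, big-endian with no padding:
#   H  = unsigned 16-bit message type
#   I  = unsigned 32-bit sequence number
#   q  = signed 64-bit price in cents
#   8s = 8-byte ASCII symbol, space padded
QUOTE = struct.Struct(">HIq8s")


def encode_quote(msg_type: int, seq: int, price_cents: int, symbol: str) -> bytes:
    """Pack the fields into the fixed 22-byte layout."""
    return QUOTE.pack(msg_type, seq, price_cents, symbol.encode("ascii").ljust(8))


def decode_quote(data: bytes) -> tuple:
    """Unpack the 22-byte layout back into Python values."""
    msg_type, seq, price_cents, symbol = QUOTE.unpack(data)
    return msg_type, seq, price_cents, symbol.decode("ascii").rstrip()


message = encode_quote(1, 7, 12345, "IBM")
fields = decode_quote(message)
```

Unlike XML or JSON, nothing in `message` is self-describing: a receiver that assumes a different field order or byte order will read garbage, which is why these formats depend so heavily on a published spec.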

Another very popular data format in shops with a lot of history is the COBOL Copybook.  A common requirement is efficiently converting between XML and COBOL Copybooks in order to exercise legacy COBOL programs running in CICS.  Once you can convert the data into XML, it is relatively simple to put a SOAP wrapper around it.  WebSphere DataPower XI50s (and XI52s more recently) are capable of taking a WTX map and converting XML to COBOL Copybook data structures and vice versa.  During my career, I have supported environments that have done this very thing using BEA Tuxedo + ELink (Oracle products these days, I suppose), custom software, DataPower + WTX, and Host Bridge.  Again, it’s a common usage pattern.
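As a rough illustration of what such a conversion involves, here is a minimal sketch that turns one fixed-width record into XML. It is not how WTX, DataPower, or any of the products above work internally. The copybook, the `LAYOUT` table, and the `record_to_xml` helper are all hypothetical, and the sketch assumes EBCDIC code page 037 with unsigned zoned-decimal (DISPLAY) numeric fields only, so signed and packed (COMP-3) fields are not handled.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical copybook this layout stands in for:
#   01 CUSTOMER-REC.
#      05 CUST-ID    PIC 9(6).
#      05 CUST-NAME  PIC X(20).
#      05 BALANCE    PIC 9(7)V99.
# Each entry: (field name, length in bytes, kind, implied decimal places)
LAYOUT = [
    ("CUST-ID", 6, "num", 0),
    ("CUST-NAME", 20, "text", 0),
    ("BALANCE", 9, "num", 2),
]


def record_to_xml(record: bytes, encoding: str = "cp037") -> ET.Element:
    """Slice a fixed-width record by the layout and emit one XML element per field."""
    expected = sum(length for _, length, _, _ in LAYOUT)
    if len(record) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(record)}")

    root = ET.Element("CUSTOMER-REC")
    offset = 0
    for name, length, kind, scale in LAYOUT:
        raw = record[offset:offset + length].decode(encoding)
        offset += length
        if kind == "num":
            # Apply the implied decimal point (the V in the PIC clause).
            value = str(Decimal(int(raw)).scaleb(-scale))
        else:
            value = raw.rstrip()
        ET.SubElement(root, name).text = value
    return root


# Build a sample 35-byte EBCDIC record for illustration.
sample = ("000042" + "Ada Lovelace".ljust(20) + "000012345").encode("cp037")
customer = record_to_xml(sample)
customer_xml = ET.tostring(customer, encoding="unicode")
```

Once the record is an XML element like `customer`, wrapping it in a SOAP body is a small additional step, which is the point made above. The hard part in practice is the layout handling, since real copybooks add things like REDEFINES, OCCURS, and packed fields.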

Stick to technologies that have an industry spec backing them up whenever possible.

As identified in the ESB post and XML Gateway post, a major function of these patterns is Data Transformations.  A book could be written on this subject, but I will leave you with this piece of advice.  Anything that involves manipulating bits is going to be extremely CPU (and potentially memory) intensive.  This is true whether the work is done in software (JVM) or hardware (SOA Appliance), so plan accordingly.