XSBC: XML Schema-based Binary Compression


What is XSBC?

XSBC is a library designed to compress XML documents and messages. It supports both large documents, such as X3D and SVG files, and short messages, such as SOAP and XML-RPC. A major feature of this library is the ability to register compressors for an attribute type, an element, or a document fragment. This allows data-aware compression algorithms to achieve much better compression than typical generic routines. A Forward Error Correction (FEC) library has been added to improve the transmission quality of serialized files over noisy links.

This library has not reached 1.0 status. Until that time we will not maintain backwards compatibility between releases. We expect to make some more modifications to the core format before its final release. An approximate timeframe for the 1.0 release is Summer 2005.

License

XSBC is licensed under the GNU LGPL v2.1. Please read http://www.gnu.org/copyleft/lgpl.html for more information.

FEC 1.0.3 is licensed under a BSD-style license. Please read http://www.opensource.org/licenses/bsd-license.php for more information.

How to Compile

The most up-to-date source code for XSBC is located on the Extensible Modeling and Simulation Framework (XMSF) project view page. To download anonymously from the SourceForge.net CVS Repository, use the instructions located here. The binary containing source code for the Java FEC Library v1.0.3 is here; however, it is not required for building XSBC, as an FEC binary is already present in the library directories.

To compile the codebase you will need a Java 2 Standard Edition (J2SE) Development Kit (J2SDK), version 1.4 or later, and the latest Ant. Once Java and Ant are installed and added to your path, you can build XSBC. Open a console, go to the directory where you installed the source and type:

   ant

This will compile the codebase, create a jar, write the runapps*.bat/runapps*.sh files and generate the documentation.

The * is a place marker for either 1.4 or 1.5 indicating the JDK you prefer.

How to Use

There are two primary ways to use XSBC. The first is as a standalone tool for compressing and decompressing XML files. This is where you should start if you are evaluating XSBC for your application. We've provided a Comparison Tool to show the compression rates and parsing speeds achievable using XSBC.

One common mistake when using XSBC is processing XML documents that do not have a schema reference. The root element must contain a readable schema reference. Multi-namespace documents are not yet supported. Document Type Definitions (DTDs) without a corresponding XML schema are not supported, because DTDs do not contain sufficient datatype and structure information for XSBC to perform effective compression.

<X3D profile="Immersive" xmlns:xsd="http://www.w3.org/2001/XMLSchema-instance"
        xsd:noNamespaceSchemaLocation="http://www.web3d.org/specifications/x3d-3.0.xsd">

   <!-- document -->
</X3D>
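
Before handing a document to XSBC, it can help to confirm that the root element really carries a schema reference. The following is a minimal sketch of such a check, assuming a current JDK and using only the standard JAXP DOM API. The class and method names (SchemaReferenceCheck, findSchemaReference) are hypothetical and are not part of XSBC; the check only looks for the noNamespaceSchemaLocation attribute shown above and does not verify that the schema itself is reachable.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;

// Hypothetical helper, not part of the XSBC library.
final class SchemaReferenceCheck {
    static final String XSI_NS = "http://www.w3.org/2001/XMLSchema-instance";

    /** Returns the schema location declared on the root element, or null if there is none. */
    static String findSchemaReference(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Namespace awareness is required so the attribute is matched by
            // namespace URI rather than by whatever prefix the author chose.
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            // getAttributeNS returns an empty string when the attribute is absent.
            String location = root.getAttributeNS(XSI_NS, "noNamespaceSchemaLocation");
            return location.isEmpty() ? null : location;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<X3D profile=\"Immersive\" "
                + "xmlns:xsd=\"http://www.w3.org/2001/XMLSchema-instance\" "
                + "xsd:noNamespaceSchemaLocation="
                + "\"http://www.web3d.org/specifications/x3d-3.0.xsd\"/>";
        System.out.println("Schema reference: " + findSchemaReference(xml));
    }
}
```

A null result means the document would fall into the "no schema reference" case described above.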

Comparison Tool

The Comparison Tool is a standalone application for comparing different compression methods, so you can see how XSBC performs alongside other available compression algorithms. It's also an easy way for you to evaluate XSBC before adding it to your application.

Build XSBC as instructed in the How to Compile section.

To run the Comparison Tool, execute one of the following files in the build directory:

      runapps*.bat (for Win) or runapps*.sh (for Unix)

By default they run the ComparisonTool from a MANIFEST-only jar file in the /lib1.4 directory that contains the main class:

   org.web3d.xmsf.xsbc.apps.comparison.ComparisonTool

Open an XML file to be processed using the File / Open menu. Select which stages you'd like performed, the compression options and then hit the Process button. This will compress the XML file and report the resultant file sizes. Pressing the View button on a particular item will launch an external viewer to view the file.

The Compression Methods block is where you can decide how to compress the file.

The Fastest parsing option will encode the file in binary form with no compression. This will result in some decrease in file size (about 30-50%) and a significant increase in parsing speed.

The Compression, Non Lossy method will use lossless techniques to compress the file.

The Compression, Lossy method will use lossy techniques to compress the file. These typically center around quantizing floats. Currently this is hardcoded to use 16 bits instead of the normal 32 bits to store floats. The next version of XSBC will allow you to specify the parameters used for each attribute/element.
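
To give a feel for what storing a float in 16 bits involves, here is a minimal sketch of uniform quantization over a known value range. This is a generic technique chosen for illustration; the source does not describe which 16-bit scheme XSBC actually uses, and the FloatQuantizer class and its fixed [min, max] range are assumptions.

```java
// Illustrative only: generic uniform quantization, not XSBC's actual encoder.
final class FloatQuantizer {
    private final float min;
    private final float max;

    FloatQuantizer(float min, float max) {
        if (!(max > min)) {
            throw new IllegalArgumentException("max must exceed min");
        }
        this.min = min;
        this.max = max;
    }

    /** Maps a value to one of 65536 evenly spaced levels; out-of-range values are clamped. */
    short quantize(float value) {
        float clamped = Math.max(min, Math.min(max, value));
        int level = Math.round((clamped - min) / (max - min) * 65535f);
        return (short) level; // the 16 bits are kept; the sign is ignored when decoding
    }

    /** Recovers an approximation of the original value from the stored 16 bits. */
    float dequantize(short stored) {
        int level = stored & 0xFFFF; // reinterpret the short as unsigned
        return min + (level / 65535f) * (max - min);
    }

    public static void main(String[] args) {
        FloatQuantizer q = new FloatQuantizer(-1f, 1f);
        float original = 0.5f;
        float restored = q.dequantize(q.quantize(original));
        System.out.println(original + " -> " + restored);
    }
}
```

The restored value differs from the original by at most about half a quantization step, which is why this kind of compression is lossy.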

Another similar tool is provided in the Xj3D toolkit (www.xj3d.org) which shows how XSBC can be used to compress X3D files.

Example Programs

The second option is integrating XSBC into your application. There are a few ways you might use XSBC. The simplest is to use XSBC as a wrapper for your XML documents: you compress with XSBC, deliver the compressed file, decompress it to disk or memory, and then process the result normally as an XML document. This method will decrease transmission time, but not parsing time.

Simple Example

This example program shows how XSBC can be used to compress a document, decompress it and write it to disk. To run this example, edit the launcher* target's Main-Class value attribute in the build.xml file to execute:

   org.web3d.xmsf.xsbc.apps.SimpleExample

Then type:

   ant launcher*

from a console and then execute the runapps*.* file for your system.

SenderSimulation

This example shows a simple IP implementation of XSBC with an option for FEC (for enhancing transmission reliability over noisy links). It shows how large (2MB+) *.xsbc or *.xsbc.gz files can be transmitted reliably. To run this example, edit the launcher* target's Main-Class value attribute in the build.xml file to execute:

   org.web3d.xmsf.xsbc.apps.SenderSimulation

Then type:

   ant launcher*

from a console and then execute the runapps*.* file for your system.

Follow these steps to run SenderSimulation for each of the following circumstances:

A. XSBC only (no gzip)
   1) On the SenderSimulation, select "File"
   2) Unselect "GZip File"
   3) Select "File" again, then "Load XML File", then "examples"
      (You should see a DOM tree representation of the XML file you select)
   4) On the Receiver Simulation, select "Receive XSBC", then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation)
   5) On SenderSimulation, select "Send"
      (You should see a DOM tree representation in the ReceiverSimulation)

B. XSBC (w/ gzip)
   1) On the SenderSimulation, select "File", then "Load XML File", then 
      "examples" 
      (You should see a DOM tree representation of the XML file you select)
   2) On the Receiver Simulation, select "Receive XSBC", then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation)
   3) On SenderSimulation, select "Send"
      (You should see a DOM tree representation in the ReceiverSimulation)

C. XSBC (w/ gzip and FEC)
   1) On the SenderSimulation, select "File"
   2) Select "Encode with FEC"
   3) Select "File" again, then "Load XML File", then "examples"
      (You should see a DOM tree representation of the XML file you select)
   4) On the Receiver Simulation, select "Receive Encoded", then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation)
   5) On SenderSimulation, select "Send" 
      (You should see a DOM tree representation in the ReceiverSimulation)

D. XSBC (w/o gzip, but w/ FEC)
   1) On the SenderSimulation, select "File"
   2) Select "Encode with FEC"
   3) Select "File" again, then unselect "GZip File"
   4) Select "File", then "Load XML File", then "examples"
      (You should see a DOM tree representation of the XML file you select)
   5) On the Receiver Simulation, select "Receive Encoded", then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation)
   6) On SenderSimulation, select "Send"
      (You should see a DOM tree representation in the ReceiverSimulation)

E. In any of the above instances, selecting the "Send" button a second time will
   produce an xml file named "resultsXsbc0.xml" in the path /dataweb/results

F. Save a file in .xsbc form and retrieve manually with ReceiverSimulation
   1) On the SenderSimulation, select "File"
   2) Unselect "GZip File"
   3) Select "File" again, then "Load XML File", then "examples"
      (You should see a DOM tree representation of the XML file you select)
   4) Select "Save"
   5) On the Receiver Simulation, select "File", then "Load Binary File" 
      then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation, then select the corresponding *.xsbc file.  You should 
       see a DOM tree representation in the ReceiverSimulation)

G. Save a file in .xsbc.gz form and retrieve manually with ReceiverSimulation
   1) On the SenderSimulation, select "File"
   2) Select "File", then "Load XML File", then "examples"
      (You should see a DOM tree representation of the XML file you select)
   3) Select "Save"
   4) On the Receiver Simulation, select "File", then "Load Binary File",
      then "examples"
      (Assign the corresponding schema for the XML file you selected in the
       SenderSimulation, then select the corresponding *.xsbc.gz file.  You 
       should see a DOM tree representation in the ReceiverSimulation)

SAX Stream

This example shows how you would use XSBC to inject events into a SAX stream. This allows you to use XSBC to compress the XML document, deliver it to the client and then consume the stream without an intermediate step. This process still converts the binary data to strings, so it does not decrease the parsing time (see the Binary Delivery section for the fastest usage). To run this example, edit the launcher* target's Main-Class value attribute in the build.xml file to execute:

   org.web3d.xmsf.xsbc.apps.SAXExample

Then type:

   ant launcher*

from a console and then execute the runapps*.* file for your system.
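
The consuming side of a SAX stream is an ordinary SAX handler, and in principle it does not need to know where the events come from. The sketch below shows such a handler using only the standard org.xml.sax API. As a stand-in event source it uses the JDK's own text parser; the XSBC calls that would generate the events from compressed data are not shown here, and the ElementCollector class is a hypothetical name used for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: records each element event it receives as a string.
final class ElementCollector extends DefaultHandler {
    final List<String> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        StringBuilder sb = new StringBuilder("start " + qName);
        // Attribute values arrive as strings, which is why this path
        // does not reduce parsing time.
        for (int i = 0; i < atts.getLength(); i++) {
            sb.append(' ').append(atts.getQName(i)).append('=').append(atts.getValue(i));
        }
        events.add(sb.toString());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end " + qName);
    }

    /** Stand-in event source: the JDK text parser rather than an XSBC decoder. */
    static List<String> collect(String xml) {
        try {
            ElementCollector handler = new ElementCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.events;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        for (String event : collect("<Scene><Box size=\"1 2 3\"/></Scene>")) {
            System.out.println(event);
        }
    }
}
```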

FEC

A host of example files showing how to implement FEC is contained in the FecTestFiles source package. To run any of the example files, edit and build the runscripts* target for the launcher*.jar Main-Class you want to run, as in the XSBC examples above.

AUV Workbench

Another example program under development that integrates XSBC and FEC is the NPS Autonomous Unmanned Vehicle (AUV) Workbench (AUVW).

Binary Delivery

To improve both transmission and parsing time, you will need to handle XSBC parsing events in a similar fashion to SAX calls. The difference is that the SAX-like calls will include binary data instead of just strings. This means the original data does not go through a string-to-data conversion before use in your application.

We still need to develop a simple example to show this usage. Currently the best example is the ComparisonTool written for the X3D binary process. You can find this code in the Xj3D codebase under the contribs/xsbc area. Specifically, look at the X3DElementReader class in contribs/xsbc/src/org/web3d/xmsf/xsbc/x3d/X3DElementReader.java. It implements the ElementReader interface from XSBC. This interface is similar to the SAX ContentHandler interface except that it delivers binary data instead of just strings.
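
Until such an example exists, the difference between string delivery and binary delivery can be sketched in plain Java. The TypedAttributeHandler interface below is hypothetical and is not the real ElementReader interface, whose method signatures are not given here. The sketch only contrasts the two paths: parsing a float array out of an attribute string versus handing over floats read directly from encoded bytes.

```java
import java.nio.ByteBuffer;

// Hypothetical callback; NOT the actual XSBC ElementReader interface.
interface TypedAttributeHandler {
    void floatArrayAttribute(String name, float[] values);
}

final class BinaryDeliverySketch {
    /** String path: what a SAX consumer must do with an attribute value. */
    static float[] parseFromString(String value) {
        String[] tokens = value.trim().split("\\s+");
        float[] out = new float[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = Float.parseFloat(tokens[i]); // text-to-float conversion per value
        }
        return out;
    }

    /** Binary path: floats are read straight from the bytes, with no string step. */
    static void deliverFromBinary(String name, ByteBuffer encoded, int count,
                                  TypedAttributeHandler handler) {
        float[] values = new float[count];
        encoded.asFloatBuffer().get(values);
        handler.floatArrayAttribute(name, values);
    }

    public static void main(String[] args) {
        // Assumed encoding for this sketch: three big-endian 32-bit floats.
        ByteBuffer encoded = ByteBuffer.allocate(12);
        encoded.putFloat(1f).putFloat(2f).putFloat(3f);
        encoded.flip();
        deliverFromBinary("size", encoded, 3,
                (name, values) -> System.out.println(name + " has " + values.length + " floats"));
    }
}
```

In the string path every value costs a text-to-float conversion in the application, while in the binary path the application receives typed data it can use immediately.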

Future Work

XSBC has been offered as a royalty-free exemplar algorithm for the following efforts:

Once XSBC reaches version 1.0 functionality and performance, we expect to add a variety of functional improvements, including the following:

Algorithm Documentation

You can find a paper on the algorithm design for XSBC at NPS XSBC Documentation, and a paper on the algorithm design for FEC at NPS FEC Documentation.

Please note that XSBC used to be named XFSP and is named such in the XSBC paper.