The application of XSL for XML transformations in e-business solutions
Mark Colan


Abstract
The Extensible Markup Language (“XML”) sets a standard for the exchange of business data that is completely platform- and vendor-neutral. XML is in increasingly wide use for web applications, especially for business-to-business integration.
Because XML data comes in many forms, the most important technology needed for XML applications is the ability to transform it from one form to another (“vocabulary translation”), and to convert it to visible renderings, for example in HTML or PDF documents.
The Extensible Stylesheet Language (“XSL”) specification, from the W3C standards body, defines a powerful language to easily transform XML data from one form to another. This paper introduces XSL and studies several application scenarios that benefit from the use of XSL to solve real-world e-business problems.

Contents
  1. The Tower of Babel problem
  2. The solution: XSL transformations
  3. XSL application scenarios: rendering XML as HTML
  4. XSL application scenarios: transcoding
  5. XSL application scenarios: application integration
  6. XSL application scenarios: business integration
  7. XSL application scenarios: portals
  8. XSL application scenarios: code generation
  9. Limits of mechanical translation
  10. Conclusions

The Tower of Babel problem
The Extensible Markup Language standard (“XML”) is now two years old, and a lot of progress has been made since the W3C “recommended” the XML specification. XML.ORG, a registry and repository for XML vocabularies overseen by OASIS, now has well over a hundred standard vocabularies for industry-specific usage. With the ebXML initiative, standards common to all industries (purchase orders and the like) will begin to emerge. Even so, compared to the enormous potential of using XML for web-based applications, these are still the early days.
Some might fear that a large number of vocabularies represent a fragmentation in the standard. To the contrary, XML is intended as a meta-language for establishing these vocabularies.
XML differs from HTML in that it describes the data, but not its presentation. While XML can easily be understood by programmers and programs, we need to be able to display the data on web pages and in page-oriented documents. To maximize the flexibility of using this data, the presentation should be specified outside of the XML document, for example using stylesheets to define its appearance.
We recognize that the unique business structures that give each company its own competitive edge can be represented in private vocabularies. Companies can organize their departments the same as individual enterprises, again with vocabularies that reflect their way of doing business. But ultimately information in the private definitions must be converted to a public standard for exchange with other organizations.
We also expect that new versions of vocabularies, even with completely different structures, are bound to replace the old as we learn better ways to do business.
All of this points to a need for automatic conversion from one form of XML to another, from XML to HTML, and from XML to completely different presentation formats, such as PDF. What we need, then, is a general way to accomplish mechanical translations from XML to all of these different forms.
The solution: XSL transformations
The Extensible Stylesheet Language specification, known as XSL, describes powerful tools to accomplish the required transformation of XML data. XSL consists of the XSLT language for transformation, and Formatting Objects (“FO’s”), a vocabulary for describing the layout of documents. XSLT uses XPath, a separate specification that describes a means of addressing XML documents and defining simple queries. The XSLT and XPath 1.0 specifications are complete, having become W3C “Recommendations” on November 16, 1999 (see http://www.w3.org/Style/XSL/). The XSL specification (which also describes Formatting Objects) is expected to reach W3C recommended status soon.
There are now several implementations of processors for XSLT. In particular, the Xalan project from the Apache Software Foundation (see http://xml.apache.org) is a robust and highly compliant XSLT and XPath implementation. This tool was donated to Apache by IBM; it was developed at Lotus Development Corporation, an IBM company, by Scott Boag and his team. While Boag’s team continues to develop Xalan, being part of Apache means Xalan will enjoy contributions from individuals and other companies in the industry. With the XSLT specification in place and with the release of Xalan 1.0 in March 2000, XSLT is now stable and ready for real-world use.
The XSLT language offers powerful means of transforming XML documents into other forms, producing XML, HTML, and other formats. Its features include sorting, selecting, and numbering, among many others. It operates by reading a stylesheet, which consists of one or more templates, then matching the templates as it visits the nodes of the XML document. Template matches can be based on names and patterns. Templates include literal text that is used in the resulting transform, interspersed with directives to include specific data. With this technique, transformations are declared “by example”, a simple programming model.
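As a minimal sketch of this model, the Java program below runs a two-template stylesheet through the standard JAXP API (javax.xml.transform), which is an assumption of this example rather than something the paper prescribes. The parts-list vocabulary and the class name TemplateByExample are invented for illustration. Each template mixes literal result elements with xsl:value-of directives, and xsl:sort orders the parts by name.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TemplateByExample {
    // Hypothetical parts list; the element names are invented for this sketch.
    static final String PARTS_XML =
        "<parts>"
      + "<part><name>resistor</name><price>0.10</price></part>"
      + "<part><name>capacitor</name><price>0.25</price></part>"
      + "</parts>";

    // One template per element name: literal markup plus directives that pull in data.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='parts'>"
      + "<ul><xsl:apply-templates select='part'>"
      + "<xsl:sort select='name'/>"
      + "</xsl:apply-templates></ul>"
      + "</xsl:template>"
      + "<xsl:template match='part'>"
      + "<li><xsl:value-of select='name'/>: <xsl:value-of select='price'/></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies a stylesheet to a document, both held in memory as strings.
    static String transform(String xslt, String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(STYLESHEET, PARTS_XML));
    }
}
```

The resulting list should place the capacitor before the resistor, because the sort key is the part name rather than document order.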
XSLT is not a general-purpose programming language in the sense of Java or C++. For example, symbolic “variables” cannot be reassigned a new value, so they are really constant definitions. This limitation means that counters and accumulators are not available. Java-like “for” or “while” statements are also not available; instead, iteration can be accomplished using recursion.
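The recursion idiom can be sketched as follows. The named template count calls itself with an incremented parameter instead of updating a counter. The input element repeat, its times attribute, and the class name RecursiveCount are hypothetical, and the stylesheet is run here through the standard JAXP API as an assumption of this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class RecursiveCount {
    // The 'count' template emits $i, then calls itself with $i + 1 until $i exceeds $max.
    // No variable is ever reassigned; each call receives a fresh parameter value.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:call-template name='count'>"
      + "<xsl:with-param name='max' select='number(/repeat/@times)'/>"
      + "</xsl:call-template>"
      + "</xsl:template>"
      + "<xsl:template name='count'>"
      + "<xsl:param name='i' select='1'/>"
      + "<xsl:param name='max'/>"
      + "<xsl:if test='$i &lt;= $max'>"
      + "<xsl:value-of select='$i'/>"
      + "<xsl:if test='$i &lt; $max'>,</xsl:if>"
      + "<xsl:call-template name='count'>"
      + "<xsl:with-param name='i' select='$i + 1'/>"
      + "<xsl:with-param name='max' select='$max'/>"
      + "</xsl:call-template>"
      + "</xsl:if>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Builds a tiny hypothetical input document and returns the comma-separated count.
    static String countTo(int n) {
        String xml = "<repeat times='" + n + "'/>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countTo(3)); // prints "1,2,3"
    }
}
```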
The limitations in the language definition are intended to support powerful optimization techniques. XSLT supports an escape hatch to allow processing more easily done in conventional languages: the extension mechanism can call modules developed, for example, in Java.
The most important feature of XSL is the ability to develop transformations quickly, with few lines of code. A transformation that could be developed and tested in an hour might take days to write using Java, even when an off-the-shelf XML parser like Xerces (again, contributed to Apache by IBM) is used. One could write transformations in Perl, using XML4P to add XML parsing and DOM access support, but for many transformations it would be faster to use XSL.
XSL is a very new technology, and as an industry we have only begun to invent various uses for it. In the following sections, we see some of the ways in which it is used in these days of its infancy. These are not intended as design patterns or definitive approaches, but rather as examples of the many ways in which it may be employed. The purpose in studying these approaches is to stimulate your thinking for solving problems with XSL in ways that we have not yet invented.
XSL application scenarios: rendering XML as HTML
XSL was originally developed with conversion of XML to HTML in mind, hence “Stylesheet” is its middle name. In this role, XSL can be run on the client, using a stylesheet either local to the client’s system, or one stored as a resource on a server. Using XSL on the client allows processing to be distributed to each client’s computer.
Most corporations find it more convenient to offload processing from client workstations to server computers. This simplifies the task of upgrading the power of an entire system; if more power is needed, the server can be upgraded or supplemented with other servers, for scalability. An important advantage to the use of servers is that applications can be upgraded in only one place, on the server, rather than requiring a redeployment of application software on many client machines.
XSL works well on a server. A common way to provide access is to use Servlets which respond to a client’s request by starting XSL and returning the resulting stream.
One can even imagine an architecture where it is used both on a server and on the client. For example, the server might select records that match a query, and “prune” parts of the tree that contain information not needed by the client to reduce transmission latency. The client could then run XSL locally to format the XML data according to the appearance they want for viewing.
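The server-side pruning step might look like the sketch below: an identity template copies the document unchanged, and an empty template suppresses the subtrees the client does not need. The order vocabulary, the choice of which elements to drop, and the class name PruneSketch are assumptions made for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PruneSketch {
    // Hypothetical order record; 'cost' and 'internalNotes' are not meant for the client.
    static final String ORDER_XML =
        "<order id='7'><item>widget</item><cost>3.20</cost>"
      + "<internalNotes>rush</internalNotes></order>";

    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        // Identity template: copy every attribute and node as-is.
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
        // Empty template: matching elements produce no output, so they are pruned.
      + "<xsl:template match='cost|internalNotes'/>"
      + "</xsl:stylesheet>";

    static String prune(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(prune(ORDER_XML));
    }
}
```

The name-specific template takes precedence over the identity template because XSLT gives a name test a higher default priority than node().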
Recent studies have concluded that browsing will become a small part of e-business in the coming years. They suggest that even though there are uncountable web sites today, a larger use of the internet (some say ten times as much) will be in the exchange of information in XML from one server to another, in scenarios that do not include a browser. Thus, business-to-business integration frequently involves vocabulary translation, that is, translating from one XML application to another, rather than transformation of XML to HTML.
XSL application scenarios: transcoding
Translation on demand, whether to HTML, XML, or some other form, is recognized as a common use case on an application server. A transcoding server can be as simple as a servlet that can accept requests for a specified document rendered for a specific device or XML vocabulary, and run XSL to produce and return the result.
IBM has recently introduced a product called the WebSphere Transcoding Publisher (see http://www.software.ibm.com/developer/features/feat-transcoding.html), which automatically provides XSL translation on demand. It is capable of rendering XML to several different forms. As such, it is the logical extension of the server XSL transformation model discussed in the previous section.
Transcoding can be used to create HTML renderings, or PDF (via XSL Formatting Objects and a formatting objects processor like James Tauber’s FOP on Apache), thus supporting conventional desktop and laptop clients, as discussed in the previous section.
It can also reformat the data to WML and other forms suitable for handheld devices. Doing so often requires pruning the data to a simpler form, as well as adapting it to the device requirements for handhelds. In the “Copernicus” project, IBM used transcoding technology to build a system with SABRE’s travel management system coupled to Nokia intelligent telephones. Information from SABRE is transcoded to an appropriate form for the device and sent to the device, at which point the mobile user can make changes to his itinerary as required, using HTTPS to talk to specialized business objects on the server. The flexibility of the transcoding technology allows the system to expand to support many other types of handheld devices, even when they involve vocabularies other than WML.
Finally, aside from converting XML to devices for direct client use, the Transcoding Publisher can be used for automated vocabulary translation, such as may be required for business-to-business transactions.
The major advantage of the transcoding server model is that it can start with support for a few devices, then add stylesheets to support others as the need arises. In addition to the applications listed above, it could be used to support traditional print media (newspapers, magazines, books) as well as web publishing, or even the new offline e-book readers. It could support a fax-on-demand system. Cars will eventually be able to connect to the network, and transcoding can be set up to send information in the form they require. As set-top boxes integrate our home entertainment system with the home computer, transcoding will play a role.
IBM’s Transcoding Publisher runs the XSL processor from a servlet used to handle requests. It also supports caching of transformed data, so that multiple requests for the same transformation do not require running XSL for each request.
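The general shape of such a server can be sketched in plain Java. This is not the Transcoding Publisher's actual design or API; the class TranscodingSketch, the target names "html" and "wml", the story vocabulary, and the cache key format are all assumptions of this example. One compiled stylesheet is registered per target, and results are cached so that a repeated request does not run XSL again.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class TranscodingSketch {
    // The 'xml' output method keeps the output of this sketch predictable.
    private static final String HEAD =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>";

    // Hypothetical stylesheets for a hypothetical 'story' vocabulary.
    static final String TO_HTML = HEAD
      + "<xsl:template match='/story'>"
      + "<html><body><h1><xsl:value-of select='title'/></h1>"
      + "<p><xsl:value-of select='body'/></p></body></html>"
      + "</xsl:template></xsl:stylesheet>";
    // Handheld rendering: pruned to the title only.
    static final String TO_WML = HEAD
      + "<xsl:template match='/story'>"
      + "<wml><card><p><xsl:value-of select='title'/></p></card></wml>"
      + "</xsl:template></xsl:stylesheet>";

    private final Map<String, Templates> stylesheets = new ConcurrentHashMap<>();
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger transformsRun = new AtomicInteger();

    TranscodingSketch() {
        TransformerFactory factory = TransformerFactory.newInstance();
        try {
            // Compile each stylesheet once; Templates objects can be reused across requests.
            stylesheets.put("html", factory.newTemplates(new StreamSource(new StringReader(TO_HTML))));
            stylesheets.put("wml", factory.newTemplates(new StreamSource(new StringReader(TO_WML))));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet did not compile", e);
        }
    }

    // Returns the document rendered for the target, running XSL only on a cache miss.
    String render(String docId, String xml, String target) {
        Templates templates = stylesheets.get(target);
        if (templates == null) {
            throw new IllegalArgumentException("no stylesheet registered for " + target);
        }
        return cache.computeIfAbsent(docId + "|" + target, key -> run(templates, xml));
    }

    private String run(Templates templates, String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                new StreamSource(new StringReader(xml)), new StreamResult(out));
            transformsRun.incrementAndGet();
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        TranscodingSketch server = new TranscodingSketch();
        String xml = "<story><title>Rates fall</title><body>Details follow.</body></story>";
        System.out.println(server.render("s1", xml, "html"));
        System.out.println(server.render("s1", xml, "wml"));
    }
}
```

Supporting a new device in this arrangement means registering one more stylesheet under a new target name; the request-handling code does not change. A real server would also need to invalidate cached results when the source document changes, which this sketch omits.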
XSL application scenarios: application integration
XML is being embraced by every major software vendor. The ability to emit XML, and to incorporate data expressed in XML, is being added to most software products where it makes sense.
Because XML is a common and portable data format that is, or will be, available in these products, there is a tremendous opportunity to use XML data to integrate software into a complete system. However, because the XML data may be in a variety of vocabularies, we may need a quick and mechanical means of converting it from the form we got it into the one we need.
We can imagine that a company’s internal structure might evolve into a series of entities with well-defined interfaces, and XML vocabularies that reflect their function. In this sense, the company’s structure begins to resemble the structure of business-to-business between companies, on a smaller scale.
In the diagram shown below, XML is the exchange medium between departments of a company, and XSL is used to transform data from the private form favored by a department into a form needed for processing in another.
The same model can be applied to the exchange of information between companies.
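A vocabulary translation of this kind can be sketched as a single template that maps a department's private element and attribute names onto a shared form. Both vocabularies here (po/cust/ln on the private side, PurchaseOrder/Buyer/Item on the shared side) and the class name VocabularyTranslation are invented for illustration, and the stylesheet is run through the standard JAXP API as an assumption of this example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class VocabularyTranslation {
    // Hypothetical private form used inside one department.
    static final String PRIVATE_XML =
        "<po><cust>ACME</cust><ln sku='A-1' q='2'/><ln sku='B-9' q='5'/></po>";

    // Maps the private names onto a hypothetical shared vocabulary.
    // The {...} attribute value templates copy values from the source attributes.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/po'>"
      + "<PurchaseOrder>"
      + "<Buyer><xsl:value-of select='cust'/></Buyer>"
      + "<xsl:for-each select='ln'>"
      + "<Item partNumber='{@sku}' quantity='{@q}'/>"
      + "</xsl:for-each>"
      + "</PurchaseOrder>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    static String translate(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(translate(PRIVATE_XML));
    }
}
```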
XSL application scenarios: business integration
We are seeing a new trend in developing companies where one company specializes in one aspect of a complete business cycle. Such companies optimize their processes to be cost-effective. Since on their own they may not be able to provide certain products or services, they may seek complementary products or services from other small companies, together offering the complete product or service required by their consumers. This arrangement might be a one-time partnership, or may exist in the long term. For all intents and purposes, a “soft merger” of this type begins to look like a “virtual company”. Indeed, the virtual company may have a name different from the partners involved in creating the service or product.
In the new economy, this kind of business aggregation requires the ability to respond quickly to a new opportunity. When companies expose their services and products as processes represented in XML, it is possible to use XSL, with little programming, to assemble an operating e-business from the partners’ component systems. Such companies can be described as “integration-ready”. Prior to the standardization of XML and XSL, building virtual companies from partners could take a long time (days, weeks, or months) to configure middleware to work together and to write the required business logic. While XML and XSL do not eliminate these requirements, they do provide a quickly implemented and efficient means of aggregating the partners’ business data.
In most cases there is no requirement that a company be involved in only one such partnership. One could easily imagine a company that specializes in, say, warehousing and fulfillment, providing the same service to a large number of partnerships. The picture above shows a company participating in two “virtual companies”.
XSL application scenarios: portals
Portals like “myYahoo” are familiar to many web users. They allow the client to design a custom home page with live, updated information according to the user’s wishes. “myYahoo” allows the user to request an up-to-date weather forecast for their area, current stock prices they want to watch, news headlines, and the like, gathering data from many sources. This information is combined into a single web page that has different parts of the screen allocated to presenting each part of the customized report.
This model can also benefit a business worker. Suppose a clerk is employed to manage the supply of a particular line of parts needed for his company’s manufacturing process. A portal could be designed to display prices or availability for certain critical components from various vendors. Information from the company’s ERP system, such as inventory and forecasted demand, can be incorporated on the same page. The similarity with the “myYahoo” type portal is the ability to gather data from a variety of resources, select according to a user’s profile, and format the data for a particular screen.
When the sources of such data can provide it in XML, XSL can be used to automate the transformation required for portals. One can imagine sending HTML streams to sub-objects on the browser as a means of managing regions for display.
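One possible arrangement, sketched below, wraps the XML fragments selected for a user in a single container document and lets one stylesheet render each source vocabulary into its own region of the page. The weather and quotes vocabularies, the portal wrapper element, and the class name PortalPage are hypothetical, and the sources are assumed to arrive as XML fragments without an XML declaration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class PortalPage {
    // Hypothetical data from two independent sources.
    static final String WEATHER = "<weather city='Boston'>Sunny</weather>";
    static final String QUOTES = "<quotes><quote symbol='IBM'>112.50</quote></quotes>";

    // One template per source vocabulary; each one fills its own region of the page.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='/portal'>"
      + "<html><body><xsl:apply-templates/></body></html>"
      + "</xsl:template>"
      + "<xsl:template match='weather'>"
      + "<div id='weather'><xsl:value-of select='@city'/>: <xsl:value-of select='.'/></div>"
      + "</xsl:template>"
      + "<xsl:template match='quotes'>"
      + "<div id='stocks'><xsl:for-each select='quote'>"
      + "<p><xsl:value-of select='@symbol'/>: <xsl:value-of select='.'/></p>"
      + "</xsl:for-each></div>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Wraps the fragments chosen for this user in one document, then renders the page.
    static String renderPage(String... sources) {
        StringBuilder combined = new StringBuilder("<portal>");
        for (String source : sources) {
            combined.append(source);
        }
        combined.append("</portal>");
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(combined.toString())),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(renderPage(WEATHER, QUOTES));
    }
}
```

A user profile that omits stock prices would simply call renderPage without the quotes fragment, and the corresponding region would not appear.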
XSL application scenarios: code generation
In all of the examples above, XML is treated as data to be converted from one form to another, either for consumption by a client or by another server. Yet another way of using XML is to generate procedural code based on specifications described by XML data.
For example, IBM has recently announced the submission to XML.ORG of a technology called “Trading Partner Agreements Markup Language”, or tpaML, for consideration as a standard (see http://www.software.ibm.com/developer/library/tpaml.html). A Trading Partner Agreement used to be a paper document created by the lawyers of two potential business partners. IBM recognizes the value of coding such agreements electronically, so that the terms of the agreement can be implemented as software. This is especially helpful for the new aspects of starting e-business with another company, such as the technical details needed to configure the middleware servers of each partner to begin the conversation.
A filled-out tpaML document can be interpreted by a program which generates Java code to configure the middleware on each side automatically. We refer to the source as an “executable document”. The Java code can be generated using XSL, or it may be more practical to use Java business logic mixed with DOM traversal.
Of course, the software products on both sides would need to support tpaML, but they need not be the same product. Generating Java code for configuration of the servers provides a rapid way of getting set up for e-business transactions. The alternative involves manual configuration of the software, possibly writing additional code, a process which could take weeks or even months.
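The XSL route to code generation can be sketched with the text output method, which lets a stylesheet emit Java source instead of markup. The input below is not tpaML; the agreement and endpoint elements, the generated PartnerConfig class, and the class name CodeGenSketch are all invented for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CodeGenSketch {
    // Hypothetical agreement fragment; real tpaML documents look different.
    static final String AGREEMENT_XML =
        "<agreement><endpoint host='partner.example.com' port='8443'/></agreement>";

    // method='text' emits plain text; xsl:text controls whitespace, &#10; is a newline.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/agreement'>"
      + "<xsl:text>public final class PartnerConfig {&#10;</xsl:text>"
      + "<xsl:text>    public static final String HOST = \"</xsl:text>"
      + "<xsl:value-of select='endpoint/@host'/>"
      + "<xsl:text>\";&#10;</xsl:text>"
      + "<xsl:text>    public static final int PORT = </xsl:text>"
      + "<xsl:value-of select='endpoint/@port'/>"
      + "<xsl:text>;&#10;}&#10;</xsl:text>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Returns generated Java source text for the given agreement document.
    static String generate(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(generate(AGREEMENT_XML));
    }
}
```

A stylesheet like this copies values into the generated source without validating them, so a production generator would need to check or escape the input first. That is one reason Java logic with DOM traversal may be the more practical choice.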
The sections above list just a few application categories where XSL can be gainfully employed; we expect many other usages to emerge as the technology is embraced by creative developers around the world.
Limits of mechanical translation
XSL can solve many problems by translating XML mechanically. However, it is just one tool, and it won’t address every need for changing XML documents.
The language itself is not intended as a general-purpose programming language. Unlike Java or C++, for example, variables can be set only once; they are really more like symbolic constants in that respect. They cannot be incremented, so loop counting is not possible. If there is a need to parse a “lastname, firstname” string into separate components, it can be done in XSL, but not easily. Such situations may call for the use of extensions plugged into XSL. With the Java version of Xalan, Java classes can be used to extend the power of the XSL processor.
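For the simplest case, a well-formed string with exactly one comma, the split can be sketched with the XPath 1.0 functions substring-before, substring-after, and normalize-space. The name and person vocabularies and the class name NameSplit are assumptions of this example. Inputs with missing commas, suffixes, or several commas would need more elaborate handling, and that is where an extension written in Java becomes attractive.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class NameSplit {
    // Splits on the first comma only; normalize-space trims the surrounding blanks.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='name'>"
      + "<person>"
      + "<first><xsl:value-of select=\"normalize-space(substring-after(., ','))\"/></first>"
      + "<last><xsl:value-of select=\"normalize-space(substring-before(., ','))\"/></last>"
      + "</person>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Wraps the raw string in a hypothetical <name> element and transforms it.
    // The caller is assumed to pass text without XML markup characters.
    static String split(String lastCommaFirst) {
        String xml = "<name>" + lastCommaFirst + "</name>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(split("Lovelace, Ada"));
    }
}
```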
Mechanical translation must be done with care. When converting from one vocabulary to another, it is important to consider the meaning of the data between tags, not just the tag name. Even with a common tag name like <name>, we cannot be sure what the name means: customer name? company name? or something else?
In addition to the meaning of the data, the format of the data must be understood. When combining listings from two catalogs of electronic parts, for example, the specifications of particular components must be expressed in a similar standard. The working voltage of a capacitor could be expressed as a fixed value, a range, or a fixed value with a percent tolerance. The application which eventually consumes such data may understand only one form.
Both of these problems are best addressed by having the vocabularies be very well defined and agreed between companies. XML.ORG oversees the definition and development of such vocabularies within an industry, and it is important that the specifications reflect the input of all companies that will be using the vocabulary for e-business.
Conclusions
XSL is a powerful transformation facility that provides mechanical translation of XML documents from one form to another. It can convert to HTML, another XML vocabulary, or text which is not XML at all. Many transformations can be designed using only an XSL processor, and it is possible to add extensions to the processor to support particular requirements that are not easy using only XSL.
We have studied several scenarios where XSL has a role. We recognize that these initial ideas about using XSL represent solutions to certain problems we see today, but that XSL can be used in many ways that have yet to be invented.
Finally, XSL by itself cannot address all incompatibilities between XML documents. When vocabularies are not well defined, either by the exact meaning of a tag or the exact format of the data associated with it, mechanical translation will not solve the problem. This underscores the importance of developing well-defined standard vocabularies for e-business usage under the auspices of a neutral standards organization such as XML.ORG.