XML schema design for business-to-business e-commerce
Arofan T. Gregory


Abstract
No abstract was provided for this paper.

Contents
  1. Introduction
  2. XML and EDI
    1. The EDI heritage
      1. The good
      2. The bad
      3. The ugly
    2. What does XML bring to the party?
      1. Easily processable syntax
      2. Tools
      3. "Webbiness"
    3. Why XML is not an unmixed blessing
      1. Ignorant application
      2. Arrogant newcomers
      3. "Messages" aren't (only) documents!
    4. Schema as the answer
      1. Strong datatyping
      2. Manageability and componentization
      3. Extension/refinement: less is more (more or less...)
      4. Data and documents
  3. How do you work this thing?
    1. The goal of business libraries: interoperability
      1. Business-to-business vs. application-to-application
      2. Vertical orientation vs. horizontal orientation
    2. A suggested process for creating your business documents
      1. Process outline
      2. A couple of minor tips
      3. Existing resources - using EDI
    3. Initiatives, standards, and libraries - using XML
    4. Process and context, messaging, security, and architecture
      1. Process and context
      2. XML messaging
      3. Digitally signed XML
      4. Summary
  4. Where no one's gone before
    1. Interoperability and scalability: future requirements and capabilities
      1. EDI was small
      2. Dynamic trading
      3. Portal services vs. point-to-point
      4. Negotiating business process and contracts
      5. Maintenance of systems across the virtual marketplace
      6. Industry-vertical information demands
    2. Global trading
      1. Regional variation
      2. Customs, transport, and the supply chain
      3. Internationalization and translation requirements
      4. The repository
    3. Summary

Introduction
This talk addresses design considerations for maximizing the utility of XML schema languages in creating documents for business-to-business e-commerce applications, from the perspective of a developer designing an e-commerce system or a manager responsible for system implementation. In discussing the usefulness of XML Schemas, it touches on many related technologies, but restricts itself to the technological enabling of e-commerce, avoiding discussion of the many legal and business issues as having no direct bearing on schema design.
As never before, the emerging trends in business-to-business e-commerce are demanding greater levels of interoperability. This is very much a continuation of the original vision of SGML: standardization enabling the use of complex information across platforms and applications. The radical redefinition of the domain across which information may be reused has raised the stakes, and places much higher demands on the technology.
Of all current standards, XML Schema is the most important in living up to this challenge. The power of XML is also its greatest weakness: on the one hand, we can describe any data that our Internet applications care about, to whatever level of detail is needed. On the other hand, we all need to agree on what that data is. XML alone is not enough.
XML and EDI
The e-commerce world is clearly divided into two camps: EDI and XML. The interplay between these camps is interesting to watch - members of each have gone and flirted with the enemy, generally producing interesting (but not extremely useful) results. Both groups are faced with the same challenges and the same frustrations. They have different strengths, and different weaknesses. This section attempts to summarize how to combine the best of each, and how XML Schema gives us the ability to do this.
The EDI heritage
EDI - whether X12, UN/EDIFACT, or any other flavor - was an early technology that helped many large companies realize a tremendous vision, generally at great expense. In this way, it is extremely similar to SGML. The vision of electronic purchasing is very seductive: it increases efficiency, saves money, and offers capabilities that were not available to people working with traditional systems. EDI - used in its restrictive sense, because in truth we are all involved in "electronic document interchange" regardless of our preferred syntax - was the only way to realize this vision for many years.
The good
EDI is most notable in two ways. First, implementors from the EDI world have ten years and more of experience wrestling with the problems that have gone mainstream with the popularity of Internet-based B2B e-commerce. We are foolish to ignore this experience, whether in terms of the technology problems and issues that exist in this space, or in the ideas and lessons learned in the standardization process. Second, EDI messages today provide us with the best standard descriptions of practically useful semantics for e-commerce.
It's not as simple as copying EDI messages in XML syntax, however - although many seem to feel that this approach is all we need. If you take any EDI "standard" message set - be it X12, EANCOM, UN/EDIFACT, OBI, or any other - you will find that many of the tags are not useful, or are not actually used in real-world implementations. A big part of leveraging the EDI experience lies in knowing which standard EDI constructs were successful and useful, and which ones were anomalous.
One good place to look for this kind of information is in the efforts to create common subsets of the EDI standards - the most prominent of these efforts is SIMPL-EDI, which recommends a set of "core" EDI constructs to be used for the standard procurement scenario. There are other, similar efforts elsewhere in the EDI world.
Another positive lesson to be gathered from the EDI world is the value of well-done documentation. As compared to SGML and XML DTDs, EDI implementation guidelines are generally superior. (As an example, look at the OBI specifications for the purchase order message, available for download on their website.)
The bad
EDI is not only a good thing, however. It is also a really bad thing. The more negative aspects of EDI languages include tag bloat and syntax overloading. Because implementation guidelines are not formally explicit, there is a high degree of latitude in how any given segment is used. Sometimes, developers simply use things for non-standard purposes. While it is true that a finite set of tags can describe all useful data structures, this approach generally results in a very large tag set that is difficult to learn and utilize. EDI has embraced this approach, and it has caused many problems with non-standard implementations.
EDI provides us with examples of messages that simultaneously provide multiple ways to encode a single bit of information, implemented such that some of the standard constructs are overloaded. This sort of thing is the direct result of the "kitchen sink" approach to building standards. As people in the EDI standards world are fond of saying: "everybody uses just 20% - the problem is that none of them use the same 20%."
Ultimately, many of these problems stem from having a syntax that does not have a mechanism for formal validation. There is no easy way to determine conformance, resulting in very high integration costs, tag bloat, and overloading.
The ugly
Consider the following illustrations, and ask yourself: "What am I looking at?"
(I don't know about you, but I don't ever want to hear my five-year-old daughter ask, "Mommy, why did Daddy go blind?")
What does XML bring to the party?
As demonstrated above, XML represents a relief from some of the more negative aspects of EDI. It has many well-known minor benefits: it is cross-platform, it understands Unicode natively, it is human-readable, etc. These are not the specific items that set it apart from EDI in the most important way, however.
Easily processable syntax
The fact that XML is a formal meta-language that is susceptible to standard machine validation is a major differentiator. The existence of XML parsers means that applications don't have to do their own structural validation. This alone reduces the cost of integration by minimizing the amount of code that needs to be written by developers.
Further, it is very easy to render XML documents as DOM trees, event streams, or other programmatic models. Many of these are supported in standard development packages, further reducing the cost of building applications. Ultimately, this stems from having a syntax that is designed to carry a wealth of metadata, and to make it available to processing systems.
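As a minimal sketch of the two programmatic models just mentioned, the following uses Python's standard library to read the same fragment once as a DOM tree and once as an event stream. The `Order`, `Line`, `Part`, and `Qty` element names are hypothetical and are not taken from any standard. The last function shows the parser doing the structural (well-formedness) check that the application would otherwise have to code itself.

```python
import io
import xml.dom.minidom
import xml.etree.ElementTree as ET

# Hypothetical order fragment; the element names are illustrative only.
ORDER_XML = (
    "<Order>"
    "<Line><Part>A-100</Part><Qty>3</Qty></Line>"
    "<Line><Part>B-200</Part><Qty>1</Qty></Line>"
    "</Order>"
)

def parts_from_dom(xml_text):
    """Tree model: load the whole document, then navigate it."""
    dom = xml.dom.minidom.parseString(xml_text)
    return [node.firstChild.data for node in dom.getElementsByTagName("Part")]

def parts_from_events(xml_text):
    """Event model: react to each element as the parser finishes reading it."""
    found = []
    stream = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == "Part":
            found.append(elem.text)
    return found

def is_well_formed(xml_text):
    """Let the parser, not application code, reject structurally broken input."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False
```

Both `parts_from_dom(ORDER_XML)` and `parts_from_events(ORDER_XML)` return the same two part numbers, while `is_well_formed("<Order><Line></Order>")` is false because the `Line` element is never closed.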
Tools
XML - as opposed to SGML - increasingly has good tools support. Browsers, development tools, code libraries, distributed programming objects like COM and javabeans, repositories, databases, editing tools - the list of XML-enabled software packages is long and growing. These are not all specialized tools for working with just XML - in many cases, traditional enterprise tools have expanded to support XML. Not only will this reduce the cost of software, but also the cost of integration and system maintenance - XML programming skills are becoming increasingly more common.
"Webbiness"
Another - albeit less measurable - advantage of XML is cultural. Because XML is a web technology, it encourages experimentation and the "running code" ethic. Thinking tends to be more revolutionary, and developers at all levels tend to be self-empowered. EDI, as a result of its long history, tends to be a more conservative type of high-tech culture. In some ways, the "webbiness" of XML enables us to address old problems with fresh energy and enthusiasm.
Why XML is not an unmixed blessing
Just as with EDI, XML is not an unmixed blessing. It has been misused by those in the EDI world as well as those from the XML camp.
Ignorant application
Something that struck many people in the EDI world as an obvious way of doing "XML EDI" was to take the EDI syntax and describe it with tags. Thus, you could have an element like <SEGMENT CODE="17485"/>. There is a real problem with this approach: it allows you to pass EDI messages back and forth over the wire, but does not allow you to leverage any of the structural validation that is an inherent part of an XML implementation. (You will notice that most EDI users are already capable of sending EDI messages over the wire to those who can understand them.) The point of XML is that it makes business semantics easy to read, easy to process, and easy to validate. This approach to XML-EDI does none of these things.
This phenomenon takes place because it allows EDI users to say that they are "doing XML," without having to put any additional effort into doing it right. What is sacrificed in doing this is any meaningful benefit that would result from using the new technology. To their credit, both X12 and UN/CEFACT have independently realized this.
Arrogant newcomers
The arrogance with which some members of the XML community approach the very real challenges of business-to-business e-commerce is truly astounding. In truth, this is more indicative of naivete than it is of general personality defects - XML does possess many technological advantages in solving the same problems as EDI, but that does not mean that the problems themselves are simple. A corollary to the kind of arrogance that is sometimes seen when XML developers do "EDI" is taking a completely US-centric view of the problem. International trade has many complexities that are not found in domestic e-commerce in the US.
"Messages" aren't (only) documents!
Another negative aspect of XML technology is that it comes to us from an information-publishing tradition, with its own set of inherent assumptions about what aspects of functionality are important. This is most evident when we consider how XML DTDs are used to validate documents: they can validate the structural aspects of documents with great power, telling us where the rules have been violated, and how. What they fail to do usefully is datatyping: there is one basic "string" datatype used for leaf-node element content, and a set of strange and not terribly useful datatypes for attribute values (loose numeric types, strings without whitespace, etc.)
The key thing to realize here is that EDI messages are "documents," in one sense, but they are also something more - they are "messages" which require very strong data validation to be useful in an e-commerce setting. One can look at e-commerce as a set of relational systems talking to each other over the wire with documents/messages. It becomes easy to see why having strong data validation - the kind of validation that evolved in the EDI world around codelists, for example - is so important. XML wasn't built to do this kind of validation.
Schema as the answer
XML DTDs are simply not enough. However, XML schemas allow us to solve many of the technical challenges presented by e-commerce requirements, and to leverage the best of EDI technology.
Strong datatyping
Schema languages contain native datatypes that align with the standard datatypes of some programming languages: strong numeric types like signed integers, floats, doubles, etc.; enumerated lists; date, time, and datetime; and, of course, strings. XML Schema languages such as SOX, XDR, and the W3C Recommendation for XSDL further allow us to describe custom datatypes that are even more useful for particular applications. It becomes possible to capture code-lists as enumerations, to limit the length of string types to fit into typical database fields, and to describe numeric types very exactly.
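To make the code-list and string-length cases concrete, here is a small sketch. The schema fragment is written in the syntax of W3C XML Schema as it was eventually published, which is an assumption on my part, and the `CurrencyCode` and `PartNumber` types are hypothetical. Python's standard library has no schema validator, so the sketch reads just two facets (`enumeration` and `maxLength`) out of the fragment and checks values against them. It is an illustration of the idea, not a conforming validator.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical simple types; names and values are illustrative only.
SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="CurrencyCode">
    <xs:restriction base="xs:string">
      <xs:enumeration value="USD"/>
      <xs:enumeration value="EUR"/>
      <xs:enumeration value="JPY"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="PartNumber">
    <xs:restriction base="xs:string">
      <xs:maxLength value="10"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""

def load_facets(schema_text):
    """Collect the enumeration and maxLength facets of each named simple type."""
    facets = {}
    for simple_type in ET.fromstring(schema_text).iter(XS + "simpleType"):
        restriction = simple_type.find(XS + "restriction")
        allowed = [e.get("value") for e in restriction.findall(XS + "enumeration")]
        max_len = restriction.find(XS + "maxLength")
        facets[simple_type.get("name")] = {
            "enumeration": allowed or None,
            "maxLength": int(max_len.get("value")) if max_len is not None else None,
        }
    return facets

def conforms(facets, type_name, value):
    """Check one string value against the facets recorded for its type."""
    rule = facets[type_name]
    if rule["enumeration"] is not None and value not in rule["enumeration"]:
        return False
    if rule["maxLength"] is not None and len(value) > rule["maxLength"]:
        return False
    return True

FACETS = load_facets(SCHEMA)
```

With these definitions, `conforms(FACETS, "CurrencyCode", "EUR")` holds and `conforms(FACETS, "CurrencyCode", "GBP")` does not, which is exactly the kind of code-list check a DTD cannot express.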
Manageability and componentization
Further, some schema languages (SOX and XSDL, for example, and to some extent XDR) have inherent capabilities for reusing components by doing the kind of importing and referencing that are found in programming languages such as Java. It becomes possible to take another person's schemas - assuming they are publicly available in the correct schema language - and to incorporate them into schemas that you are creating.
This is not a problem-free area, largely because of a lack of a standard scoping mechanism. There are some solutions here, however: the ability to use hierarchically structured namespaces in SOX, for example, allows tight control of element re-use, something not possible with the standard XML namespaces and DTDs. XSDL makes provision for even better scope control, and this could potentially become a standard mechanism for component reuse.
Component reuse is very important when we are seeking to produce interoperability, simply by helping us to guarantee leaf-level correspondence across business documents. (This point will be addressed in more detail below).
Extension/refinement: less is more (more or less...)
Another aspect of XML schema that helps us solve the tough problems of e-commerce is the ability to not only reuse another person's data models, but to "refine" them. In the best case, this mechanism gives us access to the chain of inheritance of a given data structure, in the same way that class-based object-oriented programming languages do. You can take another person's data type or element structure, and "subclass" it by adding (and, in the case of XSDL, subtracting) pieces that you need specifically. At runtime, processing applications can identify what the differences are between the parent class and its subclasses, enabling polymorphic processing (that is, using a child in place of the parent, where the parent was required) and default processing (handling just those bits of the data that you understand, instead of choking because some of the data is not what was expected).
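A rough sketch of polymorphic and default processing follows. It assumes the chain of inheritance has already been read from the schemas into a simple lookup table (`TYPE_PARENTS`), and that the element name identifies the type. Both are simplifications of my own, as are the hypothetical `Address` and `ShippingAddress` types. The receiving application knows only the parent type.

```python
import xml.etree.ElementTree as ET

# Fields of a hypothetical base "Address" type, as known to the receiver.
BASE_ADDRESS_FIELDS = ["Street", "City", "PostalCode"]

# Hypothetical chain of inheritance: child type -> parent type.
TYPE_PARENTS = {"Address": None, "ShippingAddress": "Address"}

def is_a(type_name, expected):
    """Walk up the declared chain of inheritance looking for the expected type."""
    while type_name is not None:
        if type_name == expected:
            return True
        type_name = TYPE_PARENTS.get(type_name)
    return False

def read_address(xml_text):
    """Accept an Address or any refinement of it; skip fields we do not know."""
    elem = ET.fromstring(xml_text)
    if not is_a(elem.tag, "Address"):
        raise ValueError("not an Address or a refinement of one: " + elem.tag)
    # Default processing: handle the parts we understand...
    known = {name: elem.findtext(name) for name in BASE_ADDRESS_FIELDS}
    # ...and note, rather than choke on, the parts added by the subclass.
    skipped = [child.tag for child in elem if child.tag not in BASE_ADDRESS_FIELDS]
    return known, skipped
```

Given a `ShippingAddress` that adds a `DockNumber` element, `read_address` still returns the street, city, and postal code, and reports `DockNumber` as skipped instead of failing.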
The application of this capability from a design perspective is that we can build minimized components in anticipation that they will be extended to fit particular requirements. Less becomes more.
For supporting global interoperability, the ability to extend, reuse, rename, and refine other people's components is a major enabling technology. While the capability alone is not sufficient - we must have well-documented (and preferably automated) methodologies for doing this - it does give us the raw power to solve many difficult problems in accommodating required variation within a class of business documents.
Data and documents
Schema - when properly employed - also makes it fairly easy to translate document structures (the business "message") into normalized relational structures. This is one of the places where XDR has been effectively used. If we remember that e-commerce can be seen as a system in which relational systems trade information by transmitting documents, the "graphing" capabilities of XML schema languages can provide us with lots of power. DTDs couldn't do this effectively - especially since the strongest datatyping (such as it was) existed in attribute values, and the simplest way to handle graphing requirements was to use element structures. DTDs just weren't designed to support this requirement.
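The mapping from a business message to normalized relational structures can be sketched as follows, using an in-memory SQLite database. The document layout and the table and column names are hypothetical. A real mapping would be driven by the schema rather than hand-written, so this should be read only as an illustration of the target shape.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical purchase order; element and attribute names are illustrative.
ORDER_XML = """
<Order number="PO-1001">
  <Buyer>Example Buyer Inc.</Buyer>
  <Line><Part>A-100</Part><Qty>3</Qty></Line>
  <Line><Part>B-200</Part><Qty>1</Qty></Line>
</Order>
"""

def load_order(conn, xml_text):
    """Split one order document into a header row and one row per line item."""
    order = ET.fromstring(xml_text)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (number TEXT PRIMARY KEY, buyer TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS order_lines ("
        "order_number TEXT REFERENCES orders(number), part TEXT, qty INTEGER)"
    )
    number = order.get("number")
    conn.execute(
        "INSERT INTO orders VALUES (?, ?)", (number, order.findtext("Buyer"))
    )
    # The repeating Line element becomes a child table keyed by order number.
    conn.executemany(
        "INSERT INTO order_lines VALUES (?, ?, ?)",
        [
            (number, line.findtext("Part"), int(line.findtext("Qty")))
            for line in order.findall("Line")
        ],
    )
```

Note that the `int(...)` conversion is where weak datatyping would hurt: with only a DTD, nothing guarantees that `Qty` holds a number before the load is attempted.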
How do you work this thing?
The title of this section is taken from a song describing the plight of those overwhelmed by any type of technology - whether primeval biological technology or the kind of meta-data-laden, heavily abstracted technology we are dealing with here - without having recourse to the reference manual as an option.
How do you work this thing? / That's what I wanna find out. / Everything it connects with / Only leaves a kink in the spout. (Robyn Hitchcock)
In many ways, this describes the internal dialogue of the typical would-be XML Schema developer. I hope to point out many of the issues that should be taken into consideration, and to suggest a basic process for writing schemas for e-commerce, as distinct from traditional document analysis activities for producing DTDs.
The goal of business libraries: interoperability
It is always a good idea to remember what your end goal is, and when you are creating XML Schemas to describe business documents, the basic requirement is simple: interoperability. Even when business documents are intended to be used in a point-to-point scenario (and I believe that this kind of implementation will become less and less common as the e-commerce revolution moves forward), you are still designing documents that hopefully will be usable by trading partners you do not do business with today.
In scenarios where there is some kind of business portal or trading community, the requirements for interoperability rise rapidly. The whole point of having a portal is being able to easily communicate with many other members of the community, without having to do a one-off integration with their business systems.
Business-to-business vs. application-to-application
One of the issues here is the debate between business-to-business approaches to e-commerce interoperability, and application-to-application approaches to the same problems. What lies at the heart of this debate is a question of what it is that we are encoding in our documents - which set of metadata is most important.
Business-to-business applications focus on business semantics, using an approach that is as old as SGML - "If you tell me what it is, I can figure out what to do with it." Because this is a highly abstracted approach to using document interfaces for e-commerce systems, it is a tougher problem space than that addressed by application-to-application approaches. These - by interoperating at the level of the processes performed by typical business systems - pass programmatic interfaces in document form, as opposed to semantic ones.
There is a problem with this approach, however, that becomes worse over time. We live in an age when the capabilities of the most common processing applications grow rapidly, and the predominant systems today may not be in use at all in two years. By hard-coding the application interfaces of our systems into our business documents, we are virtually guaranteeing a maintenance nightmare for ourselves. It is easier in the short term, but more expensive in the long term.
Vertical orientation vs. horizontal orientation
Another way to look at interoperability issues is to examine the kinds of standard business languages that have been proposed to date (and the list is long, and growing). In both the EDI world and the XML world, there has been a phenomenon in which, to render the problem of defining useful semantics tractable, a vertical industry has described business data with the semantics that are commonly used within that vertical. This was a very common - and generally successful - approach used in the SGML world for technical documentation: DocBook, CALS, and PCIS are all examples of the successful use of this approach.
The problem of interoperability is greater in the e-commerce space, however, and for a very simple reason. While many businesses within an industry use a single set of semantics to refer to the technical information describing their products, buyers in the e-commerce world inevitably want to purchase goods from other industry verticals. Some very successful industry-specific initiatives such as IOTP and RosettaNet have a good deal of "horizontal" value, but because they have defined semantics in terms of a vertical industry, they have a hard time translating into other vertical industry sectors.
A suggested process for creating your business documents
What follows is a brief description of the basic process that I would recommend for those looking to create business documents. I recently saw a presentation in which the speaker claimed that there was no documented methodology for writing business schemas. This is both true and false: while there is not a specific literature around the creation of XML schemas for e-commerce, there is a parallel literature that can be of great value, coming from the SGML methodologies for performing document analysis and writing DTDs. This section is presented in reference to the "known" document analysis approach to building these types of applications. For those looking for a single source of information, I would recommend the book "Developing SGML DTDs" (Maler, Eve; El Andaloussi, Jeanne. Developing SGML DTDs: From Text to Model to Markup. Englewood Cliffs, NJ: PTR Prentice Hall, 1996. Extent: 560 pages. ISBN: 0-13-309881-8.)
Process outline
The basic steps for creating document structures such as schemas and DTDs are very similar to those of any kind of software application or systems development. This is just a quick outline - more detail is given below about particular points of interest.
A couple of minor tips
In my experience, it is a very good idea not to use mixed content in XML business documents. Some schema languages - such as SOX - don't even allow it. Whitespace handling in XML is a topic of much debate, but it can be fairly neatly avoided by simply not allowing mixed content in your documents.
Don't be afraid to "plagiarize." When defining element structures, data types, codelists, etc., it is a really good idea to literally use what others have produced, particularly if there are relevant standards for what you are doing. In the US, there is a legal argument that claims that "business forms" such as XML schemas are not even intellectual property. Most people and organizations who publish XML schemas and implementation guidelines are happy to see others build on what they have done.
Design for extension and refinement. If you correctly analyze your data structures in terms of the type relationships, you will establish a set of archetypal components that can then be usefully subclassed. This approach will maximize the flexibility of your data structures, and can be seen at work in many places - look at xCBL, ebXML, and even the modelling work done in the UN/EDIFACT working groups.
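The first of these tips, avoiding mixed content, is easy to check mechanically. The following is a small sketch using Python's standard library. The function name is my own, and whitespace-only text between child elements is deliberately treated as harmless indentation rather than as content.

```python
import xml.etree.ElementTree as ET

def has_mixed_content(xml_text):
    """Return True if any element holds both child elements and real text."""
    for elem in ET.fromstring(xml_text).iter():
        if len(elem) == 0:
            continue  # leaf element: text-only content is fine
        # Text before the first child element.
        if (elem.text or "").strip():
            return True
        # Text following any child element, still inside this element.
        if any((child.tail or "").strip() for child in elem):
            return True
    return False
```

A pretty-printed `<Order>` containing only a `<Qty>` child passes the check, while `<Note>Ship <Qty>3</Qty> units today</Note>` is flagged as mixed.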
Existing resources - using EDI
There are many EDI sources that are extremely useful when creating business schemas. Chief among these are the standards bodies or similar initiatives (X12, UN/CEFACT, EAN, OBI, RosettaNet). Typically, you can get implementation guidelines that will give you fully documented transaction sets consisting of the documents needed to carry out a particular exchange. In some cases there are also good process models.
It is strongly suggested that, when looking at any given EDI transaction set, you have recourse to someone who has worked with that particular transaction set before and understands which segments are used in reality, and which ones are hold-overs still included only for legacy support, or are simply not used.
The EDI messages represent the single best source of information today regarding the semantics of e-commerce messages. There are many forward-thinking movements within the EDI world that are worth paying attention to: the BSR, object-oriented EDI, interactive EDI, and others.
Initiatives, standards, and libraries - using XML
What follows is a list of some of the better resources that exist inside the XML world. In some cases, these initiatives span both the EDI and the XML technology spheres. Each has a brief description, but is worth taking a close look at.
Process and context, messaging, security, and architecture
There is a whole set of "architectural" aspects to e-commerce systems that cannot safely be ignored when designing schemas to describe the messages exchanged by applications. This section gives a brief summary of the major areas from this perspective.
Process and context
The EDI standards world has developed a very intriguing notion of "context," which is the set of descriptors that define a single point within the entire realm of e-commerce exchanges. "Context" involves not only the specific business process, but also such factors as the industry which that process is serving, what regional or international aspects exist, and so on.
The idea is that "context" can provide a useful way of characterizing the applicability of a particular message within the entire e-commerce space. In reality, one Purchase Order cannot be used by everyone, because there are different demands on the data that must be transmitted within different contexts to support the same basic business process. (The joke goes, "One size does not fit all, and size matters!")
There are also a number of interested groups that have attempted to define standard business processes and sub-processes, most often using UML as a modelling technique, and some XML-based syntax for describing the process flow. (This is very important work, and the best example is XMI.) ebXML is using both context and processable XML "choreography" descriptions to help define the way in which data components may be used and extended, and is further attempting to define reusable sub-processes that will themselves function as components for describing business processes.
When designing documents meant to maximize interoperability, one must consider the effects of different business processes and contexts. This will influence the way in which data elements are extended and refined, particularly, and may well impact how a componentization scheme is organized.
XML messaging
XML messaging is an interesting topic, but is not one that we will cover here, other than to point out one important rule: Do not mix "messaging" information with business data. What this means is that you should let things like guaranteed delivery, transmission errors, addressing across multiple protocols, and routing information live at the level of the "envelope" that carries your message, whether that is a BizTalk wrapper, a MIME envelope, or anything else. Don't require a delivery application to read your business document, or you will find yourself severely limited as to how your documents can be used.
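The separation can be sketched like this. The `Envelope`, `Header`, and `Body` element names are hypothetical stand-ins for whatever wrapper is actually used, and are not the vocabulary of BizTalk, MIME, or any other real envelope format. The point is only that the routing function reads the header and never looks inside the business document.

```python
import xml.etree.ElementTree as ET

def wrap(payload_xml, to_address, message_id):
    """Put delivery details in a header, and the business document in a body."""
    envelope = ET.Element("Envelope")
    header = ET.SubElement(envelope, "Header")
    ET.SubElement(header, "To").text = to_address
    ET.SubElement(header, "MessageId").text = message_id
    body = ET.SubElement(envelope, "Body")
    body.append(ET.fromstring(payload_xml))
    return ET.tostring(envelope, encoding="unicode")

def route(envelope_xml):
    """Delivery code reads the header only and never inspects the body."""
    header = ET.fromstring(envelope_xml).find("Header")
    return header.findtext("To")
```

Because `route` depends only on the header, the same delivery code can carry a purchase order, an invoice, or a document type that did not exist when the delivery code was written.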
Digitally signed XML
Digital signatures require that, once signed, an XML file not be edited. Any editing will invalidate the signature. This has a tremendous impact on how you design your documents, and is one of the main drivers for the point above. When analyzing a business process flow to create the supporting messages, be very careful that you do not require a message to be edited after it has been created and signed.
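The effect is easy to demonstrate. The sketch below uses a keyed hash (HMAC-SHA256) over the raw bytes as a simplified stand-in for a real XML digital signature, which would normally use public-key cryptography and a canonical form of the document. The key and the sample order are made up for illustration.

```python
import hashlib
import hmac

# Demonstration key only; a real system would not hard-code key material.
SHARED_KEY = b"demo-key-not-for-production"

def sign(document_bytes, key=SHARED_KEY):
    """Compute a keyed digest over the exact bytes of the document."""
    return hmac.new(key, document_bytes, hashlib.sha256).hexdigest()

def verify(document_bytes, signature, key=SHARED_KEY):
    """Recompute the digest and compare it with the one that was sent."""
    return hmac.compare_digest(sign(document_bytes, key), signature)
```

Signing `b"<Order><Qty>3</Qty></Order>"` and then changing the quantity to 30 makes `verify` fail. In this simplified byte-level scheme, even inserting a single space between two tags has the same effect.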
Summary
This discussion of how to create business documents with XML schemas would not be complete without pointing out a simple, important fact: You probably will not have to write your own business documents. If you're smart, you will leverage all of the work that already exists in the various standards and examples discussed above, which means that it may be easier to simply tweak someone's existing product, and hopefully do it with their cooperation and within their customization methodology.
If any single effort currently deserves attention, it is the ebXML Harmonization effort. This will be further discussed in the next section, but is probably the best example to date of an attempt to truly reap the benefits of both EDI and XML.
Where no one's gone before
The use of XML schemas in business-to-business e-commerce is in its infancy. The simple fact that the W3C was late producing a draft recommendation for a standard XML schema language has done much to halt the use of schemas in e-commerce. This is rapidly changing, and the next section is an attempt to describe those aspects of this area that deserve notice, and that will potentially impact your decisions about schema design.
Interoperability and scalability: future requirements and capabilities
First, we must look at what the requirements are likely to be from the standpoint of scalability.
EDI was small
EDI was used by a relatively small number of organizations. Due to the prevalence of the internet inside the business world, and the growing availability of software packages that can be used by small- and medium-sized businesses to get into the game, we can anticipate a much larger number of trading partners than has heretofore been the case.
This is especially true because the initial cost to get into the e-commerce game is being greatly reduced by the web-based portal phenomenon. If a business only needs to integrate once to connect with a reasonable number of trading partners, then many more businesses will take part, and the issues around scalability become correspondingly more critical.
Dynamic trading
Another aspect of e-commerce in the future is that trading will become more dynamic. With architectures designed to provide rich metadata about the product and service offerings on the web, it should become possible to negotiate contracts, and conduct transactions, in real time using computer programs designed to leverage this metadata.
The implications of this for schema design are many, but mostly focus around the assumptions a designer can make about how data must be described. It is no longer enough to determine the semantics used by two trading partners, and to encode that set of semantics into their business documents. The burden will be to design generally useful semantics that can be easily extended to account for nuances introduced by new trading partners. This is not an easy task! Polymorphism, of course, will be very helpful if we do our document design properly, by designing super-classes intended to be renamed for specific use.
Portal services vs. point-to-point
In the EDI world, almost all business services (as distinct from VAN-based services around actually transmitting the messages) were provided either by the buyer or the supplier. There is no concept of value-added "middle-man" services at the level of the trading community. The existence of portals is changing this. Portals can now serve as a place where value can be added in many different ways to support many different types of transactions, from translation of catalog content to payment processing to tracking of shipping to visibility up and down the supply chain.
Because this kind of services provision impacts the flow of a business process (its "choreography" in terms of message exchange) it directly impacts the data that is carried by each of the messages making up the overall process.
Negotiating business process and contracts
Further, in a dynamic trading world, it is possible that business process and contracts will be negotiated on the fly, according to metadata that trading partners supply about themselves. This places a huge premium on being able to handle different "contexts" with the same basic message, and to be able to vary the content of particular messages based on how the business-process exchange is defined at run time.
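One way to picture "the same basic message in different contexts" is a fixed core of fields plus additional fields that become required only when a given context applies. The Python sketch below makes that assumption explicit; the field names and context names (cross_border, prepaid) are invented for illustration and do not come from any standard.

```python
from typing import Dict, Iterable, List, Set

# Core fields every version of the message carries (hypothetical).
BASE_FIELDS: Set[str] = {"order_id", "buyer", "seller", "items"}

# Extra fields required only in particular contexts (hypothetical).
CONTEXT_FIELDS: Dict[str, Set[str]] = {
    "cross_border": {"customs_code", "incoterms"},
    "prepaid": {"payment_reference"},
}


def required_fields(contexts: Iterable[str]) -> Set[str]:
    """Union of the core fields and those demanded by each active context."""
    required = set(BASE_FIELDS)
    for context in contexts:
        required |= CONTEXT_FIELDS.get(context, set())
    return required


def missing_fields(message: dict, contexts: Iterable[str]) -> List[str]:
    """Fields the negotiated contexts require that the message lacks."""
    return sorted(required_fields(contexts) - set(message))
```

Because the contexts are passed in as data, they can be decided at run time, for instance after the trading partners have exchanged metadata about themselves.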
Maintenance of systems across the virtual marketplace
Virtual marketplaces existing at specific portals make the problem of software version-compatibility a nightmare. With enough metadata, and with usable default processing, componentization, and support for polymorphism, the issues of backward compatibility are greatly simplified. If your trading partner can describe which versions of which data objects they support, then all-at-once migration from one message format to another is not required for every member of the trading community.
One good example of how such a problem can be solved is DCOM. From there, we can extrapolate how the finite nature of the problem space can be used to build on this type of approach.
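The idea that a trading partner describes which versions of which data objects it supports can be sketched as a simple negotiation: each side publishes a profile, and the exchange uses the highest version both sides share. The profile layout and the function name common_version below are assumptions made for this example, not part of any existing specification.

```python
from typing import Dict, Optional, Set

# A profile maps a document type to the set of versions a partner supports.
Profile = Dict[str, Set[int]]


def common_version(ours: Profile, theirs: Profile, doc_type: str) -> Optional[int]:
    """Return the highest version of doc_type supported by both partners,
    or None if they share no version."""
    shared = ours.get(doc_type, set()) & theirs.get(doc_type, set())
    return max(shared) if shared else None
```

With profiles like these, one member of the community can move to a newer message format while continuing to serve partners who have not yet migrated.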
Industry-vertical information demands
One aspect of industry-specific marketplaces will be having a mechanism to accommodate information sets specific to that industry. Again, this problem can be solved through good design, and the judicious application of extensions, default processing, and polymorphism. This is really an aspect of "context," and one that will consistently make demands on our "standard" message structures.
Global trading
One of the great promises of e-commerce portals is the existence of a global network of virtual marketplaces, capable of being interconnected. The ability to monitor the international supply chain could solve many very difficult problems that have remained intractable for many years. Assuming that it is possible to create such a network - and theoretically, at least, it is - there are a number of issues that will come into play.
Regional variation
Yet another aspect of "context," the differing trading practices, regulations, and conventions existing in different regions of the world will need to be accommodated, putting even more emphasis on the ability to default process and extend/refine data structures. This also has many architectural implications, but these are mostly not within the realm of business document design.
Customs, transport, and the supply chain
International trade has long been plagued by a set of very tough requirements, many the legacy of complex, paper-based systems. In order to determine the cost of a purchase, the cost of shipping must be determined. Transport - often multi-modal, and by definition crossing international boundaries - must be arranged, and letters of credit must be requested and obtained. Even without considering the various forms of insurance, repackaging, and the third-party arrangements through which much of this trade is accomplished, we can see the need for an inter-connected network of transactions, whose documents are integral to the processing of other transactions. (For example, without proof of various other transactions, many banks will not honor letters of credit.)
There are several impacts on business document design: the focus shifts from designing a single document, to identifying cross-document reuse of information, such that it can be managed across the entire interlocked set of transactions. A premium is placed on having a solid componentization scheme, and further, being able to easily place this information into persistent database stores that can help manage the entire flow, and serve as a source for the required paper documentation that enables shipments to pass through customs.
While this sounds daunting, the promise here is immense. If we look at the current inability to track the location of multimodal transport en route - mostly because different shippers use different tracking numbers - then we can see that there is a huge value to managing the supply chain. Annoying issues such as government-required statistics reporting become simple value-add services at the portal. The list goes on and on. Within the EDIFACT world, there is an awareness of many of these problems. What is lacking is any idea of how to implement a solution. With schema-based messages, we begin to see our way to how such a system could be built.
Internationalization and translation requirements
Translation of business information is not much needed, with the notable exception of catalog content. At the same time - and especially with international trade documents, which frequently contain several languages and currencies as different parts of the same document - these issues cannot safely be ignored. To give a brief list of examples of the effects of multi-language and international support requirements on business documents:
One thing to take into consideration is the extreme difficulty of combining good structural validation and multiple "parallel" language-versions of content in a single document. Add to this the fact that most tag-enabled translation programs are not set up to handle this kind of data structure, and you will quickly see that having parallel documents makes much more sense than having parallel elements in the same document. Note that this creates a requirement of its own: each document must carry enough data to establish its association with its parallel documents.
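A minimal sketch of the parallel-documents approach follows, assuming each language version carries a shared group identifier and a language code. The class CatalogDoc, its field names, and the fallback-to-English rule are all hypothetical choices made for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CatalogDoc:
    group_id: str  # shared by all language versions of one document (hypothetical)
    lang: str      # language code of this version, e.g. "en" or "de"
    body: str


def find_version(docs: Iterable[CatalogDoc], group_id: str,
                 lang: str, fallback: str = "en") -> Optional[CatalogDoc]:
    """Return the requested language version of a document group,
    falling back to another language if that version is absent."""
    by_lang = {d.lang: d for d in docs if d.group_id == group_id}
    return by_lang.get(lang) or by_lang.get(fallback)
```

Each document here validates on its own against a single-language structure, and the group identifier is the only data needed to tie the versions together.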
The repository
As we get further into the possibilities moving ahead, we begin to find that repositories, supplying schemas and other information to processes at run time, are a critical aspect of the portal-based systems. Rather than being mere distribution points, repositories will need to be well-designed to meet the needs of run-time systems. We may find that the standard registry/repository interfaces currently being proposed by XML.org and ebXML will be critical in enabling a global network of trading communities. Such functionality as default processing and polymorphism requires access to the extended schemas, and negotiated choreographies and business contracts will also require trading-partner profile information that will only be encountered at run time.
All of the significant architecture specifications - CommerceNet's eCo Architecture, ebXML, and BizTalk, to name three - recognize the importance of having a functional repository to perform these kinds of tasks.
Summary
When HTML was first published, the hypertext experts of the SGML world claimed that it was insufficient to do "serious" hypertext. No doubt they were right, but the sheer demand for any amount of functionality in this area, and the simplicity of HTML, made it a phenomenon. Now, more "serious" applications of Web-based technology are driving the data formats used in more complicated directions. Undoubtedly, we will make many mistakes in the use of XML-based technology in the e-commerce arena, and hopefully learn from them. It is also true, however, that the use of XML schemas is a critical part of developing usable systems in this area, and today is a focus for the development of useful data structures for Internet-based trading.