GCA

TECHNICAL TRACK
WEDNESDAY, MARCH 1



8:30 am
Morning Keynote: XML Stands on the Shoulders of an IT Giant
(presentation coming soon)
Benoit Lheureux, Gartner Group


Session #5: Small Devices

9:00 am
XML and Jini - On Using XML and the "JAVA Border Service Architecture" to Integrate Mobile Devices into the JAVA Intelligent Network Infrastructure (presentation coming soon)
Stefan Mueller-Wilken, Research Assistant, University of Hamburg
smueller@informatik.uni-hamburg.de


Biography:
Stefan Mueller-Wilken has been with Prof. Dr. Lamersdorf's Distributed Systems Group at the University of Hamburg since 1994. After finishing his master's thesis in Computer Science, on service mediation in distributed middleware environments, he became a research assistant in the same group in 1997. His main research centers on how to integrate mobile devices into distributed system environments. Positions as a visiting researcher took him to the Distributed Systems and Technology Center (DSTC) in Brisbane, Australia, and to 'Ericsson Eurolabs Deutschland' (EED) in Aachen, Germany.
Abstract:
Since its introduction early this year, the "JAVA Intelligent Network Architecture" (JINI) has brought a new and fascinating approach to the field of lightweight middleware systems. Using building blocks such as 'resource leasing', 'distributed events' and a centralized 'lookup service' to store registered service offers, JINI offers good potential for realizing highly dynamic distributed computing scenarios with participants ranging from large scale server applications to consumer electronics and mobile systems such as mobile phones and personal digital assistants (PDAs). Among JINI's outstanding features is the ability not only to register the access path of a server application (URL, socket, etc.) as other middleware systems do, but also to register service proxies to be used on the client side to access the service. These proxies can be used to implement a communication method privately shared between client and server. The transfer of a user interface to a human client is also possible. Changes to a JINI application simply lead to different proxies being registered and transferred to the client side, with no modification of the client-side code becoming necessary; everything takes place under the hood the next time the leases are due. While the JINI approach brings great flexibility to the field of distributed system design, there is currently one huge drawback: JINI is inherently based on the JAVA programming language and therefore not accessible from the vast number of small devices such as WAP phones or simple PDAs in use today.
The University of Hamburg is currently developing the 'Hydepark' infrastructure (hyper distributed environment for personal appliances), which will allow for the integration of non-JAVA devices into JINI application scenarios. As one important project, a JAVA border service architecture (JBSA) is being designed to make service GUIs (like those registered as part of JINI service offers) accessible from simple browsing devices (such as WAP phones or PDAs), thus giving a direct means to integrate such appliances into JINI application scenarios. Like other approaches, the JAVA border service is based on the principle of introducing an abstract layer between application logic and presentation layer. This abstract layer uses XML and optimized XSLT processing to transform GUI descriptions into concrete representations in XHTML, WML, VoiceML or VRML. But where Gamma's half bridge pattern or the W3C's XFDL approach rely on modifications to the code at design time, the JAVA border service architecture is based on runtime analysis of the active application and dynamic transformation into the representation requested by the mobile client. We call this approach an 'n+0 tier design', with a mobile client being co-located with a running desktop-client application. In the best case, the application wouldn't even notice the difference between being used locally and being used remotely, and the client device could change while the application is 'alive' - from direct access to access via HTML browser, on to access via WAP telephone and back to access from the desktop PC. To allow for this flexibility, the JAVA border service architecture provides numerous services in addition to the core analysis and transformation functionality:

* authentication support,
* a session management facility,
* pluggable device adapter support,
* an application factory to start and host service GUIs and
* a device- and application classification mechanism
These additional services allow a complete infrastructure for XML-based device integration with the JINI architecture.
The JAVA border service architecture is designed around a JAVA GUI analysis functionality. Using a so-called 'application shadow' that runs in the same virtual machine as the application, the object hierarchy that makes up the user interface is scanned at regular intervals and the results are converted into the Java Swing markup language (JSML), an XML dialect specially designed for Swing GUI representations. This JSML 'snapshot' is routed through the 'BSA gateway' and dispatched to one of the XSLTP instances that, for performance reasons, are held online for each client-side representation style sheet. At this point in time, transformations to XHTML and WML are possible. We plan to integrate support for VoiceML and VRML. Results of this transformation process are then forwarded to the 'external communication adapter' (ECA) corresponding to the target representation, where they are offered to the client device in a manner suitable to the device class (for WAP phones using a WAP gateway, for HTML-aware PDAs through a servlet engine, etc.). Client interactions, such as a button being clicked or a list item being picked, are caught by the 'ECA', forwarded through the 'BSA gateway' and routed back to the 'application shadow', where they are retranslated into GUI events and inserted into the Swing event queue. The application now processes these events as if they originated locally. Results of the client interaction are scanned by the 'application shadow' and the process can start over again.
Early prototypes have led to very promising results with respect to performance and flexibility of the XML based approach chosen for the 'JAVA border service architecture'. This rapid prototyping was only possible through the use of the Extensible Markup Language and XSL Transformations. Ongoing changes to the JSML design can be rapidly incorporated into the architecture by simply adjusting stylesheets as opposed to rewriting large sections of code.
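The scan, snapshot and transform cycle described above can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not show JSML, so the element names (`frame`, `label`, `button`) are invented, a plain dictionary stands in for a live Swing component hierarchy, and a small function stands in for the XSLT stylesheet that would produce WML.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a live Swing component hierarchy.
WIDGETS = {
    "type": "frame", "title": "Tasks",
    "children": [
        {"type": "label", "text": "Open tasks: 3", "children": []},
        {"type": "button", "id": "refresh", "text": "Refresh", "children": []},
    ],
}

def snapshot(widget):
    """Walk the widget tree and return an XML element describing it."""
    attrs = {k: str(v) for k, v in widget.items() if k not in ("type", "children")}
    node = ET.Element(widget["type"], attrs)
    for child in widget["children"]:
        node.append(snapshot(child))
    return node

def to_wml(snap):
    """Stand-in for the stylesheet step: map the snapshot to a WML-like card."""
    wml = ET.Element("wml")
    card = ET.SubElement(wml, "card", {"title": snap.get("title", "")})
    for item in snap:
        p = ET.SubElement(card, "p")
        if item.tag == "button":
            # A button becomes a link that a gateway could map back to a GUI event.
            a = ET.SubElement(p, "a", {"href": "?event=click&id=" + item.get("id")})
            a.text = item.get("text")
        else:
            p.text = item.get("text")
    return wml

wml_text = ET.tostring(to_wml(snapshot(WIDGETS)), encoding="unicode")
print(wml_text)
```

Because the snapshot is ordinary XML, a second target format would only need another transformation function (or, in the architecture described, another stylesheet); the scanning code would stay the same.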
As a consequence, the JBSA is currently being integrated into a first real world application scenario, where fieldworkers can use their mobile phones to gain access to company information (mostly tasks, addresses and dates) stored in central databases and usually presented through a small JAVA application they have on their PC desktops when in the office. While a lot of effort currently goes into design and implementation of the JAVA border service architecture, the 'Hydepark' infrastructure is not restricted to integration of non-JAVA clients into the JINI architecture. In other sub-projects, integration support for non-JAVA services is being realized, and first prototypes using simple PC104-based servers within the JINI environment have been built.
In short, Hydepark offers the following benefits for distributing applications:
* JINI Services become available to non-Java clients.
* All XML based target representations that allow for interaction (like XHTML and WML) may be incorporated.
* Legacy Java Applications may be distributed through the use of this architecture.


9:30 am

Using XML to Update Software in Embedded CE Devices
Ken Rabold, Senior Software Engineer, BSQUARE Corporation

kenr@bsquare.com

Biography:
Ken Rabold is a software engineer at BSQUARE Corporation. He is working on providing XML based solutions for Windows CE devices. He has worked with XML since 1998, exchanging information between medical computer information systems.
Abstract:
The BSQUARE software deployment application incorporates XML as the basis for software deployment and device configuration for embedded Windows CE. Placing the responsibility for initiating software updates on the CE device (pull) rather than on a dedicated centralized server (push), the software deployer uses XML to form a custom software update description language in which update scripts can be written.
The XML based scripts are downloaded by the CE device from a web server, interpreted by the software update component, and executed. Elements within the XML file allow for updating data files, downloading and installing executables or COM objects, CE registry modification, and downloading a whole new operating system image.
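As a rough illustration of how such a script might be interpreted, here is a minimal Python sketch. The update package schema is not shown in this abstract, so the element and attribute names (`update`, `file`, `registry`, `execute`, `src`, `dest`, and so on) and the URLs are hypothetical, and the sketch only records the actions it would take instead of performing them.

```python
import xml.etree.ElementTree as ET

# Hypothetical update script; the real package schema is not given in the abstract.
SCRIPT = r"""
<update version="1.1">
  <file src="http://example.com/data/help.txt" dest="\Windows\help.txt"/>
  <registry key="HKLM\Software\Demo" name="Mode" value="kiosk"/>
  <execute src="http://example.com/bin/setup.exe"/>
</update>
"""

def plan(script_text):
    """Return the actions the device would carry out, in document order."""
    handlers = {
        "file": lambda e: ("download", e.get("src"), e.get("dest")),
        "registry": lambda e: ("set-registry", e.get("key"), e.get("name"), e.get("value")),
        "execute": lambda e: ("download-and-run", e.get("src")),
    }
    actions = []
    for element in ET.fromstring(script_text.strip()):
        handler = handlers.get(element.tag)
        if handler is None:
            raise ValueError("unknown update element: " + element.tag)
        actions.append(handler(element))
    return actions

actions = plan(SCRIPT)
for action in actions:
    print(action)
```

Keeping one handler per element name means a new kind of update step could be added without touching the loop that walks the script.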
This presentation describes the use of XML in the software update component, the design of the software update package schema, and how XML and the internet protocols are used on an embedded Windows CE device. The component nature of the product allows it to be incorporated into custom applications and platforms. One such platform, a Windows Based Terminal (WBT), is a natural fit for hosting a software deployment application. WBTs are designed to provide a low Total Cost of Ownership (TCO) by running Windows NT sessions as a terminal client on low cost hardware. To keep the TCO low, remote software updates on WBTs by an administrator are an absolute necessity. By incorporating a software update component into WBTs, BSQUARE is able to provide a mechanism by which network administrators can deploy new versions of software, modify settings on the device, and, through the use of active server pages on the web server, track devices that have requested software updates as well as serve up customized XML update packages.


10:00 am
Hardwired XML
John Aloysius Ogilvie, President, Killdara Corp
jogilvie@killdara.com

Biography:
John Ogilvie received his degree in Systems Engineering in 1983. Since then, John has been a prolific software developer in the U.S., Canada and the U.K., including stints with Norpak, Bell Northern Research (Nortel), Videotron, Virtual Prototypes, Oracle and other innovators in graphic telecommunications and scientific computing. Projects have included medical imaging, remote sensing/mapping, public-access, entertainment, retail and training, among others. John currently runs Killdara Corporation, a venture-funded XML product company.
Abstract:
In this session we will explore an underappreciated area of XML: how it can be used as a 'lingua franca' for communication between intelligent, automated hardware/software devices known as 'bots' (robots). Bots are already common, although usually invisible to the user.
Sophisticated websites and transactional services are built from bots, and they have been embedded in cars, appliances and computers for years. A bot is an autonomous piece of computing power which is dedicated to a specific task, and which has no user interface.
We can see a time in the near future when hundreds of millions of these cheap, simple, single-purpose 'micro-servers' are embedded everywhere. Precursors of these devices are already built by pioneers like Axis Communications and Cobalt; researchers have even built complete webservers which fit in the palm of your hand.
So these microservers are increasingly ubiquitous, but they are unable to speak to one another. There have been interesting initiatives such as JINI, but in my opinion these initiatives are overdesigned (similar to CORBA) and will not be the solution. XML will be the solution.
XML is a good foundation for inter-bot communication for several reasons:
a. It's a comparatively lightweight messaging protocol, so it fits easily into devices which have limited memory and computing power.
b. Messages can be rigidly structured, making them easy to generate and interpret.
c. It's vendor neutral and carries no licensing fees.
d. It can be implemented on any platform using freely-available software components written in Java.
My prediction is that by mid-decade we will take it for granted that we are surrounded by a constant, inaudible XML chatter among the bots. Your home/computer (deliberate punctuation) will host bots which synchronize your life with your colleagues and family, and surf the net for good deals on groceries and airfare. Your office, car, phone, PDA will all have similar responsibilities and capabilities. Cars and garages will know one dialect (DTD), and another dialect will be used between your TV set and PDA. A given bot may use ten different DTDs in a day's operation.
The chatter will be delivered as intermittent, low-bandwidth traffic over conventional TCP/IP networks, often using wireless transmission. The traffic will take a variety of forms: E-mail, web post (HTTP) or file transfer (FTP) will be the basic protocols.
The messages will be concise XML documents, using de facto industry standard DTDs. Of course, someone will first have to design DTDs for "Vending Machine Out Of Order Report" and "Flight Delay Information Request".
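To make the idea of a concise bot-to-bot message concrete, here is a small Python sketch of one bot building a report and another reading it. No such DTD exists in this abstract, so the message vocabulary (`out-of-order-report`, `machine`, `fault`) and the sample values are invented for illustration.

```python
import xml.etree.ElementTree as ET

def build_report(machine_id, fault):
    """Build a small out-of-order report as an XML string."""
    root = ET.Element("out-of-order-report")
    ET.SubElement(root, "machine").text = machine_id
    ET.SubElement(root, "fault").text = fault
    return ET.tostring(root, encoding="unicode")

def read_report(message):
    """Extract the fields a receiving bot would act on."""
    root = ET.fromstring(message)
    if root.tag != "out-of-order-report":
        raise ValueError("unexpected message type: " + root.tag)
    return {"machine": root.findtext("machine"), "fault": root.findtext("fault")}

message = build_report("VM-0042", "coin mechanism jammed")
print(message)
print(read_report(message))
```

The receiver checks the root element first, which is the simplest way for a bot that understands several message types to decide which one it has been handed.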
Where security is an issue, the documents will be digitally encrypted and signed using public-key infrastructure (PKI) techniques. Each bot will have its own unique and legally-recognized digital ID or signature. The chatter will be much more secure than even existing financial transactions.
In summary, we miss the big picture if we think about XML as a document description language or as a way to store documents or present them to users. XML will see its greatest deployment and impact as a 'machine language'. XML = "eXcellent Machine Language".


Session #6: Data


11:00 am
Replacing Two-Phase Commit with an XML/Internet Transaction Model

Walter Perry, Managing Director, net.uniqueness, Inc.
wperry@uniqueness.net

Biography:
Walter E. Perry, Ph.D., is a founder and the CTO of net.uniqueness, Inc., a New York- and London-based firm applying XML to database functions and to databased application support. He has sixteen years' experience developing enterprise systems which apply distributed databased solutions in financial settlements and other transnational processing.
Abstract:
For nearly twenty years two-phase commit (2PC) has been the basis of transaction processing on distributed systems and of peer-to-peer transactions between systems. Improvements in the efficiency and availability of 2PC-based systems have been realized through increasingly reliable hardware and through software which has refined the definition of transactions to a granularity best suited to the environment in which they are processed. The parties to transaction processing are confident of the ability of their counterparties to execute because either both systems are distributed within a single organization or the counterparty's systems are so familiar that they can be treated with the same trust as the enterprise's own. In addition--and perhaps most crucially--the premise of two-phase commit is that each atomic transaction be identically defined by the software of the two systems. If the transactions work at all, that is proof that they are identically understood by both parties.
The current excitement over business-to-business transactions across the Internet should be tempered by the understanding that this will be an increasingly unsuitable environment for two-phase commit and indeed for the past twenty years' common understanding of transaction processing. The inherent topology of the Internet is of autonomous, largely anonymous nodes. Much of the promise of Internet commerce lies in the prospect of doing business with parties previously unknown or inaccessible. Yet even when those prospective counterparties can be identified and reached through the network, much about their systems and processes will remain opaque. It will certainly not be reasonable for a business to act as if those systems and processes are closely analogous to its own. More important, a business cannot expect the central requirement of 2PC-based transaction processing--the identical definition of the transaction--from a counterparty that was never previously an identified participant in that market or industry sector.
XML, through its inherent extensibility, is the crucial tool for defining a transaction model--and more important for building transaction processors--which work in the Internet topology of autonomous, largely anonymous nodes. We must assume a world where neither the boundaries nor the particular constituent components of a transaction are understood identically by two largely anonymous parties. Yet if a transaction is doable at all, it must be understood by each of the parties as an extension or modification of some already familiar transaction, for otherwise they could not comprehend the transaction at all. We cannot expect that one such party will know how--or be willing--to process a transaction in precisely the way, and on precisely the terms, that the other party would, or as that other party might expect its counterparty on a 2PC-based transaction to do. Yet if, through an exchange of messages containing nothing but the specifics of the proposed transaction as each party sees them, each is able to understand in its own terms what the other is proposing, then each can independently execute its part of the transaction, once it has satisfied its own definition of the data required. In other words, instead of two-phase commit, we have autonomous, asynchronous separate execution of differently-composed and differently-bounded transactions, which are nevertheless counterparts to one another in the instance of execution.
This presentation will describe how to implement such a transaction model by exploiting inherent benefits of XML. Key to this is the process of constructing a data model from the parse of a message received and then instantiating the elements of that data in (probably very different) structural terms understood by the receiver. We will describe how the XML markup describing these structures and relationships can be generated from a general-purpose process driven by the parse of each new message. We will also cover how processing constraints can be implemented, in XML markup, very differently on the two systems and yet be simultaneously applied in the execution of each transaction. Finally we will explore what this transaction processing model demands of--and teaches us about--the nature of data vocabularies in general.
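One way to picture the receiving side of such a model is the Python sketch below. It is an assumption-laden illustration, not the presenter's method: the message vocabulary, the receiver's name mapping, and the receiver's list of required data are all hypothetical. The receiver rebuilds the incoming data in its own terms and decides on its own whether it has enough to execute its part.

```python
import xml.etree.ElementTree as ET

# The transaction as the sender sees it (hypothetical vocabulary).
INCOMING = (
    "<purchase><sku>A-17</sku><qty>3</qty>"
    "<unit-price>2.50</unit-price><buyer-ref>PO-9</buyer-ref></purchase>"
)

# The receiver's own terms: which foreign elements it recognises and what it
# calls them, plus the data it requires before it will execute its side.
RECEIVER_NAMES = {"sku": "item", "qty": "quantity", "unit-price": "price"}
RECEIVER_REQUIRES = {"item", "quantity", "price"}

def interpret(message):
    """Rebuild the incoming data in the receiver's own structure."""
    own = {}
    for element in ET.fromstring(message):
        name = RECEIVER_NAMES.get(element.tag)
        if name is not None:  # elements the receiver does not know are ignored
            own[name] = element.text
    return own

def can_execute(own):
    """The receiver decides alone, against its own definition of required data."""
    return RECEIVER_REQUIRES.issubset(own)

own_view = interpret(INCOMING)
print(own_view, can_execute(own_view))
```

Note that nothing here asks the sender to confirm or lock anything: the sender would run its own, differently shaped check against its own requirements.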


11:30 am
XML Messaging and Java/XML/SQL Conversion

David Orchard, IBM
orchard@pacificspirit.com



1:30 pm
Afternoon Keynote: When XML Turns Ugly
David Megginson, Megginson Technologies, conference co-chair


David Megginson of Megginson Technologies currently co-chairs the W3C XML Core Working Group and serves as a member of the W3C's XML Coordination Group. He led the initiative that created SAX, the Simple API for XML, and organized the XMLNews initiative, which promotes the use of open standards for the exchange of news and information. David has been a Linux user since 1993 and has been writing free software for over a decade.

Session #7: APIs

2:00 pm
EasySAX: SAX made Pythonic

Paul Prescod, Consulting Engineer, ISOGEN
paul@prescod.net

Biography:
Paul Prescod is a leading researcher and implementor of document processing technologies. His formal education was in mathematics and computer science from the University of Waterloo. His research interests include formalisms for document modelling, queries and schemata. As a consulting engineer at ISOGEN, he helps organizations apply ISO and W3C standards to large-scale documentation problems.
Among his accomplishments, Paul has been very involved in the development and promotion of new standards. He worked within the XML Working Group of the World Wide Web consortium to develop the XML family of standards and co-wrote the most popular book on that family of standards: The XML Handbook. Paul wrote the first and most popular tutorials on the DSSSL style language and the grove paradigm. He writes widely on other topics both abstract and concrete. On the implementation side, Paul can integrate a wide variety of tools and techniques. He has experience with programming languages such as C++, Python, Java and Omnimark; authoring systems such as FrameMaker+SGML and AdeptEditor and SGML toolkits such as James Clark's SP and Jade.
Abstract:
EasySAX is a high level SAX-based API for working with XML event streams in Python. Where SAX was specifically designed as a low-level API, EasySAX is designed first and foremost to be easy to use, convenient and flexible.
EasySAX has dynamic event handler dispatch mechanisms that make XML processing convenient by building on Python's dynamism. Where SAX users typically dispatch events using switch statements or hand-coded dispatch tables, EasySAX builds a dispatch table automatically based upon method names and metadata.
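The general idea of dispatching on method names can be sketched with Python's standard `xml.sax` module. This is not the EasySAX API, which the abstract does not show; the `start_<name>` / `end_<name>` naming convention and the class names below are assumptions chosen for illustration.

```python
import xml.sax

class NameDispatchHandler(xml.sax.ContentHandler):
    """Route each element event to a method named after the element."""

    def startElement(self, name, attrs):
        method = getattr(self, "start_" + name, None)
        if method is not None:
            method(attrs)

    def endElement(self, name):
        method = getattr(self, "end_" + name, None)
        if method is not None:
            method()

class TitleCollector(NameDispatchHandler):
    """Collect the text of every <title> element; other elements are ignored."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._buffer = None

    def start_title(self, attrs):
        self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def end_title(self):
        self.titles.append("".join(self._buffer))
        self._buffer = None

handler = TitleCollector()
xml.sax.parseString(
    b"<talks><talk><title>EasySAX</title></talk>"
    b"<talk><title>Hardwired XML</title></talk></talks>",
    handler,
)
print(handler.titles)
```

The subclass never mentions element names in a conditional; adding support for another element is a matter of adding another method. Element names that are not valid Python identifiers (for example, names containing hyphens) would need a mapping step that this sketch leaves out.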
EasySAX also combines some of the best features of tree-based and event-based interfaces by allowing trees to be built "on-demand" from portions of parse streams. This allows the performance degradation of tree building to be minimized.
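Python's standard library offers a comparable technique in `xml.dom.pulldom`, which is used here as a stand-in since the EasySAX interface itself is not shown. The sketch streams through a document and builds a DOM subtree only for the elements of interest; the sample document and the `error_entries` function are invented for illustration.

```python
from xml.dom import pulldom

DOC = (
    "<log><entry level='info'>started</entry>"
    "<entry level='error'><msg>disk full</msg></entry></log>"
)

def error_entries(text):
    """Stream through the document, building a subtree only for error entries."""
    found = []
    events = pulldom.parseString(text)
    for event, node in events:
        if (event == pulldom.START_ELEMENT and node.tagName == "entry"
                and node.getAttribute("level") == "error"):
            events.expandNode(node)  # build the DOM subtree for this element only
            found.append(node.toxml())
    return found

errors = error_entries(DOC)
print(errors)
```

Entries that do not match are seen only as passing events and are never assembled into a tree, which is where the saving comes from on large inputs.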
EasySAX is currently in testing and the final release is expected in time for the conference.


2:30 pm
XML in the Java Platform

James Davidson, Staff Engineer, Sun Microsystems
james.davidson@eng.sun.com

Biography:
James Davidson is a staff engineer at Sun Microsystems. He is currently leading the specification for the Java API for XML Parsing. Since joining Sun in 1997, James has worked on the Java Servlet team as the author of the Servlet API specification and on other web technologies. James sits on the W3C DOM Working Group. He has also played an instrumental role in founding the Apache Jakarta project and continues to chair the Jakarta project management committee.
Abstract:
This session will provide a technical overview along with detailed examples of XML technologies in the Java 2 Platform. Attendees will learn about current XML technologies being developed through the Java Community Process, including the Java API for XML Parsing (JAXP), Project Adelard, and XML integration in Java 2 Platform, Enterprise Edition (J2EE). Attendees will also learn how to combine XML and Java technology to create powerful Web applications, from large-scale enterprise applications down to small devices.



3:00 pm
Dynamic Classes API for XML DOM - Read Me First
Robert Houben, Vice President R & D, Liberty Integration Software Inc.
roberth@libertyodbc.com
Dr. Philip Mansfield, President, Schema Software Inc.
philipm@schemasoft.com
Dr. Yuri Khramov, Schema Software Inc.
yurik@schemasoft.com

Biographies:
Robert Houben has 19 years of experience in the software industry. He has authored several ODBC drivers, including the Red Brick ODBC driver and the Liberty ODBC driver, as well as the Liberty JDBC driver. For over 3 years he has deployed eCommerce and eBusiness applications that integrate legacy line-of-business DBMS systems using Web technology. He is a founder of Liberty Integration Software Inc., of Vancouver, Canada.

Philip Mansfield is the president of SchemaSoft and a member of the W3C SVG working group. Prior to SchemaSoft, he worked at Paradigm Development Corp. and taught at the University of Toronto. He received his Ph.D. from Yale University.

Yuri Khramov has more than 20 years of experience in the software industry; he has been involved in XML and other Web technologies for more than 4 years. He is one of the founding partners of SchemaSoft. Prior to that, he worked at Paradigm Development Corp. in Vancouver, Canada; at Graphica in Tokyo; and at several industrial and academic institutions in Moscow. He holds a Ph.D. in Computer Science from Moscow Management Institute.
Abstract:
The acceptance of a technology by the Basic programming community is tantamount to its becoming "mainstream". The XML DOM appears to be the model and the tool of choice for dealing with XML documents; that's why we decided to concentrate on better integration of the XML DOM with Basic, and particularly Visual Basic Script (VBScript).
The goal of the bizDOM project is to provide a tool and API that will make the DOM more natural, easier to use and more in accordance with the way the VB community thinks and works.
The central idea of BizDOM is a dynamic class generation based on the content of a loaded XML document. This feature allowed us to create a very simple and intuitive API well suited to VBScript. The syntax construct for these dynamic classes is called Nodepath.
The following sections describe some of the most important advantages of BizDOM.
1. Tree navigation and node addressing
The ease of addressing different nodes in the DOM tree is crucial for the acceptance of the tool by the programming community. With the existing DOM tools, the programmer has to operate with generic terms and methods. For example, to get the element that represents the fifth <line> inside the <details> section of the <invoice> document, the programmer has to write several lines of code using "getChildList" and "get_Name" methods, iterators, etc.
With our dynamic classes and the Nodepath notation, the application programmer is able to write clear and intuitive code like the following: Invoice.Details.Lines(5)
The Nodepaths also work for addressing attributes.
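The flavour of this notation can be approximated in Python, which is used here only as an illustration language; it is not the bizDOM API, which targets VBScript. In the sketch below, the `NodePath` class, the case-insensitive matching, and the rule that a plural name such as `Lines` also matches `<line>` elements are all assumptions.

```python
import xml.etree.ElementTree as ET

class NodePath:
    """Wrap elements so that children can be reached by attribute-style names."""

    def __init__(self, elements):
        self._elements = elements  # all elements matched so far

    def __getattr__(self, name):
        # Called only for names not found normally: treat them as child tags.
        if name.startswith("_"):
            raise AttributeError(name)
        wanted = {name.lower(), name.lower().rstrip("s")}  # allow Lines -> <line>
        matches = [c for e in self._elements for c in e if c.tag.lower() in wanted]
        if not matches:
            raise AttributeError(name)
        return NodePath(matches)

    def __call__(self, index):
        # 1-based, mirroring the Lines(5) notation in the abstract.
        return NodePath([self._elements[index - 1]])

    @property
    def value(self):
        return self._elements[0].text

lines = "".join("<line>item %d</line>" % i for i in range(1, 7))
root = ET.fromstring("<invoice><details>" + lines + "</details></invoice>")
Invoice = NodePath([root])
print(Invoice.Details.Lines(5).value)
```

The classes are "dynamic" in the sense that nothing about invoices is written into `NodePath`; the names that resolve depend entirely on the document that was loaded.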
2. Collections
VB programmers use collections very widely; constructs like "for each" are ubiquitous in scripts. The current DOM tools lack collections completely, so we implemented APIs that create the VB collections that would be the most important in VB scripts: child elements of a node, and child elements with a specific tag name.
3. Default Properties
The notion of the default property is very important in VB, and nonexistent in the W3C DOM spec. We implemented it such that the default property of an attribute is (naturally) its value; for an element, the default property is the value of its first child text node.
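The rule stated above is simple enough to express directly against a W3C-style DOM. The following Python sketch uses the standard `xml.dom.minidom` module as a stand-in for the underlying DOM; the `default_value` function name and the sample document are hypothetical.

```python
from xml.dom import minidom

def default_value(node):
    """Return the 'default property' of a DOM node as the abstract defines it."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        return node.value  # an attribute defaults to its value
    if node.nodeType == node.ELEMENT_NODE:
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                return child.data  # an element defaults to its first text child
        return None
    raise TypeError("no default value defined for this node type")

doc = minidom.parseString('<line qty="2">widget<note>fragile</note></line>')
line = doc.documentElement
print(default_value(line))
print(default_value(line.getAttributeNode("qty")))
```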
4. W3C DOM Functionality
The philosophy of the implementation was "make the common things simple by implementing special APIs, and keep all the W3C DOM available with the standard interfaces".
To provide the access to the complete W3C functionality we are exposing the "underlying" DOM objects that implement the complete W3C specification. The BizDOM class object provides access to the W3C Document object, and the BizNode object allows the user to access a corresponding W3C Element object. Through those objects, the users can create and access objects of such W3C classes as Attribute, NodeList, NodeMap, etc.
5. Implementation Details
We have implemented bizDOM as an ActiveX object on top of the MS IXML DOM ActiveX object, delegating most W3C DOM functions to it. This method guarantees us full compatibility with the current version and with all versions to come. It also allowed us to reduce the time from the "conception of the product" to the beta version to little more than 3 months.
The bizDOM beta is about to be released, but we already have a number of "early adopters" signed into our beta program. By the end of February, we expect to have many customers using bizDOM in industrial applications.



Session #8: Simplification

4:00 pm
Simplifying XML: New Developments from SML-DEV

Mike Champion, Software AG's XML evangelist and member of W3C
mike.champion@softwareag-usa.com
Simon St. Laurent, Book Author
Don Park, CTO, Docuverse
donpark@docuverse.com

Biographies:
Mike Champion is a long-time member of the Document Object Model Working Group and an author of the core XML portion of the W3C DOM Level 1 Recommendation. He spent several years at Arbortext working on interfaces between an XML authoring system and various XML repositories. He now works for Software AG's development organization and acts as a contact with the W3C and the XML community.

Simon St. Laurent is a web developer, network administrator, computer book author, and XML troublemaker living in Ithaca, NY. His books include XML: A Primer, XML Elements of Style, Building XML Applications, Inside XML DTDs: Scientific and Technical, and Cookies.

Don Park is the CTO of Docuverse, a bleeding-edge company specializing in providing tools and services to the e-commerce industry. Mr. Park has been actively consulting for the past 18 years. As a vocal member of the XML community, he has participated in the design of the SAX and DOM standards. Recently, Mr. Park founded the SML-DEV group to address growing concerns over complexity in XML standards.
Abstract:
Over the last several years, eXtensible Markup Language ("XML") has generated enormous currency in the marketplace. It is sold as the universal syntax for making business information accessible, independent of the software deployed. XML was bootstrapped as a simplification of the popular Standard Generalized Markup Language ("SGML"). XML retained much of SGML's power and existing market acceptance--yet was easier to implement. Since then, XML has made great progress towards making information processes more commoditized, replaceable, and thus accountable.
As a result of its SGML heritage, XML has brought with it a document publishing bias, including features such as external parsed entities, document type definitions ("DTD"), notations, CDATA sections, and the like. However, in many business domains, especially electronic commerce, many of these carry-overs simply aren't needed. In fact, they form the bulk of XML's complexity. They tend to increase development, testing, and training costs, and they hinder interoperability. Certainly XML is a huge improvement over SGML; however, for many domains the simplification was stopped prematurely.
In November 1999, a group of practitioners gathered to continue this simplification. By stripping XML down to the core, they hope to maintain the bulk of XML's applicability while relieving most of its complications. From the start, there was unanimous agreement to eliminate DTDs, notations, external parsed entities, and CDATA sections. The question then became, how much further? As of January 2000, two key SGML features were still up for debate: attributes and mixed content. There are valid reasons for not wanting to include either of these in a simplified markup language. However, there is an equally valid reason to stop just short of these syntax elements. So, rather than choose, the group decided that it may be better to provide two simplifications.
The first simplification, Common XML, maintains both attributes and mixed content. It will be an XML usage guideline, highlighting the most commonly used aspects and clearly marking troublesome areas.
The second simplification, Simple Marker Language ("SML"), goes much further. An element can either have a text value or a list of child elements, but not both. Attributes are gone and so is mixed content. Namespaces are still being discussed, as is the <degenerate/> tag. This minimal bounded tagging language may be especially useful in environments where high performance, a minimal footprint, and/or guaranteed interoperability are important, such as B2B messaging. Further, with simplicity comes a better foundation upon which layered structures can be built. For example, a special text tag could be added to allow for mixed content. A coloring layer could be added to support attributes. And a rhythmic embedding could be used to express alternating map and list structures, like those found in the GROVES model.
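The two structural rules named above (no attributes, and no element with both text and child elements) are easy to check mechanically. The Python sketch below is an illustration only, since the SML specification was still under discussion: the `sml_violations` function is hypothetical, whitespace-only text between child elements is treated as insignificant by assumption, and features that the parser does not expose (DTDs, CDATA sections, entities) are not checked.

```python
import xml.etree.ElementTree as ET

def sml_violations(element, path=""):
    """List the ways an element tree breaks the two SML rules described above."""
    here = path + "/" + element.tag
    problems = []
    if element.attrib:
        problems.append(here + ": attributes are not allowed")
    has_text = bool((element.text or "").strip()) or any(
        (child.tail or "").strip() for child in element
    )
    if len(element) and has_text:
        problems.append(here + ": mixed content is not allowed")
    for child in element:
        problems.extend(sml_violations(child, here))
    return problems

good = ET.fromstring("<order><item>pen</item><qty>2</qty></order>")
bad = ET.fromstring('<order id="7">rush <item>pen</item></order>')
print(sml_violations(good))
print(sml_violations(bad))
```

A document that passes both checks can be read into nested lists and strings with no loss, which is part of the appeal for messaging uses.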
When this group is finished, the Common XML usage guideline and the Simple Marker Language specification will provide simplified subsets of XML, allowing a more granular learning and adoption of tagging systems.


4:30 pm
SML and Ockham's Razor: Too Close a Shave?

Evan Lenz, student, North Seattle Community College
elenz@ricochet.net

Biography:
In 1998, Mr. Lenz received a Bachelor of Music degree from Wheaton College (IL), with majors in piano performance and philosophy. Making a living as a securities trader, he is currently studying web application development at North Seattle Community College and living with his wife in Seattle.
Abstract:
"Entities are not to be multiplied beyond necessity"
-- William of Ockham
The stir of support behind the recently established SML-DEV group is prompting the invocation of Ockham's Razor against XML 1.0. The claim is that, for many simple applications such as e-commerce transactions, XML retains too much unnecessary baggage inherited from document-centric SGML. A drastically simplified subset, "Simple Marker Language," has been proposed. Using only XML's "essential" features, it will purportedly be easier to learn and implement.
The notion that a subset is necessary at this point in time undermines precisely what is revolutionary about XML--its ability to function across many types of applications without losing its identity as one language. Whether for storage, document display, data interchange, or e-commerce messaging, XML has achieved an impressive compromise, and, in terms of industry support, has hit a sweet spot. XML is optimized for broad usage precisely because it is not optimized for any particular usage. The attempt to isolate one application domain and create a subset for it is a classic case of premature optimization.
XML is also revolutionary in its human readability. The advantage to structuring data in a text file as opposed to a binary format is that people can peek inside an XML transaction, for example, and easily edit it by hand. This allows for robust and easily replaceable systems. In the proposed simple subset, the removal of attributes would severely hamper the human readability that made XML so revolutionary. Attributes are conceptually distinct from elements; they provide us with a separate, nonrecursive channel of markup which allows us to structure data in more logical, human-readable ways.

The saving grace in XML is that, while our parsers must still support the XML specs, we don't have to use every XML feature in our applications. If the prospect of structured messages, say, that use only element and text nodes appeals to us, then XML gives us that freedom--without splitting itself into confusing dialects.
The burden is on SML-DEV to demonstrate that the speed and ease of implementation resulting from a simplified subset is so compelling as to warrant the splitting of XML into subsets. This, of course, would not be XML anymore, at least not the XML we know--the one that allows us to speak the same language, choose whatever parsers we want, and use whatever features we like. Another way of stating Ockham's Razor is particularly appropriate here: "Plurality is not to be posited without necessity."



5:00 pm
Tired of complicated specifications? You just RELAX!

Makoto Murata, INSTAC XML SWG
Masayuki Hiyama, INSTAC XML SWG
Motohiro Kosaki, Matsushita AVC Multimedia Software

Biography:
Murata graduated from Kyoto University. He has participated in the XML activity at W3C since 1997. He is also the chair of a Japanese committee (INSTAC XML SWG) which published the Japanese XML Profile as a JIS technical report. He is interested in theoretical aspects of SGML/XML, especially the hedge automaton theory.

Hiyama is a member of the W3C SYMM WG. He is also a member of the INSTAC XML SWG. He has authored a number of DTDs and is interested in hedge regular languages.
Abstract:
RELAX (REgular LAnguage for XML) is a language for representing regular sets of XML documents as grammars. A RELAX grammar generates a set of XML documents. Conversely, XML documents can be validated against a RELAX grammar.
RELAX consists of RELAX core and RELAX modularization. RELAX core provides modules, which declare and constrain elements and attributes in a single namespace. The design of RELAX core (Version 1.0) has been completed, and this presentation is mainly concerned with RELAX core. RELAX modularization provides mechanisms for attaching namespaces to modules and combining these modules to form a single grammar. A white paper on RELAX modularization is expected to be released in early 2000.
A RELAX module consists of rules and patterns. Intuitively speaking, rules correspond to element type declarations and parameter entities used therein, and patterns correspond to attribute list declarations and parameter entities used therein. As a special case, a RELAX grammar of a single namespace is a RELAX module.
RELAX is based on the theory of tree (or hedge) automata. From a RELAX grammar, one can effectively construct a hedge automaton. By executing this hedge automaton, XML documents can be validated against the grammar. Operations on hedge automata can be applied to RELAX grammars so as to examine their properties. In particular, one can examine whether one RELAX grammar is upward-compatible with another by computing the difference of the two grammars.
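The validation idea can be illustrated with a toy example. This Python sketch is not the RELAX validator and does not use RELAX syntax; the RULES table, the state names, and the function names are hypothetical. It is also simplified: it takes the first matching rule, whereas a hedge automaton may in general be non-deterministic, and it ignores text and attributes. Each rule says that an element with a given name is assigned a state when the sequence of its children's states matches a regular expression.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rules: (element name, regex over child states, resulting state).
# Child states are written as space-terminated words, so state names must not
# contain spaces.
RULES = [
    ("doc", r"(title )(para )*", "doc"),
    ("title", r"", "title"),
    ("para", r"", "para"),
]


def state_of(elem, rules):
    """Assign a state to elem bottom-up, or None if no rule applies."""
    child_states = [state_of(child, rules) for child in elem]
    if None in child_states:
        return None
    word = "".join(state + " " for state in child_states)
    for name, pattern, state in rules:
        if elem.tag == name and re.fullmatch(pattern, word):
            return state  # simplification: first matching rule wins
    return None


def is_valid(xml_text, rules, final_states):
    """Accept the document if its root element reaches a final state."""
    root = ET.fromstring(xml_text)
    return state_of(root, rules) in final_states


if __name__ == "__main__":
    good = "<doc><title>T</title><para>a</para><para>b</para></doc>"
    bad = "<doc><para>a</para></doc>"
    print(is_valid(good, RULES, {"doc"}))
    print(is_valid(bad, RULES, {"doc"}))
```

Because states are separate from element names, two rules may give the same element name different states depending on its children. A DTD cannot express this, since it ties each element name to a single content model.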
RELAX is more expressive than DTDs in representing structural constraints on elements and attributes. RELAX, however, does not provide mechanisms for declaring entities, notations, and default values, which are handled by DTDs. Rather, RELAX is intended to be used in conjunction with DTDs; XML documents containing DTDs are first parsed by XML processors and then validated against RELAX grammars.
Unlike XML Schema of W3C, RELAX does not affect the information emitted by XML processors. Thus, existing APIs such as SAX and DOM can be used without loss of information, even when the XML document has an associated RELAX grammar. Information embedded in RELAX grammars can be obtained by parsing RELAX grammars as XML documents, if necessary.
DSD is another proposal based on tree automaton theory. In comparison to DSD, RELAX is simpler, is internationalized, and provides rich datatypes.
A RELAX validator receives a RELAX grammar and an XML document. The validator first invokes some XML processor to parse the grammar and document, and then receives the result via some API. The current prototype uses DOM to access the RELAX grammar and the SAX-like API of XML4C to access the document. A RELAX validator reports either "This document is valid" or "This document is invalid." Some error messages and warnings may be reported as well.
A RELAX validator has been developed in C++ and its source code is available under the GPL. The construction of automata from content models is done by an automaton construction toolkit called Grail. A converter from DTDs to RELAX grammars has been developed in Java and is also freely available under the GPL. The XML spec DTD was converted to RELAX by this program and then revised by hand.

MANAGEMENT TRACK --

TECHNICAL TRACK (Feb 29, March 2)


GCA - Phone: +1 703-519-8160, email: info@gca.org