GCA

Extreme Markup Languages 2000

Tuesday, August 15, 2000

11:00 - 11:45 Plenary
Meaning and interpretation of markup
C. M. Sperberg-McQueen, W3C, MIT Laboratory for Computer Science, Claus Huitfeldt, University of Bergen, and Allen Renear, Brown University

SGML and XML markup signals the occurrence of specific features in a document; based on the markup, the reader may make certain inferences about the marked-up material. If the meaning of element types is expressed formally, then the task of interpreting the markup at a particular location in a document may be formulated as finding the set of inferences about that location which may be drawn on the basis of the markup in the document. Several different approaches to this problem are outlined; they vary in complexity, and indirectly provide a measurement of the relative complexity of different approaches to marking up particular kinds of information.

11:45 - 12:30 Plenary
Knowledge Engineering for the "Ferret" Analytical Engine
James David Mason, Chairman, ISO/IEC JTC1/SC34, Oak Ridge Y-12 Plant

The Topic Map standard and supporting implementations can facilitate the reliable semantic interchange of structured information. Semantic interchange is facilitated by using two levels of metadata: the schema abstraction (a description of how an instance must be structured in order to be valid) and the Topic Map abstraction. Achieving semantic interchange at the schema level requires a mapping from every construct in one schema to the corresponding construct in the other, so with each new schema the number of mappings increases by the number of existing schemas. In contrast, a Topic Map can capture the essence of a schema structure and then define Topic Occurrences to provide the association between different schemas. Associations can also be used to express the dependencies between schema constructs, that is, how schemas can map to existing schemas.
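
The growth argument in this abstract can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the paper: it assumes one mapping per pair of schemas in the schema-level approach, and it assumes (as one reading of the abstract) that a Topic Map acts as a shared hub to which each schema is related once. The function names are hypothetical.

```python
def pairwise_mappings(n_schemas: int) -> int:
    """Mappings needed when every pair of schemas is mapped directly."""
    return n_schemas * (n_schemas - 1) // 2


def hub_mappings(n_schemas: int) -> int:
    """Mappings needed when each schema is related once to a shared hub.

    Treating the Topic Map as such a hub is an assumption made for
    illustration only.
    """
    return n_schemas


if __name__ == "__main__":
    for n in (2, 5, 10):
        print(n, pairwise_mappings(n), hub_mappings(n))
```

Under these assumptions, adding a schema to a set of five costs five new pairwise mappings but only one new hub mapping.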

2:00 - 2:45 Late breaking news
Topic Maps: Designing and modelling relationships within complex content corpora
Ann M Wrightson, Sweet & Maxwell Ltd.

Because of their simplicity and uniformity, Topic Maps can be difficult to apply to complex problems in a structured manner. To alleviate this problem, this presentation offers both a graphical notation for representing and designing topic maps, based on the core abstractions underlying the standard, and examples of structuring complex interrelationships within large corpora of electronic content into distinct domains and categories and of modelling these using topic map abstractions. The examples are based on the author's work with two kinds of complex, highly interrelated content: interactive electronic technical manuals and legal information.

2:45 - 3:30 Late breaking news
Topic Maps: Next Generation
Michel Biezunski, InfoLoom, Inc.

Topic Maps were a largely unknown specification until recently. However, since the ISO standard (ISO/IEC 13250) was published in January 2000, it has gained remarkable momentum, and many now believe it will become the next important information technology. This paper focuses on the issues relating to Topic Maps that must be addressed in order for Topic Maps to be widely adopted in today's web-centric environment.

4:45 - 5:30 Late breaking news
What's in a name? The latest controversy over namespaces
David G. Durand, Dynamic Diagrams

There has been a lot of recent discussion of the meaning of namespace declarations in the W3C and in the larger community. The key issue seems a small technical point: whether the ability to use relative URI references for namespaces in XML is a terrible mistake, and if so, what solutions might be possible. This seemingly simple question has led to more than 3000 email messages on the public discussion list without reaching a satisfactory conclusion as of the composition of this abstract. I will attempt to present the main issues fairly, without unduly favoring my own views (which I will make clear to ease the detection of bias). Whether this will be a discussion of the history of a decision or an update on an ongoing process is as yet unclear.

Wednesday, August 16, 2000

9:45 - 10:30 Yellow Track
Beyond schemas
Scott Vorthmann, Extensibility, Inc., and Jonathan Robie, Software AG

The Schema Adjunct Framework is an XML-based language used to associate task-specific metadata with schemas and their instances, effectively extending the power of existing XML schema languages such as DTDs or XML Schema. This is useful because in many environments additional information which is typically not available in the schema itself is needed to process XML documents. Such information includes mappings to relational databases, indexing parameters for native XML databases, business rules for additional validation, internationalization and localization parameters, or parameters used for presentation and input forms. Some of this information is used for domain-specific validation, some to provide information for domain-specific processing. No schema language provides support for all the information that might be provided at this level, nor should it — instead, we suggest a way to associate such information with a schema without affecting the underlying schema language.

11:00 - 11:45 Plenary
Using UML to define XML document types
W. Eliot Kimber and John Heintz, both of DataChannel, Inc.

UML (Unified Modeling Language) models can be used to define XML document types instead of DTDs or schemas. The XML encoding of data is fundamentally an implementation representation of data that conforms to some higher-level abstract data model or object model. In this way, the use of UML to define the XML implementation of objects is exactly analogous to using UML to define the Java or CORBA or C++ or SQL (Structured Query Language) implementations of those objects. Thus there is assumed to be a more abstract data model of which the XML is an implementation, referred to as the "XML implementation representation" of the data. By using UML stereotypes to map application-specific types to XML syntactic constructs, we show how UML can be used in the case of a sample DTD to map abstract information data models to XML-specific implementation models and illustrate a sample program for generating XML DTD-syntax declaration sets from their corresponding UML models.

4:00 - 4:45 Late Breaking News Blue Track
XHub: An Online Service for Creating OEB eBooks from XML Documents
Elli Mylonas, Brown University

Brown University's Scholarly Technology Group has developed a web-based environment, based on an underlying XSLT conversion architecture, to support the creation of OEB (Open eBook Publication Structure) ebooks from XML inputs. This service allows users to perform intelligent conversions of documents in formats like XHTML, TEI, DocBook, and others, into XML eBook Publications. This presentation will describe the design of XHub, some of the interesting problems solved in the course of its development, and some broader issues related to managing real-world XML transformations. We will also describe plans to use XHub as a test bed for exploring topics such as annotation exchange.

4:45 - 5:30 Late Breaking News Blue Track
Technical implications of using the XML 1.0 standard for vertical market standards definition
Gabriel Minton, Ultraprise Corporation

During development of the Mortgage Industry Standards and Maintenance Organization (MISMO XML) standard, a number of technical issues surfaced. These include, but are not limited to: scoping and initial division of labor (based on volunteers); process area creation and definition; design to span transactions, not support only one (X12); data dictionary creation (including web software we created to aid the process); XML element creation and modeling; elements vs. attributes ("mixed" approach); DTD vs. Schema discussion (not standardized, but architected for use in the future); implementation of the standard; extensible architectures; the need to build off the standard; the need for automatic normalization of data back to the standard; "automagic" DTD creation; and where we store DTDs (relative). This is not a case study. Rather, it is the method, process, means, and architectural framework that is in use today and working in the mortgage industry, presented in the hope that the ideas and means could be reused in other vertical industries. The problem we are trying to solve is adoption of technology and methods for building XML as a solution into an existing infrastructure and in between existing trading partners. Everyone pretty much agrees that XML is "cool", but few people know how they can start to employ it today.

Thursday, August 17, 2000

9:00 - 9:45 Yellow Track
Validating Topic Maps with constraints
Hans Holger Rath, STEP Electronic Publishing Solutions GmbH

A Topic Map can be expressed validly, in terms of the ISO/IEC 13250 standard, and yet contain information that is inconsistently or incompletely expressed. For example, creators and maintainers of large, complex Topic Maps need ways to use computers to identify trouble spots, such as topics that have been incompletely specified (where the criteria for "completeness" are arbitrary and specifiable). One possible use of the values of "scope" attributes is to specify value constraints that can be tested algorithmically. The extensions to the standard that make this possible are minor, and they can express several important kinds of combinatorial constraints.

9:45 - 10:30 Yellow Track
Semantic interoperability on the Web
Jeff Heflin and James Hendler, both of University of Maryland

"Semantic interoperability" — the ability to make use of information outside of its semantic universe of origin — is highly desirable because we all live in a worldwide universe of (somewhat disjunct) semantic universes. As different semantic universes increasingly share one worldwide Web, the problem of semantic interoperability becomes more urgent and less ignorable. XML is able to accommodate an unbounded number of diverse markup vocabularies, each of which makes sense in its own semantic universe, but XML, by itself, does not make semantics portable among universes. RDF (the W3C’s Resource Description Framework Recommendation) facilitates some aspects of semantic interoperability. The SHOE (Simple HTML Ontology Expressions) language has many features necessary for the expression of semantic webs, and may be better suited for semantics on the Web than either XML DTDs or RDF.

2:45 - 3:30 Yellow Track
Constructing a navigable Topic Map by inductive semantic acquisition methods
Helka Folch, Électricité de France, & Benoit Habert, LIMSI - Université de Paris Sud, Orsay

Once it has been made, a well-made Topic Map can make desired information easily findable, even when the desired information is a very small part of a very large library of resources. However, the effort involved in making a useful Topic Map for a very large corpus of very diverse materials can be quite large. The Scriptorium Project of EDF (the French national electrical monopoly) makes this problem manageable using several data mining methods including ALCESTE, part of a process that subjects the text content of EDF’s enormous backlog of heterogeneous resources to statistical analysis. The semantic classes thus generated become topics in the resulting Topic Maps. A side effect of the process is the division of the library into manageable (<10 Mb) corpora, the identity of each of which is reflected in the "scope" specifications of the resulting topic characteristics.

4:00 - 4:45 Blue Track
A case for the implementation of groves in a PDM environment
Trish Laedtke, ISOGEN International

Basically, Product Data Management (PDM) systems facilitate three kinds of activity: 1) identifying new data, 2) adding value to new data and to the entire dataset, and 3) making the data available to multiple sites in a variety of ways. As new data is put into the system, the new item is parsed and "understood". In a PDM utilizing the grove paradigm, the result of parsing/understanding is a grove: interconnected data nodes, with each node consisting of named properties and values for those properties. Every node in every grove is fully addressable and available for every application's purpose. Groves offer many advantages over traditional PDM processing.
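
The grove idea described in this abstract (nodes made of named properties, each node addressable) can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the HyTime/DSSSL grove model and not the author's system: the `Node` class, the `resolve` helper, the address format, and the sample part data are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """A grove-like node: a class name plus named properties.

    A property value is either a plain value or a list of child nodes.
    """
    class_name: str
    properties: dict[str, Any] = field(default_factory=dict)


def resolve(root: Node, address: list[tuple[str, int]]) -> Node:
    """Follow (property name, index) steps from the root to one node."""
    node = root
    for prop_name, index in address:
        node = node.properties[prop_name][index]
    return node


# Hypothetical product data: an assembly holding one part.
part = Node("part", {"number": "P-100", "revision": "B"})
assembly = Node("assembly", {"name": "pump", "parts": [part]})

if __name__ == "__main__":
    print(resolve(assembly, [("parts", 0)]).properties["number"])
```

Because every node is reached by the same kind of address, any application can point at the part's revision without knowing how the assembly was originally stored.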

Friday, August 18, 2000

9:00 - 9:45 Yellow Track
The flexible base DTD
Jan Christian Herlitz, Excosoft AB

These days it is obvious that documents must be reused for different purposes, e.g. printed on paper or published on the web. Such reuse can be accomplished using a two-step process where a source document is produced according to a Base DTD in the first step and various target DTDs are applied in the second step according to the required areas of use. A Base DTD, the FlexDTD, is presented which is characterized by a free structure, generic elements, embedded typographical markup, and great simplicity.

9:45 - 10:30 Yellow Track
How to maintain a family of DTDs and keep them related using switchboards
Diederik A. Gerth van Wijk, Wolters Kluwer Nederland

While it is sometimes important to use a large number of DTDs in an organization, their management presents significant challenges. We have developed a technique by which content models can be loosened or tightened by using Marked Sections to control which portions of a DTD will take effect and setting the values of "INCLUDE" or "IGNORE" for the Marked Sections using parameter entities (called switches). All localization for a specific DTD is made in a switch file that overrules the default switch settings. A central switchboard controls the default settings based on the state of previously-defined switches. This on-the-fly creation makes it hard to ensure valid model groups, so the source DTDs are normalized into a single valid DTD with parameter entities resolved and empty content tokens removed from model groups.
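
The switch mechanism described in this abstract can be illustrated with a small simulation. This is a rough sketch under stated assumptions, not the authors' tooling: it handles only non-nested marked sections whose keyword is a parameter entity reference, and the switch names, the `resolve_switches` function, and the sample declarations are all hypothetical.

```python
import re

# Hypothetical default settings, standing in for the central switchboard.
DEFAULT_SWITCHES = {"allow.table": "IGNORE", "allow.note": "INCLUDE"}

# Matches a non-nested marked section whose keyword is a parameter
# entity reference, e.g. <![%allow.note;[ ... ]]>
MARKED_SECTION = re.compile(r"<!\[\s*%([\w.-]+);\s*\[(.*?)\]\]>", re.DOTALL)


def resolve_switches(dtd_text, switch_file=None):
    """Keep INCLUDE sections and drop IGNORE sections.

    Settings in switch_file override the switchboard defaults.
    """
    switches = {**DEFAULT_SWITCHES, **(switch_file or {})}

    def keep_or_drop(match):
        name, body = match.group(1), match.group(2)
        return body if switches[name] == "INCLUDE" else ""

    return MARKED_SECTION.sub(keep_or_drop, dtd_text)


SOURCE_DTD = (
    "<![%allow.table;[<!ELEMENT table (row+)>]]>"
    "<![%allow.note;[<!ELEMENT note (#PCDATA)>]]>"
)

if __name__ == "__main__":
    print(resolve_switches(SOURCE_DTD))
    print(resolve_switches(SOURCE_DTD, {"allow.table": "INCLUDE"}))
```

With the defaults only the note declaration survives; a switch file that sets the table switch to "INCLUDE" yields both declarations, which mirrors how a local DTD overrules the default settings.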

11:45 - 12:30 Blue Track
An XML-based N-tier architecture for border management systems
Andy Adler, James MacLean, and Alan Boate, all of AiT

Border management systems need to integrate varied and numerous data (such as travelers' documents, video surveillance images, messages, and national and international databases) in the context of varying languages, IT resources, hardware requirements, skill sets, and policies. This paper describes a 5-tier border management system architecture based on modular software components, XML to provide a database-neutral format and the message infrastructure between multiple tiers, and XSL for transformation and formatting.

9:45 - 10:30 Blue Track
Hypertext functionalities in XML
Fabio Vitali, University of Bologna

XMLC is a very general architecture for adding sophisticated hypertext functionality to XML documents. The overall design goal is to create a complete authoring environment for sophisticated hypermedia based on the most recent protocols and languages available on the WWW. We hypothesize that XLink will be very useful for realizing the following sophisticated hypertext-related functionalities: editable browsers; storing document content and link anchors separately; external linkbases; and displaying link spans, node and link attributes. Further, we describe how they are being implemented in the current version of our XMLC browser. In fact, the architecture of XMLC can be fruitfully used for more than visualization, for it is an extremely general way to associate behaviors with XML elements, and thus to produce active documents that perform computations, enact goals, and produce results.
