Progress on XML Schema Language
This fall the W3C reported
progress on the most-awaited new standard
since XML itself: the new XML
schema language. Progress came in
the form of two new Working Drafts posted
in late September. It is important
to note that these documents are Working
Drafts. The facilities described
in them are in a preliminary
state of design. The Working Group
anticipates substantial changes, both in
the mechanisms described and in
additional functions yet to be described.
The Schema WG will not allow
early implementation to constrain its
ability to make changes to the
specification prior to final release.
XML Schema Part I: Structures
This Working Draft is Part 1
of a two-part draft of the specification
for the XML Schema definition language.
This document proposes facilities for
describing the structure and constraining
the contents of XML 1.0 documents. The
schema language, which is itself
represented in XML 1.0, provides a
superset of the capabilities found in XML
1.0 document type definitions (DTDs).
The purpose of the document is to provide
an inventory of XML markup constructs
with which to write schemas. An XML
schema is used to define and describe a
class of XML documents by using these
constructs to constrain and document the
meaning, usage and relationships of their
constituent parts: datatypes, elements
and their content, attributes and their
values, entities and their contents, and
notations. Schema constructs may also
provide for the specification of
additional information such as default
values. Schemas are intended to document
their own meaning, usage, and function
through a common documentation
vocabulary. Thus, XML Schema:
Structures can be used to define,
describe and catalogue XML vocabularies
for classes of XML documents.
The Working Draft
begins by presenting the conceptual
framework for schemas. The document
provides an introduction to schema
constraints, types, schema composition,
and symbol spaces. The
specification then reconstructs the core
functionality of XML 1.0 and also adds
the ability to define new kinds of
constraints, such as constraints on
the content of elements or attributes.
Here is an example
from the new Working Draft that defines
a schema for a purchase order. It
reads very much like a DTD,
except that it uses elements defined in
the XML schema language instead of
DTD syntax:
<schema targetNS='http://www.myOrg.com/bob/PurchaseOrder' xmlns='http://www.w3.org/1999/09/24-xmlschema'>
  <element name='PurchaseOrder' type='PurchaseOrderType'/>
  <element name='comment' type='string'/>
  <archetype name='PurchaseOrderType'>
    <element name='shipTo' type='Address'/>
    <element name='shipDate' type='date'/>
    <element ref='comment' minOccurs='0'/>
    <element name='Items' type='Items'/>
    <attribute name='orderDate' type='date'/>
  </archetype>
  <archetype name='Address'>
    <element name='name' type='string'/>
    <element name='street' type='string'/>
    <element name='city' type='string'/>
    <element name='state' type='string'/>
    <element name='zip' type='number'/>
    <attribute name='type' type='string'/>
  </archetype>
  <archetype name='Items'>
    <element name='Item' minOccurs='0' maxOccurs='*'>
      <archetype>
        <element name='productName' type='string'/>
        <element name='quantity'>
          <datatype>
            <basetype name='integer'/>
            <minExclusive>0</minExclusive>
          </datatype>
        </element>
        <element name='price' type='number'/>
        <element ref='comment' minOccurs='0'/>
      </archetype>
    </element>
  </archetype>
</schema>
Note
that alongside the expected ways to
define elements, element occurrence and
frequency, and attributes, there are
new constructs such as archetypes and
datatypes. These are designed to
provide power beyond that
found in XML DTDs to define constraints
that can be validated.
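
To make the added constraints concrete, here is a minimal Python sketch of the check that the 'quantity' declaration in the example expresses: the content must be an integer (basetype) greater than 0 (minExclusive). It is not a schema processor. The sample Items fragment and the function name check_quantities are assumptions made for illustration, and namespaces are omitted for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance fragment shaped like the 'Items' archetype.
SAMPLE = """
<Items>
  <Item><productName>Lawnmower</productName><quantity>1</quantity><price>148.95</price></Item>
  <Item><productName>Baby Monitor</productName><quantity>0</quantity><price>39.98</price></Item>
  <Item><productName>Hose</productName><quantity>two</quantity><price>12.50</price></Item>
</Items>
"""

def check_quantities(items_xml):
    """Return (productName, reason) pairs for items whose quantity is invalid."""
    problems = []
    for item in ET.fromstring(items_xml).findall("Item"):
        name = item.findtext("productName")
        text = item.findtext("quantity", default="").strip()
        try:
            value = int(text)              # basetype 'integer'
        except ValueError:
            problems.append((name, "not an integer"))
            continue
        if value <= 0:                     # minExclusive 0
            problems.append((name, "must be greater than 0"))
    return problems

print(check_quantities(SAMPLE))
```

With a DTD, all three quantity values would be accepted as character data; a check like this is what the inline datatype is meant to let a validating processor perform instead.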
XML Schema Part II: Datatypes
This
Working Draft is Part 2 of a two-part
draft of the specification for the XML
Schema definition language. This
document specifies constraints on
datatypes. The specification begins
with a description of what a datatype is.
An invoice is used as an example:
<invoice>
  <orderDate>19990121</orderDate>
  <shipDate>19990125</shipDate>
  <billingAddress>
    <name>Ashok Malhotra</name>
    <street>123 IBM Ave.</street>
    <city>Hawthorne</city>
    <state>NY</state>
    <zip>10532-0000</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>
The
invoice contains several dates and
telephone numbers, the postal
abbreviation for a state (which comes
from an enumerated list of sanctioned
values), and a ZIP code (which takes a
definable regular form). With XML
DTDs we have no way of validating the
data within each field. The goal of
XML Schema: Datatypes is to do just that!
The limited
datatyping facilities in XML have
prevented validating XML processors from
supplying the rigorous type checking
required in these situations. As a result,
individual application
writers have had to implement type
checking in an ad hoc manner. This
specification addresses the need of
document authors and application writers
for a robust, extensible datatype system
for XML, one that could be incorporated into
XML processors. These
datatypes could be used in other
XML-related standards as well.
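
The kind of ad hoc checking described here might look like the following Python sketch, run against the invoice example. The date layout (YYYYMMDD), the abbreviated state list, the ZIP and telephone patterns, and the helper names are all assumptions chosen for illustration; none of them comes from the Working Draft.

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime

INVOICE = """<invoice>
  <orderDate>19990121</orderDate>
  <shipDate>19990125</shipDate>
  <billingAddress>
    <name>Ashok Malhotra</name>
    <street>123 IBM Ave.</street>
    <city>Hawthorne</city>
    <state>NY</state>
    <zip>10532-0000</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>"""

STATES = {"CT", "NJ", "NY"}                # assumed, deliberately incomplete list
ZIP_RE = re.compile(r"\d{5}(-\d{4})?")     # assumed ZIP or ZIP+4 form
PHONE_RE = re.compile(r"\d{3}-\d{4}")      # assumed local-number form

def is_date(text):
    """True if text is a real calendar date written as YYYYMMDD."""
    if not re.fullmatch(r"\d{8}", text):
        return False
    try:
        datetime.strptime(text, "%Y%m%d")
        return True
    except ValueError:
        return False

def check_invoice(xml_text):
    """Return the paths of fields that fail their hand-written check."""
    root = ET.fromstring(xml_text)
    checks = {
        "orderDate": is_date,
        "shipDate": is_date,
        "billingAddress/state": lambda s: s in STATES,
        "billingAddress/zip": lambda s: ZIP_RE.fullmatch(s) is not None,
        "voice": lambda s: PHONE_RE.fullmatch(s) is not None,
        "fax": lambda s: PHONE_RE.fullmatch(s) is not None,
    }
    return [path for path, ok in checks.items()
            if not ok(root.findtext(path, default=""))]

print(check_invoice(INVOICE))   # an empty list means every check passed
```

Every application that reads such invoices would need its own copy of rules like these, which is the duplication a shared datatype system is intended to remove.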
In
the Working Draft, a datatype is
defined as a 3-tuple, consisting of a) a
set of distinct values, called its value
space, b) a set of lexical
representations, called its lexical
space, and c) a set of facets that
characterize properties of the value
space, the lexical space or of individual
values or lexical items.
- Value space: the set of permitted values for a given datatype
- Lexical space: the set of valid literals for a given datatype
- Facets: the set of defining aspects of a given datatype
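
One way to picture the three parts is the Python sketch below, which models a datatype as a lexical pattern, a mapping from literals to values, and a dictionary of facets. The class Datatype, its field names, and the positive_integer example are illustrative assumptions, not structures defined by the Working Draft; the minExclusive facet name is borrowed from the purchase order example.

```python
import re
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class Datatype:
    name: str
    lexical_pattern: str                  # lexical space: which literals are valid
    to_value: Callable[[str], Any]        # maps a literal into the value space
    facets: Dict[str, Any] = field(default_factory=dict)  # constraints on values

    def accepts(self, literal: str) -> bool:
        """True if the literal is in the lexical space and its value meets the facets."""
        if re.fullmatch(self.lexical_pattern, literal) is None:
            return False
        value = self.to_value(literal)
        low = self.facets.get("minExclusive")
        if low is not None and not value > low:
            return False
        return True

# Mirrors the 'quantity' datatype: an integer with minExclusive 0.
positive_integer = Datatype(
    name="positiveInteger",
    lexical_pattern=r"[+-]?\d+",
    to_value=int,
    facets={"minExclusive": 0},
)

for literal in ["7", "007", "0", "-3", "7.5"]:
    print(literal, positive_integer.accepts(literal))
```

In this sketch "7" and "007" are two different literals in the lexical space that denote the same value in the value space, while "7.5" is rejected before any facet is consulted because it is not a valid literal at all.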
The
framework presented in this working draft
has been influenced by the ISO 11404
standard on language-independent
datatypes as well as the datatypes for
SQL and for programming languages such as
Java.
