Progress on XML Schema Language

This fall the W3C reported progress on the most-awaited new standard since XML itself: the new XML schema language. Progress came in the form of two new Working Drafts posted in late September. It is important to note that these documents are Working Drafts. The facilities they describe are in a preliminary state of design, and the Working Group anticipates substantial changes, both in the mechanisms described and in additional functions yet to be described. In the words of the draft: “The Schema WG will not allow early implementation to constrain its ability to make changes to this specification prior to final release.”

XML Schema Part 1: Structures

This Working Draft is Part 1 of a two-part draft of the specification for the XML Schema definition language. This document proposes facilities for describing the structure and constraining the contents of XML 1.0 documents. The schema language, which is itself represented in XML 1.0, provides a superset of the capabilities found in XML 1.0 document type definitions (DTDs).  The purpose of the document is to provide an inventory of XML markup constructs with which to write schemas. An XML schema is used to define and describe a class of XML documents by using these constructs to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, attributes and their values, entities and their contents and notations. Schema constructs may also provide for the specification of additional information such as default values. Schemas are intended to document their own meaning, usage, and function through a common documentation vocabulary. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.

The Working Draft begins by presenting the conceptual framework for schemas, with an introduction to schema constraints, types, schema composition, and symbol spaces. The specification then reconstructs the core functionality of XML 1.0 and adds the ability to define new kinds of constraints, for example on the content of elements or attributes.

Here is an example from the new Working Draft.  In this example we are defining a schema for a purchase order.  Note that this example looks very much like a DTD, except that we are using tags defined in the XML schema language instead of using DTD syntax:

<schema targetNS='http://www.myOrg.com/bob/PurchaseOrder' xmlns='http://www.w3.org/1999/09/24-xmlschema'>
 
 <element name='PurchaseOrder' type='PurchaseOrderType'/>
 
 <element name='comment' type='string'/>
 
 <archetype name='PurchaseOrderType'>
  <element name='shipTo' type='Address'/>
  <element name='shipDate' type='date'/>
  <element ref='comment' minOccurs='0'/>
  <element name='Items' type='Items'/>
  <attribute name='orderDate' type='date'/>
 </archetype>
 
 <archetype name='Address'>
  <element name='name' type='string'/>
  <element name='street' type='string'/>
  <element name='city' type='string'/>
  <element name='state' type='string'/>
  <element name='zip' type='number'/>
  <attribute name='type' type='string'/>
 </archetype>

 <archetype name='Items'>
  <element name='Item' minOccurs='0' maxOccurs='*'>
   <archetype>
    <element name='productName' type='string'/>
    <element name='quantity'>
     <datatype>
      <basetype name='integer'/>
      <minExclusive>0</minExclusive>
     </datatype>
    </element>
    <element name='price' type='number'/>
    <element ref='comment' minOccurs='0'/>
   </archetype>
  </element>
 </archetype>
</schema>

Note that alongside the expected ways to define elements, element occurrence and frequency, and attributes, you also see new constructs such as archetypes and datatypes. These are designed to provide expressive power beyond that found in XML DTDs, so that more kinds of constraints can be defined and validated.

XML Schema Part 2: Datatypes

This Working Draft is Part 2 of a two-part draft of the specification for the XML Schema definition language.  This document specifies constraints on datatypes.  The specification begins with a description of what a datatype is.  An invoice is used as an example:

<invoice>   
   <orderDate>19990121</orderDate>   
   <shipDate>19990125</shipDate>   
   <billingAddress>   
      <name>Ashok Malhotra</name>   
      <street>123 IBM Ave.</street>   
      <city>Hawthorne</city>   
      <state>NY</state>   
      <zip>10532-0000</zip>   
   </billingAddress>   
   <voice>555-1234</voice>   
   <fax>555-4321</fax>   
</invoice>

 

The invoice contains several dates and telephone numbers, the postal abbreviation for a state (which comes from an enumerated list of sanctioned values), and a ZIP code (which takes a definable regular form). With XML DTDs we have no way of validating the data within each field: a DTD can declare only that such an element contains character data. The goal of XML Schema Part 2: Datatypes is to make that validation possible.
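To make the limitation concrete, here is a minimal sketch of a DTD for the invoice above. The DTD is not taken from the Working Draft; its content models are an assumption inferred from the sample instance. Every leaf field can only be declared as #PCDATA, so the deliberately wrong values in the instance below still pass DTD validation.

```xml
<?xml version="1.0"?>
<!-- Hypothetical DTD, inferred from the sample invoice; not from the draft. -->
<!DOCTYPE invoice [
  <!ELEMENT invoice (orderDate, shipDate, billingAddress, voice, fax)>
  <!ELEMENT billingAddress (name, street, city, state, zip)>
  <!-- A DTD can say only "character data" for each of these fields. -->
  <!ELEMENT orderDate (#PCDATA)>
  <!ELEMENT shipDate  (#PCDATA)>
  <!ELEMENT name      (#PCDATA)>
  <!ELEMENT street    (#PCDATA)>
  <!ELEMENT city      (#PCDATA)>
  <!ELEMENT state     (#PCDATA)>
  <!ELEMENT zip       (#PCDATA)>
  <!ELEMENT voice     (#PCDATA)>
  <!ELEMENT fax       (#PCDATA)>
]>
<invoice>
  <!-- Not a date, yet valid against the DTD. -->
  <orderDate>sometime last week</orderDate>
  <shipDate>19990125</shipDate>
  <billingAddress>
    <name>Ashok Malhotra</name>
    <street>123 IBM Ave.</street>
    <city>Hawthorne</city>
    <!-- Not a postal abbreviation, yet valid against the DTD. -->
    <state>New York State</state>
    <!-- Not in ZIP code form, yet valid against the DTD. -->
    <zip>unknown</zip>
  </billingAddress>
  <voice>555-1234</voice>
  <fax>555-4321</fax>
</invoice>
```

A validating XML 1.0 parser checks only the element structure here, so catching the bad date, state, and ZIP code is left to application code.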

“The limited datatyping facilities in XML have prevented validating XML processors from supplying the rigorous type checking required in these situations. The result has been that individual applications writers have had to implement type checking in an ad hoc manner. This specification addresses the need of document authors and applications writers for a robust, extensible datatype system for XML, which could be incorporated into XML processors. As discussed below, these datatypes could be used in other XML-related standards as well.”

In the Working Draft, a datatype is defined as a 3-tuple, consisting of a) a set of distinct values, called its value space, b) a set of lexical representations, called its lexical space, and c) a set of facets that characterize properties of the value space, the lexical space or of individual values or lexical items.

  • Value space: the set of permitted values for a given datatype
  • Lexical space: the set of valid literals for a given datatype
  • Facets: the set of defining aspects of a given datatype
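The three parts can be seen in a small fragment. This sketch reuses the draft syntax of the quantity element from the purchase-order schema earlier in this article. It is illustrative only, and the syntax may well change in later drafts.

```xml
<!-- Sketch only: same constructs as the 'quantity' element in the
     purchase-order example; the draft syntax is subject to change. -->
<element name='quantity'>
  <datatype>
    <!-- The base type supplies the starting value space (the integers)
         and the lexical space (literals such as 12). -->
    <basetype name='integer'/>
    <!-- A facet: narrows the value space to integers greater than 0. -->
    <minExclusive>0</minExclusive>
  </datatype>
</element>
<!-- Expected outcome for sample instances under this definition:
     <quantity>12</quantity>      literal is in the lexical space, value is permitted
     <quantity>0</quantity>       literal is an integer, but the facet excludes the value
     <quantity>twelve</quantity>  literal is not in the lexical space of integer
-->
```

The same value can have more than one literal in the lexical space, which is why the draft treats the value space and the lexical space as separate sets.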

The framework presented in this working draft has been influenced by the ISO 11404 standard on language-independent datatypes as well as the datatypes for SQL and for programming languages such as Java.

GCA - Phone: +1 703-519-8160, email: info@gca.org