GCA

XML Standards News:
HTML, XHTML, and CSS PRs Impact Web Pages

During the early fall, a number of Proposed Recommendations (PRs) have come out of the W3C.  These recommendations apply to HTML, which still makes up the majority of data found on the Web today.  Interestingly enough, the new PRs strengthen HTML, yet in no way compromise the growing importance of XML on the Web.

HTML 4.0 is a PR

The long-awaited next generation of HTML is now a Proposed Recommendation.  But what does this mean exactly?  With the emergence of XML, why would we need improvements to HTML?  Wasn't XML designed to be the next generation HTML?

Well, it turns out that most agree that both HTML and XML have a place on the Web.  HTML is still the foremost language for encoding Web display.  XML, on the other hand, is designed for the coding and interchange of structured data on the Web.  Never mind that we can now view XML directly in Web browsers!  HTML remains the preferred language for publishing content on the Web.

So what's new about HTML 4.0?  First, there are some general editorial changes to the HTML specification.  These include the addition of style sheets for the document based on W3C technical report styles, the addition of a short table of contents, and updates to the acknowledgments.  In addition, all references to the document character set now cite ISO 10646, with a single mention of Unicode to signal equivalence.

But some technical changes/improvements have been made as well:

  • Table Cells: The definitions of rowspan and colspan changed. Spans are now bounded by groups (rowgroups or colgroups).  Also, when align="char" is not supported by the user agent, behavior is undefined.
  • Anchor Elements: It is legal for "name" and "id" to appear in the same start tag when they are both defined for an element. They must have identical values.
  • Image Elements: Addition of the name attribute for backwards compatibility. A note was also added: when an IMG is part of the content of an A element, user agents must provide different mechanisms for accessing the "longdesc" URI (of IMG) and the "src" URI (of A).
  • Image, Object, and Applet Elements: The vspace and hspace attribute definitions now look like the definitions of other attributes.  The type of vspace, hspace, and border attribute values was changed from "length" to "pixels".
  • Form Elements: Addition of the name attribute for backwards compatibility.
  • Client-side Map Elements: The content model of the Map element now allows authors to mix Area content and block-level content.
  • Attribute Values: Section 3.2.2 states that attribute values may now contain colons and underscores as well.
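
The anchor rule above (when "name" and "id" both appear in the same start tag, they must have identical values) is easy to check mechanically. The following is only a minimal sketch in Python, using the standard library's html.parser module; the class and function names are hypothetical and chosen for illustration, not taken from the specification.

```python
from html.parser import HTMLParser


class AnchorNameIdChecker(HTMLParser):
    """Collect A start tags whose name and id attributes disagree.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        if tag != "a":
            return
        attr_map = dict(attrs)
        if "name" in attr_map and "id" in attr_map:
            if attr_map["name"] != attr_map["id"]:
                self.mismatches.append((attr_map["name"], attr_map["id"]))


def find_name_id_mismatches(html_text):
    """Return (name, id) pairs from A elements where the two values differ."""
    checker = AnchorNameIdChecker()
    checker.feed(html_text)
    checker.close()
    return checker.mismatches


if __name__ == "__main__":
    sample = '<a name="top" id="top">ok</a> <A NAME="intro" ID="start">bad</A>'
    print(find_name_id_mismatches(sample))
```

An anchor that carries only one of the two attributes is left alone, since the rule applies only when both are present.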

In addition to the text, multimedia, and hyperlink features of the previous versions of HTML, HTML Version 4.0 (subversion 4.01) supports more multimedia options, scripting languages, style sheets, better printing facilities, and documents that are more accessible to users with disabilities. HTML 4.01 also takes great strides towards the internationalization of documents with the incorporation of Unicode.  HTML 4.01 is an SGML application conforming to International Standard ISO 8879 -- Standard Generalized Markup Language.  An XML version of this Web publishing tag set has also been developed.  See XHTML!

XHTML Brings Two Worlds Together

We have HTML 4.0.  And we have XML!  So what is XHTML?  Is this something new?  How does it differ from HTML?  Is XHTML really XML or HTML?

XHTML 1.0 is a reformulation of HTML 4.0 as an XML 1.0 application, defined together with three DTDs corresponding to the ones defined by HTML 4.0. The semantics of the elements and their attributes are defined in the W3C Recommendation for HTML 4.0. These semantics provide the foundation for future extensibility of XHTML. Compatibility with existing HTML user agents is possible by following a small set of guidelines.

So XHTML is both HTML and XML.  In its simplest form, it is an XML version of HTML.  This means that XHTML is well-formed HTML.  It has all the HTML 4.0 tags, but the rules of XML well-formedness apply.  So XHTML tags must be correctly nested, and every XHTML element must have both a start tag and an end tag.  In HTML it is often acceptable to mark the start of a paragraph with the <p> tag and omit the end tag.  In XHTML, both the start and end paragraph tags are required.  Also, in HTML we can use the same syntax for tags with content (such as paragraphs or headings) and tags that are empty, or have no content (such as <br> or <img>).  In XHTML, we need to use the special empty-tag syntax to indicate that these tags have no content and no end tag.  This means that <br> becomes <br/>.
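
The difference between HTML-style and XHTML-style markup can be seen by handing both to an XML parser. This is a small sketch using Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample fragments are made up for illustration. It checks well-formedness only, not validity against the XHTML DTDs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the fragment parses as well-formed XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# HTML habits: paragraphs left open, <br> with no end tag.
html_style = "<div><p>First paragraph<p>Second paragraph<br></div>"

# The same content following XML rules: every element closed,
# and the empty element written with the empty-tag syntax.
xhtml_style = "<div><p>First paragraph</p><p>Second paragraph<br/></p></div>"

# Start and end tags present, but incorrectly nested.
badly_nested = "<p><b><i>text</b></i></p>"

if __name__ == "__main__":
    for label, fragment in [
        ("HTML style", html_style),
        ("XHTML style", xhtml_style),
        ("badly nested", badly_nested),
    ]:
        print(label, is_well_formed(fragment))
```

Only the XHTML-style fragment is accepted; the other two are rejected by the parser because of a missing end tag and incorrect nesting, respectively.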

Now we might ask, what is the sense of having an XML version of the HTML tag set?  The answer is twofold.  First, well-formed HTML can lead to lighter-weight browser and client-side software.  This is a definite advantage when browsing with new, highly portable devices.  Second, if our HTML is well-formed XML, we can follow the rules of XML to provide content extensions.  That is, XHTML provides us with extensible HTML!
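
One way such extensions might look in practice is an XHTML document that mixes in elements from a second XML namespace. The sketch below assumes a made-up "inventory" namespace (http://example.org/inventory) and a hypothetical inv:part element; neither is defined by XHTML. It uses Python's standard xml.etree.ElementTree module to show that a generic XML tool can pull out the extension data while the HTML tags stay in place.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
INV_NS = "http://example.org/inventory"  # hypothetical extension namespace

document = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:inv="http://example.org/inventory">
  <head><title>Parts list</title></head>
  <body>
    <p>Widget <inv:part number="A-100">bracket</inv:part></p>
  </body>
</html>"""

root = ET.fromstring(document)

# Ordinary XHTML elements are found under the XHTML namespace.
paragraphs = root.findall(".//{%s}p" % XHTML_NS)

# Extension elements are found under their own namespace.
parts = [(el.get("number"), el.text) for el in root.iter("{%s}part" % INV_NS)]

if __name__ == "__main__":
    print(len(paragraphs), parts)
```

Because the document is well-formed XML, no HTML-specific parser is needed to read either the page content or the added data.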

Paged Media Specifications in CSS3

A new Working Draft proposes an extension to CSS to permit finer control over the paged presentation, both printed and online, of Web pages. The proposal includes new properties for describing headers, footers, footnotes, and endnotes. These features depend on others described in the same document, such as cross-references and page-based counters; page-dependent floating elements are described as well.  It is important to note that all these new CSS properties are for constructs usually found on the printed page, not on Web pages.  Clearly CSS is making the transition to providing media-independent presentation for HTML.  With these paged media extensions, HTML will not be just for the Web anymore.

 

GCA - Phone: +1 703-519-8160   email: info@gca.org