NewsML
a news markup and management tool in XML
In autumn 1999 the IPTC embarked upon an ambitious new project to develop
an XML-based standard called NewsML. NewsML is an XML encoding for news which
is intended to be used for the creation, transfer and delivery of news. NewsML
is media independent, and allows equally for the representation of the evening
TV news and a simple textual story. The need for this new standard was driven
by IPTC members who had decided to adopt XML as a core technology for their
future business needs. The requirements developed for NewsML converged with
the emerging XML family of standards. The project was heavily time-constrained
and required decisions to be made on how far to adopt standards that were not
finalized. It also needed compromises to be reached between organizations
with different business priorities as well as pragmatic decisions on the availability
and usability of XML tools. This presentation explains the processes adopted
to achieve a successful conclusion, explains some of the problems encountered
along the way and reveals how successful the project has been to date.
NewsML: a news industry standard for the new millennium
Background. The IPTC has been developing standards
for the interchange of news material since the late 1970s. Initial work focussed
on text messages but was followed by a digital newsphoto format in the late
1980s. By the mid 1990s a new text format, initially based on SGML, was released.
This is called the News Industry Text Format (NITF). Following the announcement
of XML version 1 in early 1998 the IPTC converted NITF to an XML document
type definition and this now enjoys widespread usage.
The On-line Dimension. With the move towards
the adoption of more open standards for the news industry, particularly as
a result of the very rapid growth of online publishing on the world wide web,
the IPTC decided to base all its future standards activity on XML. This decision
was endorsed in mid 1999 and a new programme of work commenced in October
1999 with the introduction of the NewsML concept. Prior to NewsML the IPTC
standards had only allowed for one instance of a single media type to be exchanged
in the containing envelope. This envelope carried routing, identification,
descriptive and editorial metadata but required bespoke applications to open
and parse the information. Publishing outside of the traditional print medium
called for news in many formats and also required the provider to explicitly
declare the relationships between the different media objects related to a
particular news event.
Timescale. It was this pressure for multimedia,
multi-object presentation of news that initiated the new work programme of
IPTC2000, with the NewsML description at its heart. NewsML was seen as a news
exchange and management tool that could be used throughout the life cycle
of news from first assignment to final archival. This was a challenging requirement
and extended the boundaries of the new standard beyond any previous
IPTC work. Above all, it was seen that market forces required a viable version
1 of the document type to be available by mid 2000 when it could be ratified
at the next IPTC Annual General Assembly. Fortunately, the IPTC already had
a significant intellectual investment in its previous envelope standard, the
Information Interchange Model, and so an evolutionary approach could be adopted
to develop NewsML.
Working Method. The normal method of working
for IPTC members is to form specialized Working Parties to address matters
in detail and then to have these actions reviewed and endorsed by a plenary
body known as the Standards Committee. This two-stage approach allows small
groups to progress rapidly while allowing the membership as a whole to have
an input into the emerging standard. Above all, as an international body,
IPTC has always had the aim of building a consensus for any of its publishable
standards. As it had been decided to put NewsML in the public domain as soon
as possible, the development of such a consensus was more urgent than for previous
standards. To gain feedback from a larger audience it was decided,
again for the first time, to establish a public mailing list for exchange
of ideas pertinent to the NewsML work. Within IPTC, three Working Parties were
established to look at News Structure and Management, News Text markup
and News Metadata. The three Chairmen were appointed on the basis of their areas of
expertise, and each of the Working Parties was able to draw, in part, on
previous IPTC work. Relevant earlier activities included the Information Interchange
Model (IIM), the Digital Newsphoto Parameter Record (DNPR), the Common Linking
Implementation Procedure (CLIP) and the News Industry Text Format (NITF).
News Structure & Management. This activity
has the widest remit and is the most challenging. Not only were new concepts
involved, particularly the ideas of roles and named relationships between
different news objects, but there were also management issues to be considered
in the evolving news scenario, where relevance and accuracy are all-important.
In particular, the family of XML-related standards is still not mature and
trade-offs need to be made between tool usability, requirements and the adoption
of unratified standards that could still change before final release. It was
felt that, to assist in making such value judgements, outside expertise would
be of great value.
News Text. Written news can come in a variety
of sizes, from the one-line announcement to a fully developed feature article
or even a ready-made newspaper page. There was concern that markup overheads
should not unduly burden the smaller items, but at the same time it was necessary
to provide sufficient richness in markup to serve the needs of the more developed
articles. This was an area where the NITF proved valuable, allowing IPTC to
draw on the experience of NITF implementors.
News Metadata. For over 20 years the IPTC has
been in the process of identifying and specifying what we now call metadata
for the exchange of news. The initial focus was on text but has since been
followed by photographs and audio as the members' businesses have evolved.
As it is hoped that NewsML will be widely adopted throughout the (news) publishing
industries, it was felt necessary to gain input from areas other than those
represented by most of the IPTC members. The areas identified as being of
specific concern are magazine publishing and video. There is also
the all-encompassing dimension of copyright and rights, which concerns everyone
who creates or publishes material in any medium. In order
to gain the appropriate expertise we have decided to work cooperatively with
both the Graphic Communications Association PRISM initiative and the ISO MPEG-7
committees. Both these organizations will allow us to improve our metadata
coverage and relevance in areas that we could not expect to cover from our
own resources.
Progress Achieved. At the time of writing
the IPTC has formally endorsed the NewsML Requirements documentation and is
actively working on the associated Functions documentation. The three Working
Parties have already made significant progress and external consultant effort
has been brought in to assist in some of the trade-off and technical decisions.
Contact has been made with PRISM and MPEG-7 and collaboration frameworks have been established.
We are still confident that a version 1.0 DTD can be made available by July
2000.
Acknowledgements
The author wishes to thank the members and directors of IPTC for permission
to publish this paper. NewsML is a trademark of the IPTC.