An overview of NewsML
NewsML is an XML-based, media-independent structural framework for
news. It can represent news at every stage of its
lifecycle in an electronic service environment, for example:
- in and between editorial systems;
- between wire service providers and media clients;
- between original publishers and aggregators / syndicators;
- between news service providers and ultimate consumers of news.
NewsML is intended for use in electronic news production, archiving
and delivery, and as such does not specifically set out to meet the needs of
paper-based news publishing. The intention is that features required for
paper-based publishing and other specific production environments can be
accommodated by including external definitions designed for that purpose.
NewsML is not necessarily intended as a format for editing or creation
of news, though it is intended as the basis for such formats. It is recognised
that the specific demands of individual organizations and production systems
will require proprietary extensions to the base NewsML to be effective in
this role.
NewsML features
NewsML is media independent, and allows equally for the representation
of the evening TV news and a simple textual news-item. NewsML’s principal
features are:
- All formats and media types recognised equally;
- Collections of news-items;
- Named relationships between news-items;
- Structure consisting of parts and named relationships between parts;
- Facilitates the development of news-items over time;
- Alternative representations of the same part;
- Explicit inclusion, inclusion by reference and exclusion of parts
and alternatives;
- Attachment of metadata from standard and non-standard schemes.
All formats and media types recognised equally
NewsML makes no assumption about the media type, format or encoding
of news. It provides a structure within which pieces of news media relate
to each other. NewsML can equally contain text, video, audio, graphics and
photos.
In recognition of the media-neutrality of NewsML, its principal components
are called news-items. In media-specific contexts these components are
currently referred to as stories, articles, reports and so on.
While recognizing that the preponderance of historical material is textual,
and requires optimal handling, it is also recognized that media other than
text are of increasing importance. NewsML takes the view that any medium can
be the main part of a news-item and that objects of all other types can fulfill
secondary, tertiary and other roles in respect of the main part. Hence NewsML
allows for the representation of simple textual news-items, textual news-items
with primary and secondary photos, the evening TV news with embedded individual
reports, and so on.
Collections of news-items
It is a characteristic feature of news-items that they are often used
and presented as collections. NewsML supports such collections irrespective
of how they were composed, whether with journalistic intent (e.g. top
insurance stories) or in a more arbitrary way (e.g. as the result of a
user's query, or as a corpus of news-items from a particular date range).
Named relationships between news-items
NewsML allows the construction of relationships between news-items and
collections of news-items, such as "see also", "related news", "for more detail"
and so on. The construction of such relationships implies that over time news-items
will exist in a web of such named relationships. The relationships may be
temporary, permanent or dynamically changing.
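As an illustration only, such a web of named relationships can be pictured as a set of labelled, directed links between identifiers. The Python sketch below is not part of NewsML: the class `RelationshipWeb`, its method names and the identifiers `item-1`, `item-2` and `collection-7` are all assumptions made for this example.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class RelationshipWeb:
    """Named, directed links between news-items or collections (hypothetical model)."""

    def __init__(self) -> None:
        # (source identifier, relationship name) -> list of target identifiers
        self._links: DefaultDict[Tuple[str, str], List[str]] = defaultdict(list)

    def relate(self, source: str, name: str, target: str) -> None:
        """Add a named relationship from source to target."""
        self._links[(source, name)].append(target)

    def unrelate(self, source: str, name: str, target: str) -> None:
        """Remove a relationship, e.g. when a temporary link expires."""
        self._links[(source, name)].remove(target)

    def related(self, source: str, name: str) -> List[str]:
        """Return the targets currently linked from source under the given name."""
        return list(self._links.get((source, name), []))


web = RelationshipWeb()
web.relate("item-1", "see also", "item-2")
web.relate("item-1", "for more detail", "collection-7")
web.unrelate("item-1", "see also", "item-2")  # relationships may change over time
```

Keeping the relationship name as part of the key reflects the point that the links are named, so "see also" and "for more detail" can be queried separately.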
Structure consisting of parts and named relationships between parts
An increasingly wide variety of complementary media are being used together
as a single piece of news. Use of text with supporting images is well understood
as a format for conveying news; but increasingly other mixtures of media-types
are needed.
In NewsML news-items consist of parts, which have a named role in relation
to their containing news-item. Many news-items will have a "main" part and
a number of secondary and tertiary parts which complement the main part in
various ways. In a simple case a news-item may consist of a textual main part
and a number of photos as secondary parts.
Just as news-items can exist in named relationships to other news-items,
so news-items may also form parts of each other. NewsML does not define the
difference between news-items and parts of news-items; the distinction is
a functional one. Parts of news-items may only be referred to in the context
of the news-item that they compose, whereas news-items can be referred to
in a globally unique and unambiguous way. The editorial policy of the originating
organization will determine what exactly constitutes a news-item.
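The structure described in this section can be pictured with a small data model. The following Python sketch is illustrative only: the class names `Part` and `NewsItem`, the `role` strings, the file names and the identifier format are assumptions made for this example and are not taken from the NewsML definition.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class Part:
    """A component that is only addressable within its containing news-item."""
    role: str        # named role relative to the container, e.g. "main"
    media_type: str  # e.g. "text", "photo", "video"
    content: str     # placeholder for the payload or a pointer to it


@dataclass
class NewsItem:
    """A component that can be referred to globally through its identifier."""
    identifier: str              # hypothetical globally unique identifier
    role: Optional[str] = None   # set only when used as a part of another news-item
    parts: List[Union[Part, "NewsItem"]] = field(default_factory=list)

    def by_role(self, role: str) -> List[Union[Part, "NewsItem"]]:
        """Return the direct parts that play the given role in this news-item."""
        return [p for p in self.parts if p.role == role]


# A report is a news-item in its own right and also a part of the bulletin.
report = NewsItem(
    identifier="example.org:report-1",
    role="embedded report",
    parts=[Part("main", "video", "report-1.mpg")],
)
bulletin = NewsItem(
    identifier="example.org:evening-news",
    parts=[Part("main", "video", "bulletin.mpg"), report],
)
```

Here `report` keeps its own identifier, so it can still be referenced directly, while the `Part` objects can only be reached through the news-item that contains them.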
NewsML describes the logical structure and metadata of news-items, but
does not impose any layout semantics. Where it is important to attach layout
to an individual part, or to the way parts lay out relative to each other,
this is achieved using standard means.
Facilitates the development of news-items over time
NewsML provides strong revisioning support to allow subsequent revisions
of news-items to modify earlier revisions in a way that is robust and unambiguous.
Hence NewsML supports the development of news-items over time, and in particular
allows for the development of textual stories using takes.
Recognizing that different components of a news-item have different
production delays, NewsML provides robust support for the attachment of parts
that are available earlier to parts that become available later. For example,
the text and audio components of a news-item may be available earlier than
the video components.
Alternative representations of the same part
NewsML provides the ability to represent the same part in a number of
different ways.
Users of NewsML in a delivery environment can select which alternative
representation is most appropriate to their delivery context. So, for example,
textual parts of a news-item might be available in HTML, RTF and PDF versions.
Photos might be available in different resolutions and colour depths, and might
have GIF or JPEG encoding variants.
NewsML does not insist that the alternatives to a part of a news-item
are of the same media type. The main part of the news-item, for example, might
be available as video (for delivery in a high bandwidth environment) but also
as text (for delivery in a wireless environment or as a preview).
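How a receiver picks among alternatives is left to the application. As a minimal sketch, assuming each alternative carries a media type, a format label and a size, a client might choose the largest alternative it both understands and can afford to download. The names `Alternative` and `choose_alternative`, the format labels and the "largest usable" rule are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Alternative:
    """One representation of a part (hypothetical model)."""
    media_type: str  # e.g. "video" or "text"
    fmt: str         # e.g. "MPEG", "HTML", "PDF"
    size_bytes: int


def choose_alternative(
    alternatives: Iterable[Alternative],
    accepted_formats: Set[str],
    max_bytes: int,
) -> Optional[Alternative]:
    """Return the largest alternative the client accepts and can afford, or None."""
    usable = [
        a for a in alternatives
        if a.fmt in accepted_formats and a.size_bytes <= max_bytes
    ]
    # Size is used here as a crude stand-in for richness; a real client
    # would apply whatever preference order suits its delivery context.
    return max(usable, key=lambda a: a.size_bytes, default=None)


main_part = [
    Alternative("video", "MPEG", 40_000_000),
    Alternative("text", "HTML", 12_000),
]
wireless = choose_alternative(main_part, {"HTML"}, max_bytes=50_000)
broadband = choose_alternative(main_part, {"MPEG", "HTML"}, max_bytes=100_000_000)
```

In this sketch the wireless client ends up with the text alternative and the broadband client with the video, even though both are alternatives to the same main part.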
Explicit inclusion, inclusion by reference and exclusion of parts and alternatives
NewsML is meant to be appropriate for use in a variety of delivery environments,
including but not limited to:
- one-way streaming as in a traditional newswire;
- request / response as in an on-line environment;
- bulk shipping of archive material.
NewsML provides the ability to represent the same news-item in
a number of different ways to accommodate differences in the capabilities
of delivery environments. As mentioned above, this is achieved in part by allowing
alternative renderings of the same material in different sizes and media types.
Another aspect of accommodating different environments is the choice of how
the components of a news-item are conveyed. When a news-item is to be
transmitted, a decision can be made between three options:
- include all components in the transmission;
- include references to allow the receiving application to retrieve
material if it is required;
- exclude certain types of component that are never required in a
particular application (one criterion for exclusion could be language).
NewsML does not specify the mechanisms by which applications choose
to have components of a news-item included, referenced or excluded. Such mechanisms
will be defined by each service provider according to the nature of the service
provided.
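Since the mechanism is left to each service provider, the following Python sketch shows just one policy a provider might define: exclude components in unwanted languages, include small payloads inline, and send a reference for everything else. The names `Mode`, `Component`, `decide` and `package`, the `href` key and the size threshold are assumptions made for this example, not NewsML constructs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Set


class Mode(Enum):
    INCLUDE = "include"      # payload travels with the news-item
    REFERENCE = "reference"  # receiver retrieves the payload if required
    EXCLUDE = "exclude"      # component is left out entirely


@dataclass
class Component:
    name: str
    language: str
    payload: bytes
    url: str  # hypothetical location from which the payload can be fetched


def decide(c: Component, wanted_languages: Set[str], inline_limit: int) -> Mode:
    """Apply one possible provider policy to a single component."""
    if c.language not in wanted_languages:
        return Mode.EXCLUDE
    if len(c.payload) <= inline_limit:
        return Mode.INCLUDE
    return Mode.REFERENCE


def package(
    components: List[Component], wanted_languages: Set[str], inline_limit: int
) -> List[Dict[str, object]]:
    """Build the list of entries to transmit for one receiving application."""
    out: List[Dict[str, object]] = []
    for c in components:
        mode = decide(c, wanted_languages, inline_limit)
        if mode is Mode.INCLUDE:
            out.append({"name": c.name, "mode": mode.value, "data": c.payload})
        elif mode is Mode.REFERENCE:
            out.append({"name": c.name, "mode": mode.value, "href": c.url})
        # Excluded components produce no entry at all.
    return out


components = [
    Component("story-en", "en", b"short text", "http://example.org/story-en"),
    Component("video-en", "en", b"x" * 5000, "http://example.org/video-en"),
    Component("story-fr", "fr", b"texte court", "http://example.org/story-fr"),
]
transmission = package(components, wanted_languages={"en"}, inline_limit=1000)
```

A one-way newswire feed might set the inline limit high so that everything is included, whereas a request / response service could keep it low and rely on references.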
Attachment of metadata from standard and non-standard schemes
Each component of a news-item, and the news-item as a whole, can have
metadata attached to it. Such metadata allows the description of physical
properties of the component, information about editorial and intellectual
property aspects of the component (author, publisher, owner etc.) and information
about the content (what it is about, to whom it may be of interest, its generality,
its importance).
NewsML does not prescribe which schemes are to be used to describe content,
and hence allows the attachment of metadata from both proprietary and standardised
schemes.
To facilitate inter-working, the IPTC defines and maintains certain schemes
as preferred schemes for use in NewsML, for example the IPTC Subject Reference
System.
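One way to picture metadata drawn from several schemes is to record, with every value, the scheme it comes from. The Python sketch below is a hypothetical model: the class `Metadata`, the function `values_from`, the scheme labels and the values are placeholders invented for this example, not real scheme identifiers or codes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Metadata:
    """A single metadata statement attached to a news-item or component."""
    scheme: str    # identifies the vocabulary the value is taken from
    property: str  # what is being described, e.g. "subject" or "author"
    value: str


def values_from(metadata: Iterable[Metadata], scheme: str) -> List[str]:
    """Return the values that were assigned using the given scheme."""
    return [m.value for m in metadata if m.scheme == scheme]


# Placeholder schemes: one standing in for a standard scheme, one proprietary.
attached = [
    Metadata("example-standard-subjects", "subject", "placeholder-subject-code"),
    Metadata("example-house-topics", "subject", "insurance"),
    Metadata("example-house-topics", "importance", "high"),
]
house_values = values_from(attached, "example-house-topics")
```

Because each statement names its scheme, a receiver that only understands the standard scheme can ignore proprietary values without misreading them.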