An overview of NewsML
Jo Rabin


Abstract
NewsML is an XML-based, media-independent structural framework for news. It is capable of representing news at every stage of its lifecycle in an electronic service environment, for example:
  • in and between editorial systems;
  • between wire service providers and media clients;
  • between original publishers and aggregators / syndicators;
  • between news service providers and ultimate consumers of news.
NewsML is intended for use in electronic news production, archiving and delivery and as such does not specifically set out to meet the needs of paper-based news publishing. It is intended that NewsML is able to include features required for paper-based publishing and other specific production environments by including external definitions designed for this purpose.
NewsML is not necessarily intended as a format for editing or creation of news, though it is intended as the basis for such formats. It is recognised that the specific demands of individual organizations and production systems will require proprietary extensions to the base NewsML to be effective in this role.

Contents
  1. NewsML features
    1. All formats and media types recognised equally
    2. Collections of news-items
    3. Named relationships between news items
    4. Structure consisting of parts and named relationships between parts
    5. Facilitates the development of news-items over time
    6. Alternative representations of the same part
    7. Explicit inclusion, inclusion by reference and exclusion of parts and alternatives
    8. Attachment of metadata from standard and non-standard schemes

NewsML features
NewsML is media independent, and allows equally for the representation of the evening TV news and a simple textual news-item. NewsML’s principal features are:
All formats and media types recognised equally
NewsML makes no assumption about the media type, format or encoding of news. It provides a structure within which pieces of news media relate to each other. NewsML can equally contain text, video, audio, graphics and photos.
In recognition of the media-neutrality of NewsML, its principal components are called news-items. In media-specific contexts these components are at present referred to as stories, articles, reports and so on.
While recognizing that the preponderance of historical material is textual, and requires optimal handling, it is also recognized that media other than text are of increasing importance. NewsML takes the view that any medium can be the main part of a news-item and that objects of all other types can fulfill secondary, tertiary and other roles in respect of the main part. Hence NewsML allows for the representation of simple textual news-items, textual news-items with primary and secondary photos, the evening TV news - with embedded individual reports, and so on.
Collections of news-items
It is a characteristic feature of news-items that they are often used and presented as collections. NewsML supports such collections irrespective of how they were composed: with journalistic intent (e.g. top insurance stories) or in a more arbitrary way (e.g. as the result of a user's query, or as a corpus of news-items from a particular date range).
Named relationships between news items
NewsML allows the construction of relationships between news-items and collections of news-items, such as "see also", "related news", "for more detail" and so on. The construction of such relationships implies that over time news-items will exist in a web of such named relationships. The relationships may be temporary, permanent or dynamically changing.
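One way to picture this web of named relationships is as a set of labelled links between identifiers. The Python sketch below is illustrative only: the class `RelationshipWeb`, its methods and the identifiers used are hypothetical and are not part of NewsML.

```python
from collections import defaultdict


class RelationshipWeb:
    """Hypothetical store of named links between news-items and collections."""

    def __init__(self):
        # (source identifier, relationship name) -> set of target identifiers
        self._links = defaultdict(set)

    def relate(self, source, name, target):
        """Add a named relationship from source to target."""
        self._links[(source, name)].add(target)

    def unrelate(self, source, name, target):
        """Remove a relationship, e.g. when a temporary link expires."""
        self._links[(source, name)].discard(target)

    def related(self, source, name):
        """Return the targets of a named relationship, sorted for stable output."""
        return sorted(self._links.get((source, name), set()))


web = RelationshipWeb()
web.relate("item-1", "see also", "item-2")
web.relate("item-1", "see also", "collection-9")
web.relate("item-1", "for more detail", "item-3")
```

Because links can be added and removed at any time, the same structure covers relationships that are permanent, temporary or dynamically changing.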
Structure consisting of parts and named relationships between parts
An increasingly wide variety of complementary media are being used together as a single piece of news. Use of text with supporting images is well understood as a format for conveying news; but increasingly other mixtures of media-types are needed.
In NewsML news-items consist of parts, which have a named role in relation to their containing news-item. Many news-items will have a "main" part and a number of secondary and tertiary parts which complement the main part in various ways. In a simple case a news-item may consist of a textual main part and a number of photos as secondary parts.
Just as news-items can exist in named relationships to other news-items, so news-items may also form parts of each other. NewsML does not define the difference between news-items and parts of news-items; the distinction is a functional one. Parts of news-items may only be referred to in the context of the news-item that they compose, whereas news-items can be referred to in a globally unique and unambiguous way. The editorial policy of the originating organization will determine what exactly constitutes a news-item.
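The structure described above can be sketched as a small data model. This is a minimal illustration in Python, not the NewsML schema: the class names, field names and identifiers (`Part`, `NewsItem`, `item_id`, `by_role`, the `urn:example:` identifiers) are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Part:
    """A component that is only addressable within its containing news-item."""
    role: str        # named role, e.g. "main" or "secondary"
    media_type: str  # e.g. "text", "photo", "video"
    payload: str     # placeholder for the actual content


@dataclass
class NewsItem:
    """A component that carries a globally unique identifier."""
    item_id: str
    role: str = ""   # role played when nested inside another news-item
    parts: List[Union[Part, "NewsItem"]] = field(default_factory=list)

    def by_role(self, role: str):
        """Return the parts (or nested news-items) that play the given role."""
        return [p for p in self.parts if p.role == role]


# A news-item that itself forms a secondary part of another news-item.
photo_report = NewsItem(
    item_id="urn:example:item-2",
    role="secondary",
    parts=[Part("main", "photo", "flood.jpg")],
)

story = NewsItem(
    item_id="urn:example:item-1",
    parts=[
        Part("main", "text", "Flood waters rose overnight."),
        Part("secondary", "photo", "bridge.jpg"),
        photo_report,
    ],
)
```

In this sketch the only difference between a `Part` and a nested `NewsItem` is that the latter has its own identifier, which mirrors the functional distinction drawn above.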
NewsML describes the logical structure and metadata of news-items, but does not impose any layout semantics. Where it is important to attach layout to an individual part, or to the way parts lay out relative to each other, this is achieved using standard means.
Facilitates the development of news-items over time
NewsML provides strong revisioning support to allow subsequent revisions of news-items to modify earlier revisions in a way that is robust and unambiguous. Hence NewsML supports the development of news-items over time, and in particular allows for the development of textual stories using takes.
Recognizing that different components of a news-item have different production delays, NewsML provides robust support for the attachment of parts that are available earlier to parts that become available later. For example, the text and audio components of a news-item may be available earlier than the video components.
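As a rough illustration of how later revisions can build on earlier ones, the Python sketch below replays a sequence of revisions to obtain the current state of a news-item. The revision numbering, the `Revision` class and the idea of a revision as a mapping from part name to content are assumptions made for the example; the source does not describe how NewsML itself encodes revisions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Revision:
    """Hypothetical revision: a number plus the parts it adds or replaces."""
    number: int
    changes: Dict[str, str]  # part name -> new content


def current_state(revisions: List[Revision]) -> Dict[str, str]:
    """Apply revisions in numeric order; later revisions override earlier ones."""
    state: Dict[str, str] = {}
    for rev in sorted(revisions, key=lambda r: r.number):
        state.update(rev.changes)
    return state


revisions = [
    # Text and audio are available first.
    Revision(1, {"text": "First take.", "audio": "report.mp3"}),
    # A second take extends the textual story.
    Revision(2, {"text": "First take. Second take."}),
    # The video component arrives later and is attached to the existing parts.
    Revision(3, {"video": "report.mp4"}),
]

state = current_state(revisions)
```

Sorting by revision number before replaying means the result does not depend on the order in which revisions happen to be received.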
Alternative representations of the same part
NewsML provides the ability to represent the same part in a number of different ways.
Users of NewsML in a delivery environment can select which alternative representation is most appropriate to their delivery context. So, for example, textual parts of a news-item might be available in HTML, RTF and PDF versions. Photos might be available in different resolutions and colour depths, and might have GIF or JPEG encoding variants.
NewsML does not insist that the alternatives to a part of a news-item are of the same media type. The main part of the news-item, for example, might be available as video (for delivery in a high bandwidth environment) but also as text (for delivery in a wireless environment or as a preview).
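A delivery application might choose between alternatives along the following lines. This Python sketch assumes a single, hypothetical selection criterion (a minimum bandwidth per alternative); the names `Alternative`, `min_bandwidth_kbps` and `select_alternative` are invented for the example, and real services would apply their own criteria.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alternative:
    """One representation of a part (hypothetical model)."""
    media_type: str          # e.g. "video" or "text"
    fmt: str                 # e.g. "MPEG" or "HTML"
    min_bandwidth_kbps: int  # assumed threshold for sensible delivery


def select_alternative(
    alternatives: List[Alternative], available_kbps: int
) -> Optional[Alternative]:
    """Pick the most demanding alternative the delivery context can carry."""
    usable = [a for a in alternatives if a.min_bandwidth_kbps <= available_kbps]
    return max(usable, key=lambda a: a.min_bandwidth_kbps, default=None)


# The main part is available as video and, alternatively, as text.
main_part = [
    Alternative("video", "MPEG", 2000),
    Alternative("text", "HTML", 0),
]

high_bandwidth_choice = select_alternative(main_part, 5000)
wireless_choice = select_alternative(main_part, 64)
```

Here the high-bandwidth context receives the video alternative and the low-bandwidth context falls back to text, even though the two alternatives are of different media types.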
Explicit inclusion, inclusion by reference and exclusion of parts and alternatives
NewsML is meant to be appropriate for use in a variety of delivery environments. It provides the ability to represent the same news-item in a number of different ways to accommodate differences in the capabilities of the delivery environment. As mentioned above, this is achieved in part by allowing alternative renderings of the same material in different sizes and media types.
Another aspect of accommodating different environments is the ability to vary what is transmitted with a news-item. When a news-item is to be transmitted, a decision can be made for each part or alternative between three options: including it explicitly, including it by reference, or excluding it.
NewsML does not specify the mechanisms by which applications choose to have components of a news-item included, referenced or excluded. Such mechanisms will be defined by each service provider according to the nature of the service provided.
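Since the mechanism is left to each service provider, the following Python sketch shows only one possible approach, in which a provider-defined policy decides per component whether to include, reference or exclude it. All names here (`Delivery`, `Component`, `package`, `size_policy`) and the example URLs are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List


class Delivery(Enum):
    INCLUDE = "include"      # send the content itself
    REFERENCE = "reference"  # send only a pointer to the content
    EXCLUDE = "exclude"      # omit the component altogether


@dataclass
class Component:
    name: str
    payload: bytes
    url: str  # assumed location from which the payload can be fetched


def package(
    components: List[Component], policy: Callable[[Component], Delivery]
) -> List[Dict[str, object]]:
    """Build the transmitted form of a news-item under a provider's policy."""
    out: List[Dict[str, object]] = []
    for c in components:
        mode = policy(c)
        if mode is Delivery.INCLUDE:
            out.append({"name": c.name, "data": c.payload})
        elif mode is Delivery.REFERENCE:
            out.append({"name": c.name, "href": c.url})
        # Delivery.EXCLUDE: the component is left out of the package.
    return out


def size_policy(limit: int) -> Callable[[Component], Delivery]:
    """Example policy: include small payloads, reference large ones."""
    def policy(c: Component) -> Delivery:
        return Delivery.INCLUDE if len(c.payload) <= limit else Delivery.REFERENCE
    return policy


components = [
    Component("text", b"short story", "https://example.com/text"),
    Component("video", b"x" * 1000, "https://example.com/video"),
]
packaged = package(components, size_policy(100))
```

Keeping the policy as a separate function reflects the point made above: the data model stays the same, and each service decides for itself how components are delivered.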
Attachment of metadata from standard and non-standard schemes
Each component of a news-item and the news-item as a whole can have metadata attached to it. Such metadata allows the description of physical properties of the component, information about editorial and intellectual property aspects of the component (author, publisher, owner etc.) and information about the content (what it is about, to whom it may be of interest, its generality, its importance).
NewsML does not prescribe which schemes are to be used to describe content, and hence allows the attachment of metadata from both proprietary and standardised schemes.
To facilitate inter-working the IPTC defines and maintains certain schemes as preferred schemes for use in NewsML – e.g. the IPTC subject reference system.
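The idea of mixing standard and proprietary schemes can be sketched by qualifying every metadata value with the scheme it comes from. The Python below is an illustration under assumptions: the scheme labels (`iptc-subject`, `acme-desk`), the property names and the values are placeholders, not real codes or NewsML constructs.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetaItem:
    """A metadata value qualified by the scheme that defines it."""
    scheme: str  # which vocabulary the value is drawn from
    prop: str    # what is being described, e.g. "subject"
    value: str


def values_for(metadata: List[MetaItem], prop: str, scheme: str) -> List[str]:
    """Return the values of a property as expressed in one particular scheme."""
    return [m.value for m in metadata if m.prop == prop and m.scheme == scheme]


metadata = [
    # Value from a standardised scheme (placeholder, not a real code).
    MetaItem("iptc-subject", "subject", "placeholder-subject-code"),
    # Values from a proprietary, in-house scheme.
    MetaItem("acme-desk", "subject", "flood-coverage"),
    MetaItem("acme-desk", "importance", "high"),
]

standard_subjects = values_for(metadata, "subject", "iptc-subject")
in_house_subjects = values_for(metadata, "subject", "acme-desk")
```

A recipient that understands only the preferred scheme can read the values it recognises and ignore the rest, which is the inter-working benefit described above.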