Implementing an XML Content Management System for Drug Information
ABSTRACT
XML and database technology combine to produce XML e-content that powers Internet, intranet and decision-support systems as well as traditional publishing to print.
1. Introduction
The British National Formulary (BNF) is published jointly by the British Medical Association (BMA) and the Royal Pharmaceutical Society of Great Britain (RPSGB). Editorial work is carried out at the RPSGB and is subsequently validated by a Joint Formulary Committee composed of expert members of the medical and pharmaceutical professions. The BNF is published every six months and is delivered in print and in a range of electronic formats. BNF content is also integrated into clinical support systems. Current electronic deliverables are:
- BNF.org, the BNF website
- Intranet BNF
- Local formulary toolkit (allowing local customisation of content)
- Raw data for integration with selected third party systems
For several years the RPSGB has been publishing the BNF from a relational database using a proprietary-format publishing package. The database has also been used to supply third-party information system vendors with data in a variety of formats, including HTML and tab-delimited database records. This presentation describes a new XML-based publishing system which supports the existing outputs, both printed and electronic, and opens up new opportunities for XML-based products. Future deliverables which will be supported by XML structured content include:
- PDA version of BNF
- Fine-grained data to support electronic prescribing and advanced decision support
2. Requirements
The decision to move to an XML-based production system was taken soon after the publication of the XML 1.0 standard in 1998. Early requirements analysis provided an abstract description of what was required, and publishing staff at the RPSGB observed that this looked a lot like SGML, a technology they had hitherto regarded as too complex to implement with the limited resources available. XML offered the promise of being easier to implement than SGML.
The most important requirement was to deliver content as both documents and data, and XML allowed the RPSGB to handle the content in both these modes. Adoption of XML as a standard in the UK's National Health Service (NHS) wasn't essential to the original plans because XML allowed transformation to other electronic data formats fairly easily. However, XML has since been embraced whole-heartedly by the NHS, driven by the UK Government's e-GIF initiative, which mandates the use of XML and web-based technologies.
Production of the BNF is intended to facilitate:
- re-use of content from one publication cycle to the next and between different publications
- interoperability between the BNF and other electronic publications
- extension of the data scheme for use in more advanced, computer driven decision support
- development of technology-independent knowledge resources
- cultural change from a publication oriented way of working to a knowledge orientation
- promotion of standard representations of drug information
3. XML Production System
3.1. Technical Overview
The existing relational database model has now been extended and modified to combine with an XML schema (DTD), so that XML can be used as the primary medium for editing the BNF content and supplying data to third-party vendors. The relational model combines domain-specific elements from the XML schema, domain-specific meta data webs and editorial meta data used to control the production process.
The new editorial system has been built on top of the relational database as a Java application, using JDBC for database access and an XML-enabled web browser for navigation and viewing of XML fragments. The Java application has in-built form-based editing capability and also launches an integrated XML editor for editing of larger XML fragments that are stored as 'chunks' in the relational database. The Java classes for packing and unpacking XML to/from the relational database use freely available XML parsing utilities and JDBC for reading and writing the database. For print production, the complete XML publication is extracted from the database and fed to a typesetting system.
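The packing and unpacking of XML 'chunks' described above can be illustrated with a minimal sketch. It is not the RPSGB implementation: the element names (`publication`, `monograph`), the `id` attribute and the class name `ChunkCodec` are all hypothetical, and an in-memory map stands in for the database table that the real system would read and write through JDBC.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: the Map stands in for a database table of XML chunks.
final class ChunkCodec {

    /** Unpacks a publication into chunks keyed by the (assumed) id attribute. */
    static Map<String, String> unpack(String publicationXml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc =
                builder.parse(new InputSource(new StringReader(publicationXml)));
            // Note: this matches all descendants, so it assumes monographs are not nested.
            NodeList monographs = doc.getElementsByTagName("monograph");
            Map<String, String> chunks = new LinkedHashMap<>();
            for (int i = 0; i < monographs.getLength(); i++) {
                Element monograph = (Element) monographs.item(i);
                chunks.put(monograph.getAttribute("id"), serialize(monograph));
            }
            return chunks;
        } catch (Exception e) {
            throw new IllegalStateException("Could not unpack publication", e);
        }
    }

    /** Packs the stored chunks, in insertion order, under a single root element. */
    static String pack(Map<String, String> chunks) {
        StringBuilder out = new StringBuilder("<publication>");
        for (String chunk : chunks.values()) {
            out.append(chunk);
        }
        return out.append("</publication>").toString();
    }

    /** Serializes one element to a string without an XML declaration. */
    private static String serialize(Element element) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        identity.transform(new DOMSource(element), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) {
        String xml = "<publication>"
            + "<monograph id=\"m1\"><title>Aspirin</title></monograph>"
            + "<monograph id=\"m2\"><title>Paracetamol</title></monograph>"
            + "</publication>";
        Map<String, String> chunks = unpack(xml);
        System.out.println(chunks.keySet());
        System.out.println(pack(chunks));
    }
}
```

Keeping each chunk as a well-formed fragment means it can be handed to an XML editor on its own and written back without touching the rest of the publication.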
All of the components of the editorial system understand XML and most of them work on both Windows and UNIX platforms, which (on the whole) allows editorial teams to carry on using their preferred operating systems. The only Windows-dependent component is Microsoft IE5, which is used to view the contents of a publication. In theory, this could be replaced by a cross-platform XML viewing component, or viewing could be migrated to a server-based transformation to HTML.
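A server-based transformation to HTML of the kind mentioned above could, for example, be done with XSLT through the standard Java `javax.xml.transform` API. The sketch below is only an illustration under stated assumptions: the stylesheet, the element names (`monograph`, `title`, `indications`) and the class name `HtmlView` are invented for the example and do not come from the BNF DTD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical sketch: renders an XML fragment as HTML on the server side.
final class HtmlView {

    // Assumed stylesheet; a real one would follow the publication's DTD.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/monograph\">"
      + "<div class=\"monograph\">"
      + "<h1><xsl:value-of select=\"title\"/></h1>"
      + "<p><xsl:value-of select=\"indications\"/></p>"
      + "</div>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Applies the stylesheet to one XML fragment and returns the HTML text. */
    static String toHtml(String fragmentXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter html = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(fragmentXml)),
                new StreamResult(html));
            return html.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not transform fragment", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml(
            "<monograph><title>Aspirin</title>"
          + "<indications>Mild to moderate pain</indications></monograph>"));
    }
}
```

Because the browser then receives plain HTML, the viewing step no longer depends on an XML-enabled browser on the editor's desktop.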
3.2. Migration of the Production Process
The BNF is a mission-critical publication for the Royal Pharmaceutical Society, with a new edition published every six months in a continuous editorial and publication cycle. For this reason, careful migration planning was required as the new system was developed, and this heavily influenced the choice of technology used to implement the system. XML was chosen as the best format for development of the information and for easy dissemination to the downstream delivery formats (print, web and database).
The XML DTD was developed to ease the migration from the existing database model to the enhanced model. The DTD was designed in consultation with the BNF editorial and production team and in parallel with the new database model. An automated extraction and conversion process was then developed to extract data from the existing database into XML documents. These documents were then used to set up and test the pagination engine and to load data into the new database schema. The use of XML, existing database licenses, freely available Java class libraries for XML and in-house development resources has enabled the step-by-step migration of the existing publishing process to a new system based on open standards in a cost-effective way, with low risk to the mission-critical process.
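As a rough illustration of an automated conversion step of this kind, the sketch below turns tab-delimited records into an XML document with the standard StAX writer. The three-column layout (id, title, dose), the element names and the class name `LegacyExport` are assumptions made for the example; the actual extraction process and record layout are not described here.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical sketch: converts legacy tab-delimited records to XML.
final class LegacyExport {

    /** Each record is assumed to hold exactly three fields: id, title, dose. */
    static String toXml(List<String> records) {
        try {
            StringWriter out = new StringWriter();
            XMLStreamWriter xml =
                XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("publication");
            for (String record : records) {
                String[] fields = record.split("\t", -1);
                if (fields.length != 3) {
                    throw new IllegalArgumentException(
                        "Expected 3 tab-delimited fields: " + record);
                }
                xml.writeStartElement("monograph");
                xml.writeAttribute("id", fields[0]);
                writeLeaf(xml, "title", fields[1]);
                writeLeaf(xml, "dose", fields[2]);
                xml.writeEndElement(); // monograph
            }
            xml.writeEndElement(); // publication
            xml.flush();
            xml.close();
            return out.toString();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
    }

    /** Writes <name>text</name>; the writer escapes markup characters in text. */
    private static void writeLeaf(XMLStreamWriter xml, String name, String text)
            throws XMLStreamException {
        xml.writeStartElement(name);
        xml.writeCharacters(text);
        xml.writeEndElement();
    }

    public static void main(String[] args) {
        System.out.println(toXml(Arrays.asList(
            "m1\tAspirin\t300 mg",
            "m2\tParacetamol\t500 mg")));
    }
}
```

Rejecting malformed records early, rather than emitting partial elements, makes it easier to validate the converted documents against the DTD before they are loaded into the new schema.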
3.3. Current Status
The XML content management system has been rolled out to the BNF editorial team, who are now validating the conversion from their previous mark-up scheme. In addition to current Pharmaceutical Press publications, most of which are now accessible through the system, other publications have been identified for addition to the database, including the WHO World Model Formulary and Stockley's Drug Interactions. The XML DTD is also being extended to support forthcoming pharmaceutical publications.
The RPSGB is looking at a number of synergies created by this new-found interoperability between internal resources and is working on integration with third-party resources such as the ABPI compendium.
3.4. What problems were encountered along the way?
It took rather longer than expected to get the project moving, mainly because some of the specifications involved (particularly XLink) were not completely stable at the start. The project also turned out somewhat more expensive than expected: low-cost, usable XML tools did not emerge as quickly as had been hoped at the start of the project.
With hindsight, it can be said that there was insufficient planning of the roll-out stage, even though this was identified as the most risky part of the project. Although XML may simplify many issues in a project, many of the traditional software development challenges remain, including the problems inherent in migrating seamlessly from one system to another whilst ensuring continuous operation throughout.
At the start of the project there was no standard model for the representation of drug information, so the RPSGB had to predict the future with regard to vertical standards for interoperability with third-party systems. Even now, there is no single XML vocabulary agreed in this arena, but the use of XML at least ensures that transformation between different vocabularies is achievable.
4. Looking to the Future
The RPSGB now has a standard means of delivering all Pharmaceutical Press content via a generic browser platform and is well positioned to meet the requirements of the new eNHS. At present, there seem to be two streams of activity in healthcare XML, one with a document orientation and one with a message orientation. The RPSGB expects these to converge: messages are just small documents and documents are just big messages. In either case, the XML-based systems at the RPSGB can feed the required systems in the NHS.
XPath, the standard for addressing components of XML documents, is gathering momentum, but there are some performance issues with addressing content in large documents. XLink and XPointer can be combined with XPath to allow links between documents to be stored independently of the documents themselves. This offers some exciting opportunities for transforming the data set of the BNF into a fully interactive, reusable knowledge base. Topic Maps, already an ISO standard and now moving to XML, appear to offer a way to express ontological structures in a standard way, which could be of great significance in the development of new applications of the data now produced in XML by the RPSGB.
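To make the idea of addressing components of a document concrete, the following minimal sketch evaluates a path expression with the standard Java `javax.xml.xpath` API. The document structure (`publication`, `monograph`, `title`, `id`) and the class name `XPathLookup` are hypothetical and serve only as an example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical sketch: addresses one component of an XML document by path.
final class XPathLookup {

    /** Returns the string value of the first node selected by the expression. */
    static String select(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The whole document is parsed on every call, which is one reason
            // addressing content in very large documents can be slow.
            return xpath.evaluate(expression,
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException(
                "Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xml = "<publication>"
            + "<monograph id=\"m1\"><title>Aspirin</title></monograph>"
            + "<monograph id=\"m2\"><title>Paracetamol</title></monograph>"
            + "</publication>";
        System.out.println(
            select(xml, "/publication/monograph[@id='m2']/title"));
    }
}
```

An expression of this kind can be kept outside the document it points into, which is the property that lets links be stored independently of the content they connect.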


