Analysing XML health records
What started out as a quest for a way of auditing free text gradually
turned into a full electronic patient record system. The XML information
system in Oswestry has been live for the past two years and contains half
a million documents covering 50,000 patients. As elsewhere in the world, the
British National Health Service is under tremendous pressure from rising
patient expectations and the inflationary pressures of technological medical
advances. Failures of the system are often in the news, whether these are
related to human fallibility, criminal activity or simply inappropriate
resource allocation. Medical data structured in XML allows examination of
clinical activity with a power and scope never previously possible. Whilst a
static data set can be indexed and searched in context using one of the many
available SGML/XML-aware systems, dynamic data presents a greater challenge.
Introduction
The separation of content from rendition has many advantages which do
not need reiteration here. In 1992, when we first started to develop the
precursor to the Oswestry system, the major advantage was that SGML
would allow us to perform complex searches of large quantities of information
to produce efficient result sets. Gradually we were diverted from our original
mission as clinicians saw electronic delivery of their data and said "yes,
that's the way it should look". Firing small XML documents into a browser over
the hospital network gives apparently instantaneous access to the records,
which in turn makes it possible to transfer some medical activities to the
electronic medium.
As the document repository became a sizeable mass of clinical information
we again started to examine the strengths of
XML
with respect to searching. When presenting the Oswestry system to groups of
information technologists, a favourite pastime is to show the input tools and
the browser with all its tricks and frills and then ask "could you do this
with HTML?" Usually there are some nervous looks, and then one hand goes up,
followed by a few more. The answer, up to that point, is that to an extent it
is possible. Where, then, are the strengths of XML which represent unique
advantages, making this technology irresistible for system implementers?
In Oswestry we have a number of legacy systems which we are not able
to replace with open HL7-aware products. XML's ability to capture
the structured output of these systems and pass it to a repository, or
even to another non-compliant system, is unique in its flexibility. With
XML around, who would ever try EDI (Extremely Difficult Interfacing)
as a method of linking two disparate systems? The specification for
EDI runs to 2,7000 pages whilst that for
XML to 26! The ability to transform data is
not generally appropriate for health care text data but can be useful for
laboratory results, where different units and scales need reconciling, and where
drug doses may need converting to milligrams of drug per kilogram of bodyweight.
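The dose conversion mentioned above can be illustrated with a minimal sketch. It assumes that a dose arrives as a number with a unit label and that the bodyweight is already recorded in kilograms; the unit labels and the function name `dose_mg_per_kg` are hypothetical and are not taken from the Oswestry system.

```python
# Conversion factors to milligrams. The unit labels are illustrative assumptions.
MG_PER_UNIT = {"g": 1000.0, "mg": 1.0, "microgram": 0.001}


def dose_mg_per_kg(dose: float, unit: str, weight_kg: float) -> float:
    """Normalise a dose to milligrams, then divide by bodyweight in kilograms."""
    if weight_kg <= 0:
        raise ValueError("bodyweight must be positive")
    dose_mg = dose * MG_PER_UNIT[unit]
    return dose_mg / weight_kg


# A 1.5 g dose given to a 75 kg patient is 20 mg per kg.
print(dose_mg_per_kg(1.5, "g", 75.0))
```

Keeping the conversion factors in one table means that a new unit from a laboratory or pharmacy feed needs only one additional entry.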
In-context searching is the other
killer app which
XML brings to healthcare
in a formulation which combines relative affordability with the power necessary
to enable effective management of the medical process. It is not possible
to manage the clinical process unless you have the ability to examine the
medical record. The progress made so far in our efforts to develop an effective
search capability in
XML and the
scope for future development are the subject of this paper.
Static or dynamic data
In 1996 we were fortunate to be funded by the Information Management
Group of the
NHS to conduct
a study of live clinical information collection with arrangements for static
presentation of the resulting data as an anonymised document collection. We
chose SoftQuad's Explorer product as the vehicle for delivery of the fully
indexed and searchable collection. Partial records covering 700 patients were
collected during a three month live phase of the trial. We were able to demonstrate
the principles of in-context searching of clinical data.
The following year Graphnet built a series of markup engines to apply
markup to four years' worth of legacy Word documents consisting of clinic notes,
operation notes and ward round notes. A Q&A database of 13,000 discharge
summaries was marked up, as was information from the pharmacy and physiotherapy
'stand-alone' databases at the hospital. The legacy extraction produced a
large quantity of data which could be queried to answer some
preliminary questions about the scalability of searching. We stored the data
as a pre-indexed repository within Inso's Dynatext environment. Even complex
text searches could be undertaken with the Dynatext search engine. If combinations
of numbers such as dates were searched for, however, the system slowed down very
significantly because the indexing did not take these into account. The ability
to perform complex transformations on numerical data will be important for the
final system. For example, "find all patients aged over sixty who had an operation
last year" requires not only date range functions but also some manipulation
of the patient's date of birth and the operation date to give the correct
result set.
XML data types offer
a significant advantage over
SGML
in handling dates in a robust fashion.
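The "aged over sixty who had an operation last year" query can be sketched as follows. This is an illustration only: it assumes dates are held in the ISO 8601 form (YYYY-MM-DD) used by the XML Schema date type, and the element and attribute names (`patient`, `dob`, `operation`, `date`) are hypothetical rather than those of the Oswestry DTDs.

```python
from datetime import date
import xml.etree.ElementTree as ET

# Hypothetical sample records; element names are assumptions for illustration.
SAMPLE = """
<patients>
  <patient id="A"><dob>1930-05-01</dob><operation date="1999-03-10"/></patient>
  <patient id="B"><dob>1950-01-01</dob><operation date="1999-07-01"/></patient>
  <patient id="C"><dob>1925-02-02</dob><operation date="1997-11-20"/></patient>
</patients>
"""


def age_on(dob: date, when: date) -> int:
    """Completed years of age on a given day."""
    had_birthday = (when.month, when.day) >= (dob.month, dob.day)
    return when.year - dob.year - (0 if had_birthday else 1)


def over_sixty_operated_in(xml_text: str, year: int) -> list:
    """Return ids of patients aged over sixty at an operation in the given year."""
    found = []
    for patient in ET.fromstring(xml_text).iter("patient"):
        dob = date.fromisoformat(patient.findtext("dob"))
        for operation in patient.iter("operation"):
            op_date = date.fromisoformat(operation.get("date"))
            if op_date.year == year and age_on(dob, op_date) > 60:
                found.append(patient.get("id"))
                break  # one qualifying operation is enough
    return found


print(over_sixty_operated_in(SAMPLE, 1999))  # ['A']
```

The age is calculated at the date of the operation rather than at the date of the search, which is one of the manipulations a plain text index cannot perform.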
The current system, which has been operating live in Oswestry for the
past two years, differs from the previous two search demonstrators in that
documents are added continuously during the working
day. Saving a copy every night and indexing that might be a solution,
but two major difficulties led us away from that approach.
Firstly, even if one has a repository with the 500,000 documents in it as a
static data set, the registration process to turn the data into a fully indexed
searchable repository is currently neither quick nor automatic. A second
consideration is that some of the most interesting events are the most recent
ones, and the frequency of the downloads would have to balance the need
for up-to-date information against practical considerations. We eventually decided
upon a relational solution with pre-indexing, building extra components to
allow true cross-document searching.
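One way to picture a relational pre-index is sketched below. It is not the Graphnet implementation, whose design is not described here; it simply assumes that each document is indexed at the moment it is added, with one row per word recording the element in which the word occurred. The table and function names are hypothetical.

```python
import re
import sqlite3
import xml.etree.ElementTree as ET


def open_index() -> sqlite3.Connection:
    """Create an in-memory index table: one row per (document, element, word)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word_index (doc_id TEXT, element TEXT, word TEXT)")
    conn.execute("CREATE INDEX by_word ON word_index (word, element)")
    return conn


def direct_text(elem: ET.Element) -> str:
    """Text that belongs directly to an element, excluding text of its children."""
    parts = [elem.text or ""]
    parts.extend(child.tail or "" for child in elem)
    return " ".join(parts)


def index_document(conn: sqlite3.Connection, doc_id: str, xml_text: str) -> None:
    """Index a document as it arrives, so searches never wait for a nightly batch."""
    rows = []
    for elem in ET.fromstring(xml_text).iter():
        for word in re.findall(r"[a-z0-9]+", direct_text(elem).lower()):
            rows.append((doc_id, elem.tag, word))
    conn.executemany("INSERT INTO word_index VALUES (?, ?, ?)", rows)
    conn.commit()


def documents_with(conn: sqlite3.Connection, word: str, element: str) -> list:
    """Cross-document search: every document with the word in the given element."""
    cursor = conn.execute(
        "SELECT DISTINCT doc_id FROM word_index "
        "WHERE word = ? AND element = ? ORDER BY doc_id",
        (word.lower(), element),
    )
    return [row[0] for row in cursor]


conn = open_index()
index_document(conn, "doc1", "<clinicnote><history>Painful left hip</history>"
                             "<plan>Total hip replacement</plan></clinicnote>")
index_document(conn, "doc2", "<opnote><procedure>Hip replacement performed"
                             "</procedure></opnote>")
print(documents_with(conn, "hip", "plan"))  # ['doc1']
```

Because the index rows are written in the same step as the document, the newest records are searchable immediately, which addresses the second difficulty described above.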
Source granularity
We have followed a very traditional implementation model for
SGML, starting with the development of an
understanding of the structure of the information needed to support the clinical
processes within the hospital. Secondly we developed a system for capturing
that structure efficiently from dictated text and marking it up in
XML. The final step is to process and render
the documents and the data they contain. We generally did not increase the
granularity of the data beyond that which comes naturally to clinicians. A clinic
note would have history, examination, radiographs and opinion & plan elements,
which follow the
SOAP model of medical records. Where patients were to be admitted
to hospital or operations were planned we offered the option of adding extra
granularity to the record by using elements to indicate various details of
the planned admission. An example of extra granularity, added because we had
the ability to re-process our documents, was an inpatient post-operative
instruction element and a separate one for post-operative outpatient
instructions, because these were aimed at different groups of carers.
Markup designed to enhance search capabilities consists of text fields
for diagnoses, complications and co-morbidity. The code word elements also
have attributes to enable codes to be attached from a thesaurus for subsequent
processing although the DTDs generally favour the use of elements rather than
attributes where extra information is required. We have not made the diagnosis
keywords mandatory but are planning to influence the dictation behaviour of
clinicians by feeding back their coding rates relative to their anonymised
peer group. Where an organisation must reliably collect information, making
the relevant elements mandatory is necessary, but the cultural change to make
clinicians understand and commit to entering the information must
precede the change to the markup. Altering the data structures as a driver of
change will simply alienate the users from the information system.
Examples in practice
Our current search ability is limited by a variety of factors which
are being progressively resolved. The major limitation at present is the presence
of some non-XML text entry systems; we are not able to undertake our
final and definitive extract of legacy data until these systems have been
replaced. Any search currently undertaken therefore cannot provide a full
examination of the hospital's clinical information. Even when we have a final
legacy extract taking our text data back to 1994, the ability to apply markup to
historical text is somewhat limited and generally allows identification only of
the type of document, the author and the secretary, as well as the date of the
document. For operation notes the anaesthetist is also often identifiable in an
automated legacy extract. Searches including data from the legacy source are thus
less accurate, because they are less focussed than those conducted on
prospectively marked up data.
An example of the use of the system so far was to search for occurrences
of operations where the Lautenbach procedure was performed. This is a technique
of dealing with severe bone and joint infections which uses large quantities
of extremely expensive antibiotics. The pharmacy budget was being significantly
stretched by the use of this technique but we had no way of discovering the
number of cases which had had Lautenbach operations because there is no specific
code for the procedure in our hospital's coding system. A search on "Lautenbach
inside <opnote>" revealed 27 cases. A further search for misspellings with
wildcards within and at each end of the word failed to reveal any missed cases.
We were able to identify the cases by surgeon, so that the surgeon performing
most of these cases could be brought into a discussion as to the best
way to recoup costs through agreements with the purchasing Health Authorities.
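The shape of an in-context search such as "Lautenbach inside <opnote>", followed by a breakdown by surgeon, can be sketched as below. This is not the search engine used in Oswestry; the sample documents, the `surgeon` and `procedure` element names and the function names are assumptions made for illustration. Shell-style wildcards stand in for the misspelling search.

```python
from collections import Counter
from fnmatch import fnmatchcase
import xml.etree.ElementTree as ET

# Hypothetical documents; only the <opnote> name comes from the search above.
DOCS = [
    "<opnote><surgeon>Smith</surgeon>"
    "<procedure>Lautenbach procedure, left femur</procedure></opnote>",
    "<opnote><surgeon>Smith</surgeon>"
    "<procedure>Lautenbach irrigation of tibia</procedure></opnote>",
    "<opnote><surgeon>Jones</surgeon>"
    "<procedure>Total hip replacement</procedure></opnote>",
    "<clinicnote><history>Previous Lautenbach procedure</history></clinicnote>",
]


def matches_inside(xml_text: str, pattern: str, element: str) -> bool:
    """True if any word inside the named element matches the wildcard pattern."""
    root = ET.fromstring(xml_text)
    for elem in root.iter(element):  # iter() includes the root if its tag matches
        words = " ".join(elem.itertext()).lower().split()
        if any(fnmatchcase(word.strip(".,;:"), pattern) for word in words):
            return True
    return False


def cases_by_surgeon(docs: list, pattern: str) -> Counter:
    """Count matching operation notes per surgeon."""
    counts = Counter()
    for xml_text in docs:
        if matches_inside(xml_text, pattern, "opnote"):
            surgeon = ET.fromstring(xml_text).findtext("surgeon", default="unknown")
            counts[surgeon] += 1
    return counts


print(cases_by_surgeon(DOCS, "lautenbach"))    # exact spelling
print(cases_by_surgeon(DOCS, "*l*utenb*ch*"))  # wildcards to catch misspellings
```

The clinic note that mentions a previous Lautenbach procedure is not counted, because the match is restricted to text inside an operation note; that restriction is what distinguishes an in-context search from a plain text search.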
Sophisticated enquiries require an iterative search method to produce
the correct levels of specificity and sensitivity for any given characteristic
of interest. Once a search has been built then there needs to be the ability
to run the search on a regular basis to give a continuous monitoring of the
quality of care. A flagging system can be used where limits are set for acceptable
performance and any exceptions are reported by e-mail to the person responsible
for monitoring clinical quality.
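A flagging step of this kind might look like the following minimal sketch. The indicator names, the limits and the recipient address are all hypothetical, and the message is only constructed, not sent; delivery would depend on the local mail arrangements.

```python
from email.message import EmailMessage


def find_exceptions(observed: dict, limits: dict) -> dict:
    """Return the indicators whose observed value exceeds the agreed limit."""
    return {
        name: value
        for name, value in observed.items()
        if name in limits and value > limits[name]
    }


def build_alert(exceptions: dict, recipient: str):
    """Build an e-mail listing the exceptions, or None if all is within limits."""
    if not exceptions:
        return None
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = f"{len(exceptions)} clinical indicator(s) outside limits"
    lines = [f"{name}: {value}" for name, value in sorted(exceptions.items())]
    message.set_content("\n".join(lines))
    return message


# Hypothetical figures from a regularly repeated search.
observed = {"infection_rate": 0.04, "readmission_rate": 0.02}
limits = {"infection_rate": 0.03, "readmission_rate": 0.05}

alert = build_alert(find_exceptions(observed, limits), "quality.lead@example.org")
if alert is not None:
    print(alert["Subject"])
    print(alert.get_content())
```

Running the saved search on a schedule and passing its results through a check like this gives the continuous monitoring described above without anyone having to read every result set.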
Lessons from system failures
Both in the United States and the United Kingdom there is extensive
evidence of significant numbers of patients coming to harm as a result of
their treatment. The National Academy of Sciences estimated that between 44,000
and 90,000 Americans die each year as a result of errors in treatment. Approximately
7,000 of these deaths are connected with drug administration errors. In the
United Kingdom a recent crisis in public confidence in medical care arose
because a children's cardiac surgical unit pushed on with a programme of
treatment in spite of evidence that its results were less good than could
be expected. Even with an effective internal clinical data monitoring system
in place, the cardiac surgical tragedy might still have occurred; review of
the results therefore needs to be conducted externally to the audited organisation.
Lessons from criminal activity
The recent criminal trial of Dr Harold Shipman revealed the lack of information
available to assess the activities of self-employed family doctors. Dr Shipman
was convicted of the murder of a number of his elderly patients, either at
his surgery or at their homes. Although various people had raised concerns over
the previous few years, he was caught only when he forged the will of one of
his victims. The Health Minister promised a thorough review of
all aspects of the case, including why those who should have been reviewing
Dr Shipman's clinical activity failed to act. The simple fact is that those
responsible for Dr Shipman had no information on which to act.
The Electronic Health Record proposed by the
NHS
Management Executive would have allowed a regular review of death rates for
each family doctor so that the consistent excess mortality produced by Dr
Shipman's criminal activities could have been investigated. The Electronic
Health Record is a summary record which contains a birth to death log of events
and needs to contain marked up death certificates so that the final outcome
can be analysed in the context of the treatment received. Ultimately, catching
the very occasional criminal will be much harder than identifying substandard
practice, because of the covert way in which a premeditated criminal will cover
his tracks.
Lessons from resource mis-allocation
All healthcare systems are under intense funding pressure and any inappropriate
expenditure will lead to the overall benefit delivered to a community being
less than optimal. Within our hospital's region there is a fourfold variation
in the rates of total hip replacement. The disparity in hip replacement rates
implies that in some areas patients who would benefit are going without surgery
whilst in other areas patients are being operated on inappropriately. To address
inequalities in resource allocation there is a need to record disability or
disease scale scores routinely, so that data can
be found on the relative levels of pre-operative disability in the areas of
under- and over-provision. A simple reliance on records entered as a result
of routine care will be insufficient to produce an effective answer concerning
clinical activity. If the organisation requires the extra granularity brought
about by the use of score systems, then the culture needs changing to one of
acceptance before the markup is changed to mandate scoring.
Conclusions
- Efficient high quality medical care requires analysis of the clinical
record
- The bulk of the medical record is text and only truly accessible
with SGML or XML
- Areas of benefit include quality assurance, resource management
and research
- The quality of search depends upon the level of markup available
and the validation of the element contents by the inputting system
- Searches must be constructed with a thorough knowledge of the data
structure and the limitations of the search system
Acknowledgements
I wish to thank the Electronic Patient Record Project Board for their
funding of the two Oswestry pilot trials.
Graphnet Computer Services Limited built the search engine and provided
legacy data extraction to give the large body of data which was needed to
develop the searching system.