Topic Maps for repositories
Kal Ahmed


Abstract
No abstract was provided for this paper.

Contents
  1. Introduction
  2. Current repository navigation techniques
    1. Hierarchical browsing
    2. Searching
  3. Using Topic Maps for repository navigation
    1. Associative browsing
    2. Topic Map querying
  4. Topic cartography
    1. The system Topic Map
    2. The semantic Topic Map
    3. The user-defined Topic Map
      1. Individual workspaces
      2. Shared workspaces
      3. Knowledge management
  5. Topic Map GUIs
    1. Topic Map GUI approaches
  6. Conclusion

Introduction
This paper discusses the potential application of Topic Maps as an interface to a multi-user document repository; presents some possible implementation approaches to creating Topic Maps for a repository; and finally demonstrates some graphical tools for Topic Map navigation and creation.
Current repository navigation techniques
Hierarchical browsing
A common feature of nearly every repository system is the use of a hierarchy of nested containers for organising and navigating through content. Typically the structure of any given sub-tree of a hierarchy is defined by a single user and used by all other users with an interest in the content stored therein. Users of such a hierarchy are therefore constrained by the structure imposed by the administrator. For small repositories, or repositories used by a single person, such an organisation may work well: most users are relieved of the task of organising the content and need only learn where to look for the items of interest.
As a corpus increases in size, it gets progressively harder for a user to learn the structure of the hierarchy unless it matches the way in which that user mentally organises or works with the content. Of course, not all systems are rigidly controlled by a single administrator; many give users the freedom to create and manage a sub-tree of the hierarchy. This simply leads to more confusion: without a pre-arranged classification system, any coherence in the organisational structure is lost as multiple organisational criteria are squeezed into a single system.
Searching
Browsing is not the only way to find content in a repository - most repositories also support searching. Most often this is provided for the benefit of those who consume the data, rather than for those responsible for creating and maintaining it - giving users a way to completely bypass the structural organisation of the data. However, the results of the query can only be as good as the query itself. An under-specified query will result in an unmanageably large set of hits, and an over-specified query might miss a piece of relevant content. Furthermore, a query across repositories of differing types requires that those repositories define a common set of meta-data with commonly agreed semantics for a combined search to return meaningful results.
Using Topic Maps for repository navigation
Associative browsing
Figure 1. Associative browsing with a Topic Map
Topic Maps provide the casual browser of the repository with a richly cross-linked structure over the repository content.
Topic occurrences create 'sibling' relationships between repository objects. A single object may be an occurrence of one or more topics, each of which may have many other occurrences. When a user finds or browses to a given repository object, this sibling relationship enables them to rapidly determine where there are other objects regarding the same topic as the current one. Topic associations create 'lateral' relationships between subjects, allowing a user to see what other concepts covered by the repository are related to the subject of current interest and to easily browse to them. Associative browsing allows an interested data consumer to wander across a repository in a guided manner. A user entering the repository via a query might also find associative browsing useful in increasing the chance of serendipitous discovery of relevant information.
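The two kinds of relationship can be illustrated with a minimal sketch. The data, the function names and the dictionary-based representation below are assumptions made for illustration only; they are not taken from any particular Topic Map engine.

```python
# Hypothetical data: each topic maps to the repository objects that are
# occurrences of it.
occurrences = {
    "xml": {"doc/spec.xml", "doc/tutorial.html"},
    "topic-maps": {"doc/tutorial.html", "doc/paper.pdf"},
}

# Hypothetical typed associations between topics: (topic, type, topic).
associations = [("topic-maps", "is-encoded-in", "xml")]


def topics_of(obj):
    """Topics that have the given repository object as an occurrence."""
    return {topic for topic, objs in occurrences.items() if obj in objs}


def siblings(obj):
    """'Sibling' objects: other occurrences of each topic of obj."""
    return {t: sorted(occurrences[t] - {obj}) for t in sorted(topics_of(obj))}


def related_topics(topic):
    """'Lateral' moves: topics linked to the given topic by an association."""
    found = set()
    for left, kind, right in associations:
        if left == topic:
            found.add((kind, right))
        elif right == topic:
            found.add((kind, left))
    return sorted(found)


tutorial_siblings = siblings("doc/tutorial.html")
xml_neighbours = related_topics("xml")
```

Starting from `doc/tutorial.html`, `siblings` lists the other objects filed under each of its topics, and `related_topics` gives the topics a user could browse to next.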
Topic Map querying
Figure 2. Topic Map querying
A Topic Map can be used to provide a useful higher-level abstraction across one or more repositories. Topic Maps provide a number of useful features for query-based access to the repository:
Topic cartography
The creation of useful Topic Maps should become a prime concern to the creators of large corpora. Strategies for the creation of such maps would be driven by the requirements of the Topic Map user and by the constraints of the authoring environment. Broadly speaking, 3 types of Topic Map can be identified:
The system Topic Map
The System Topic Map is a Topic Map which represents the structure of the underlying repository. Characteristics of repository objects are directly mapped to Topic Map constructs - these include such characteristics as the location of the object and object meta-data. Such a mapping could be made dynamically by an agent interposed between the topic map engine and the underlying repository and may be combined with other topic maps on-the-fly by a processing application.
Figure 3. The system Topic Map
A key use of a System Topic Map would be in creating a bridge between the repository and the Topic Map environments. It may be easier for someone used to navigating through the repository directly to get used to a Topic Map view of a repository if there are 'landmarks' which map directly to the underlying structure. Where time and effort have been spent in creating a hierarchical organisation of data in a repository, the System Topic Map provides a portable means for capturing the result of that effort. Many organisations have already made the choice to store portable data (XML, SGML etc.), but a move to a new repository can lose all of the effort and knowledge encapsulated in the organisation and repository-level meta-data associated with the content.
A further use for a System Topic Map is in combining multiple repositories of the same type into a single 'virtual' repository which can be browsed seamlessly. A single topic map application could communicate with and merge the output of multiple system topic map engines.
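As a rough sketch of both ideas, the code below maps the objects of a repository to topics and then merges two such maps into one 'virtual' repository. Everything here is an assumption made for illustration: the function names, the use of relative paths as topic identifiers, and the naive rule that topics with the same path are merged. A real engine would apply the merging rules of the Topic Map standard instead.

```python
from pathlib import PurePosixPath


def system_topic_map(repo, objects):
    """Build a topic map for one repository.

    objects maps a relative path to a dict of repository meta-data.
    Containers become 'landmark' topics; objects become topics whose
    occurrence is their (repository, path) location.
    """
    topics = {}
    for path, meta in objects.items():
        parts = PurePosixPath(path).parts
        for depth in range(1, len(parts)):
            folder = "/".join(parts[:depth])
            topics.setdefault(
                folder, {"type": "container", "occurrences": set(), "facets": {}}
            )
        topics[path] = {
            "type": "object",
            "occurrences": {(repo, path)},
            "facets": dict(meta),  # repository meta-data carried over as facets
        }
    return topics


def merge(*maps):
    """Naively merge topic maps: topics with the same identifier are unified."""
    merged = {}
    for topic_map in maps:
        for topic_id, topic in topic_map.items():
            target = merged.setdefault(
                topic_id, {"type": topic["type"], "occurrences": set(), "facets": {}}
            )
            target["occurrences"] |= topic["occurrences"]
            target["facets"].update(topic["facets"])  # later maps win on clashes
    return merged


repo_a = system_topic_map("repoA", {"projects/alpha/plan.xml": {"author": "kal"}})
repo_b = system_topic_map("repoB", {"projects/beta/notes.xml": {"author": "ann"}})
virtual = merge(repo_a, repo_b)
```

After the merge, the shared `projects` container appears once, with the content of both repositories beneath it.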
The semantic Topic Map
The Semantic Topic Map is generated by automatically extracting meaning from the content of the repository and representing the connections made by analysis of that meaning as a Topic Map. Whereas in a System Topic Map the topics represent repository objects, in a Semantic Topic Map the topics represent concepts described by one or more repository objects.
Content analysis may be driven simply by such characteristics as document structure, meta-data or contained hyper-links. Typically a well-marked up, well cross-linked corpus will generate a good Semantic Topic Map. A more complex approach might make use of linguistic analysis to extract meaning from the textual content of documents.
Figure 4. Semantic Topic Map generation
While it is possible that some semantic analysis could be done 'on-the-fly', the processing overhead of some of the more advanced forms of analysis might make Semantic Topic Map generation an asynchronous process. The rules used for the semantic analysis are, in themselves, an important form of knowledge as they encode the way in which the relationships between repository objects are inferred from their content by users of the repository. Some semantic information may be encoded as associations between topics or as topic-occurrence relationships. Other semantic information may be extracted which applies to just a single repository object and this information may be represented using a facet.
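The simplest form of such analysis, driven only by meta-data and hyper-links, might look something like the sketch below. The corpus, the `keywords` field, the 'refers-to' association type and the 'link-count' facet are all hypothetical, chosen only to show the three kinds of output described above.

```python
import re

# Hypothetical corpus: document name -> keywords meta-data and body text.
documents = {
    "intro.html": {
        "keywords": ["topic maps"],
        "body": 'Topic Maps are interchanged as <a href="xml.html">XML</a>.',
    },
    "xml.html": {"keywords": ["xml"], "body": "A markup language."},
}

HREF = re.compile(r'href="([^"]+)"')


def semantic_topic_map(docs):
    """Derive concepts, associations and facets from meta-data and links."""
    occurrences = {}      # concept -> documents that describe it
    associations = set()  # (concept, 'refers-to', concept)
    facets = {}           # per-document information

    # Rule 1: each keyword is a concept; the document is an occurrence of it.
    for name, doc in docs.items():
        for keyword in doc["keywords"]:
            occurrences.setdefault(keyword, set()).add(name)
        # Information about a single object is recorded as a facet.
        facets[name] = {"link-count": len(HREF.findall(doc["body"]))}

    # Rule 2: a hyper-link between documents associates their concepts.
    for name, doc in docs.items():
        for target in HREF.findall(doc["body"]):
            if target not in docs:
                continue  # ignore links that leave the corpus
            for source_concept in doc["keywords"]:
                for target_concept in docs[target]["keywords"]:
                    if source_concept != target_concept:
                        associations.add(
                            (source_concept, "refers-to", target_concept)
                        )
    return occurrences, associations, facets


concepts, links, doc_facets = semantic_topic_map(documents)
```

The two rules in the function are the kind of encoded knowledge the paragraph above refers to: changing them changes the map that is generated from the same corpus.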
When used to generate an index for a corpus, a Semantic Topic Map provides indexing features above and beyond those of a standard static index. Scopes provide a means of quickly creating domain-specific indices; combining multiple domain-specific indices on demand would enable each user of the same corpus to create their own personalised index of that corpus. Facets provide easily searchable meta-data. Associations provide rich, typed cross-linking between conceptual areas.
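A minimal sketch of scoped indexing follows, assuming each index entry is tagged with a single theme. The entries and the `index_for` function are hypothetical and stand in for whatever scoping mechanism a real engine offers.

```python
# Hypothetical index entries: (topic name, theme, occurrence).
entries = [
    ("mercury", "astronomy", "planets.xml"),
    ("mercury", "chemistry", "elements.xml"),
    ("venus", "astronomy", "planets.xml"),
    ("lead", "chemistry", "elements.xml"),
]


def index_for(entries, themes):
    """Build an index restricted to the themes a user has selected."""
    index = {}
    for name, theme, occurrence in entries:
        if theme in themes:
            index.setdefault(name, set()).add(occurrence)
    return {name: sorted(found) for name, found in sorted(index.items())}


astronomy_index = index_for(entries, {"astronomy"})
personal_index = index_for(entries, {"astronomy", "chemistry"})
```

Selecting one theme gives a domain-specific index; selecting several combines them into a personalised index over the same corpus.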
The user-defined Topic Map
The User-Defined Topic Map provides an individual with a means of creating their own perspective on a set of data. User perspectives may be:
Organisational: The perspective maps the repository to enhance location/retrieval of data.
Knowledge-driven: The perspective adds value to the data by asserting associations between repository objects based on some deeper understanding of the concepts represented by those objects.
Task-driven: The data is organised according to the user's work processes.
User-defined Topic Maps have potential application in 3 areas:
Individual workspaces
Using a Topic Map to create an individual workspace gives the user a means of better managing access to frequently used documents and of organising data in multiple ways. Topic Maps can be used to create logical paths from an abstract concept to a specific document in a way which more closely matches the way the user thinks. Tools are needed to make the construction, maintenance and navigation of such Topic Maps as easy as possible and to integrate as tightly as possible with the user's day-to-day tools and processes. Topic Maps allow a user to relate single data instances to multiple subject areas, such as a standard text referenced from multiple projects. Topic Maps also give the application the freedom to link to resources in other tools (such as email, PIM systems and remote documents), enabling the user to pull information from many disparate sources into a single coherent set for their use.
Tools are already available that aid in this form of personal organisation. Topic Maps may be used as an interchange format between such products and/or platforms - for example moving my mind map from my PC to my Palm and back or creating a 'mobile' workspace on an Internet-accessible site that can travel with me.
Shared workspaces
Shared workspaces enable users to share knowledge by communicating to each other the associations and relationships between data instances. Multiple Topic Maps may be combined with relatively little effort to quickly generate a composite view of the same data set. Topic Maps created by individual users can thus be shared across an organisation, enabling many other users to gain the insights and benefit from the knowledge encoded in the Topic Map. As with any Topic Map application, data instances may be in a repository or located elsewhere within or outside the organisation, as long as they can be addressed in some way.
When users share their workspaces, Topic Map merging rules and additional scoping using added themes can be used to ensure that the perspectives of different people are combined only to the degree desired by the end-user.
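The effect of added themes can be sketched as follows, assuming each user's workspace is a set of associations and that the added theme is simply the name of the user who contributed it. The names `share` and `merged_view` are hypothetical, and the set union used here is a stand-in for the standard's merging rules, not an implementation of them.

```python
def share(workspace, owner):
    """Add a theme naming the owner to every association in a workspace."""
    return {(left, kind, right, owner) for left, kind, right in workspace}


def merged_view(shared, trusted):
    """Combine only those associations whose theme the end-user has chosen."""
    visible = {
        (left, kind, right)
        for left, kind, right, theme in shared
        if theme in trusted
    }
    return sorted(visible)


# Hypothetical workspaces belonging to two users.
alice = {("spec.xml", "supersedes", "draft.xml")}
bob = {
    ("spec.xml", "implements", "req.doc"),
    ("spec.xml", "supersedes", "draft.xml"),
}

shared = share(alice, "alice") | share(bob, "bob")
only_alice = merged_view(shared, {"alice"})
everyone = merged_view(shared, {"alice", "bob"})
```

An association asserted by both users appears once in the combined view, while the choice of themes lets the end-user decide whose perspective is included.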
Knowledge management
Topic Maps can be used to encode ontologies prepared by one or more subject matter experts. Such a map may be used simply to transfer an ontology from one tool to another, or as a 'publishing medium' for an ontology. A topic map engine combined with other analysis tools (such as linguistic analysis tools) could be used to automatically annotate documents according to a given ontology and record the resulting annotation as a Topic Map. Again, Topic Map merging rules could be used to generate composite or comparative views of the same data set using different ontologies or analysis methods.
Topic Map GUIs
Topic Maps may be statically published (as a subject index, for example) or more dynamically displayed to the user. For the types of 'workspace' applications described above, an intuitive GUI is a key requirement for success. Topic Maps enable users to create large quantities of meta-data and highly interconnected sets of data. The challenge for a GUI is to present this graph and the associated meta-data in a readily interpretable manner.
Data visualisation techniques are gradually entering the mainstream. As graphics hardware prices continue to fall and new software becomes available, building a compelling Topic Map GUI is becoming easier.
Topic Map GUI approaches
Topic Maps are essentially interconnected graphs with (potentially) many dimensions of meta-data. There are a number of approaches to the visualisation of such data already in the commercial domain:
Hyperlinked-Trees: A graph can be interpreted as a hierarchical tree relationship with additional hyper-links between nodes. Topic Maps support this type of visualisation because of the hierarchical relationship between topics, associations and their types, and the containment relationship between topics and occurrences and between associations and association roles.
Standard GUI tools are capable of displaying this kind of tree; STEP's on-line topic maps (http://www.topicmaps.com) are one example. Distorted tree visualisations such as InXight's Hyperbolic Tree Browser (http://www.inxight.com/demos/ht/index.html) enable a larger proportion of the hierarchy to be displayed in the same amount of screen real-estate as a traditional tree browser.
Graphs: Graph visualisation displays the Topic Map as a set of interconnected nodes. A static graph visualisation simply displays the nodes with their interconnections. Dynamic graph visualisations limit the scope of vision of the user to the node of interest and all nodes within a certain distance. As the user shifts focus from node to node, the display of the graph changes interactively. This form of visualisation enables all of the connections in the Topic Map to be displayed more equally, rather than privileging a dominant hierarchical relationship, and at the same time avoids overwhelming the user with the quantity of information contained in the Topic Map.
Mind mapping tools such as the Brain from Natrificial (http://www.thebrain.com) use this form of display quite effectively.
Landscapes: An interesting data visualisation technique is to display interconnected information as a landscape, assigning coordinates to topics according to their interconnections and height to coordinates according to the degree of relevance or the degree of convergence of multiple topics. This is the approach used by Cartia's ThemeScape product (http://www.cartia.com) to create the NewsMaps web-site (http://www.newsmaps.com) - not a Topic Map application, but the potential is there.
Worlds: The data model of Topic Maps seems to lend itself well to the construction of three-dimensional spaces. Topics may be assigned coordinates in 3D space according to specific characteristics. A static 3D world enables the user to 'fly through' the Topic Map; to learn the 'lie of the land'; to meet other users browsing through the same map; or even to bookmark frequently visited locations. A dynamic 3D world would respond to the user's movements, bringing 'most relevant' topics nearer and moving others further away as the user's focus changes. Three-dimensional worlds are already implemented as glorified chat-rooms (http://www.activeworlds.com); perhaps Topic Maps provide a framework for putting these worlds to serious use.
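The dynamic graph approach described under Graphs above depends on selecting the node of interest and every node within a certain distance of it. A breadth-first search is one straightforward way to do this. The sketch below assumes the Topic Map has already been reduced to a plain adjacency list; the graph data and the `neighbourhood` function are hypothetical.

```python
from collections import deque


def neighbourhood(graph, focus, max_distance):
    """Return {node: distance} for nodes within max_distance of focus."""
    distances = {focus: 0}
    queue = deque([focus])
    while queue:
        node = queue.popleft()
        if distances[node] == max_distance:
            continue  # do not expand beyond the visible horizon
        for neighbour in graph.get(node, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


# Hypothetical topic graph as an adjacency list.
graph = {
    "topic maps": ["xml", "sgml"],
    "xml": ["topic maps", "xslt"],
    "sgml": ["topic maps"],
    "xslt": ["xml", "css"],
    "css": ["xslt"],
}

visible = neighbourhood(graph, "topic maps", 1)
after_refocus = neighbourhood(graph, "xslt", 1)
```

Re-running the search whenever the user shifts focus yields the new set of nodes to draw, and the recorded distances could be used to size or fade nodes further from the focus.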
Conclusion
While the current focus for Topic Navigation Maps is on the creation of static publication indexes, there is significant scope for the use of the Topic Navigation Map standard in 'indexing' more dynamic data and in providing an organisational construct on top of one or more repositories.
Topic Map meta-data and facets provide a means of creating a common index across multiple repositories, allowing searching and browsing applications to treat many disparate repositories as a single virtual repository. Topic Map merging and scoping rules facilitate the sharing of individual Topic Maps, allowing users to benefit from the knowledge of others.
To move forward in the use of Topic Maps for these kinds of applications, development of compelling visualisation techniques is a must. Fortunately the tools to build these visualisations are becoming readily available and standard home and business hardware is already capable of advanced visual display which would have been prohibitively expensive only three or four years ago.