Creating semantically valid topic maps
Geir Ove Grønmo


Abstract
A topic map, like a database or XML document, provides a way of structuring information and as such requires methods for specifying the rules that describe the allowed states of the map.
This article discusses the need for a mechanism for defining constraints on topic maps, so that topic maps are semantically valid according to the intent of the topic map designer. As the article will show, the need for such a constraint system is real.
Examples of constraints are given in order to give the readers some ideas of how far such a constraint system could be taken.
Existing constraint mechanisms are mentioned, and their applicability for describing constraints on topic maps is discussed.

Contents
  1. Introduction
  2. What are constraints anyway?
  3. Validation
  4. Ontologies
  5. Roles
  6. Schemas - the powerful combination
  7. What's there today?
  8. What does the standard have to say about constraints?
  9. Which objects are subjects for the constraint mechanism?
    1. Associations
      1. Examples
    2. Topics
      1. Examples
    3. Names
      1. Examples
    4. Occurrences
      1. Examples
    5. Scopes
      1. Examples
    6. Other types of constraints
  10. Other constraint mechanisms
  11. How to apply constraints?
  12. How to best describe constraints?
  13. Conclusion

Introduction
The recent publication of the topic maps standard has attracted a lot of interest. This is not surprising, since it is a very powerful standard that allows you to create navigational structures that haven't been possible in a standardized way before.
As people start designing their own topic maps, they will soon realize that it is easy to lose control of a map's consistency. This is mainly because topic maps easily become rather complex.
Creating and maintaining topic maps without the help of tools introduces the possibility of inconsistency.
Reasonably enough, the standard has only a very limited set of mechanisms for defining constraints on the map. Almost none of them can be used to constrain the semantics defined by the user. The main focus of this article is constraint mechanisms for user-defined semantics.
An example of user-defined semantics: associations of type contains must have a container role and a containee role to be meaningful.
Note that this article does not focus on the syntactic representation of topic map constraint mechanisms.
What are constraints anyway?
Constraints are contracts agreed upon by the suppliers of a service and the receivers of the result of that service. The contract defines the conditions under which the service will be provided and specifies the result of the service, given that the conditions are fulfilled.
Another definition of topic map constraints could be: a restriction on one or more of the property values of nodes in the topic map grove.
In the context of topic maps this means that the designer of a topic map ontology and the topic map editor agree upon the rules that define what constitutes a valid and consistent topic map.
Because inconsistencies are likely, the user needs something that helps her maintain consistency. Computers are good at this. A computer should be able to guide the user in the right direction and inform her whenever something is incorrect.
Validation
Constraints and validation are related. Validation is merely the act of checking the validity of objects according to a set of constraints. Validation is necessary for all kinds of sophisticated information processing. When designed correctly, a constraint system need not be a burden for the user; rather, it guides the user in designing correct topic maps.
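The relationship between constraints and validation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Topic and Constraint classes and the validate function are hypothetical and do not correspond to any topic map API. A constraint is modelled as a named predicate, and validation is simply the act of applying every constraint to every object and reporting what fails.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


# Hypothetical, minimal stand-in for a topic; not part of any standard API.
@dataclass
class Topic:
    id: str
    types: List[str]
    names: List[str]


# A constraint is modelled as a described predicate over a single topic.
@dataclass
class Constraint:
    description: str
    holds: Callable[[Topic], bool]


def validate(topics: Iterable[Topic], constraints: Iterable[Constraint]) -> List[str]:
    """Check every topic against every constraint and collect readable violations."""
    constraints = list(constraints)  # allow reuse across topics
    violations = []
    for topic in topics:
        for constraint in constraints:
            if not constraint.holds(topic):
                violations.append(f"{topic.id}: {constraint.description}")
    return violations


has_name = Constraint("must have at least one name", lambda t: len(t.names) > 0)
has_type = Constraint("must have at least one type", lambda t: len(t.types) > 0)

topics = [
    Topic("puccini", ["composer"], ["Giacomo Puccini"]),
    Topic("tosca", ["opera"], []),  # violates has_name
]
report = validate(topics, [has_name, has_type])
print(report)
```

Returning a list of messages rather than a single yes/no answer is a deliberate choice here: it lets a tool tell the user what is wrong, not only that something is wrong.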
Ontologies
All topic maps contain a set of topics that are privileged. These topics are the fundamental semantic building blocks of a topic map and they are the set of topics that other topics and their characteristics are built upon.
When designing topic maps you must give these topics extra attention, since they are so important for how the map is built. As you will soon see, the definition of these topics is also very important to a constraint system.
The set of privileged topics and their characteristics, including associations between them, is what we can call the topic map ontology.
Topic types, association types, occurrence types, facet types, facet value types and themes are examples of ontology topics. One could say that in some sense an ontology contains only abstract topics - topics that should only be used as types and themes in a real map.
Type hierarchies can be built by introducing supertype-subtype associations. This makes it possible for ontology topics and derived topics to inherit properties from each other.
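A minimal sketch of how a constraint system might follow such a type hierarchy is given below. The SUPERTYPES table and the function names are assumptions made for illustration only; in a real topic map the hierarchy would be read from supertype-subtype associations.

```python
from typing import Dict, Set

# Hypothetical supertype-subtype associations, stored as subtype -> direct supertypes.
SUPERTYPES: Dict[str, Set[str]] = {
    "opera": {"musical-work"},
    "musical-work": {"work"},
    "composer": {"person"},
}


def all_supertypes(topic_type: str) -> Set[str]:
    """Return every direct and indirect supertype of topic_type."""
    found: Set[str] = set()
    pending = [topic_type]
    while pending:
        current = pending.pop()
        for parent in SUPERTYPES.get(current, set()):
            if parent not in found:  # also stops the walk if the hierarchy has a cycle
                found.add(parent)
                pending.append(parent)
    return found


def is_kind_of(topic_type: str, expected: str) -> bool:
    """True if topic_type is expected itself or one of its subtypes."""
    return topic_type == expected or expected in all_supertypes(topic_type)


print(sorted(all_supertypes("opera")))
print(is_kind_of("opera", "work"), is_kind_of("composer", "work"))
```

With a helper like is_kind_of, a constraint written for the type work would automatically apply to topics of type opera as well, which is one way to read the inheritance of properties described above.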
Associations between ontology topics can be very powerful, since they can be used by inference engines.
It usually proves very valuable to put much effort into the details of the ontology design.
Designing a topic map ontology can in some ways be compared to defining the elements and attributes for SGML documents.
Roles
A short description of the roles that participate in the design and creation of a topic map is in order. Two roles are described: the designer and the editor. In practice these two roles may overlap or even be played by the same person.
The schema design is the responsibility of the topic map designer.
The designer should be an expert who knows the domain the ontology is supposed to cover. He must make it as hard as possible to create topic maps that are semantically invalid.
When the schema has been designed, topic maps that use the ontology can be created. This is where the editor takes over.
The editor must use the ontology to create new topic maps and make sure that the topic map objects adhere to the constraints defined for that ontology.
Schemas - the powerful combination
A topic map ontology combined with constraints is what we can call a topic map schema.
These are to some extent the same ingredients as in SGML/XML DTDs and XML schemas. The elements and the attributes define the ontology, while the content models and datatypes define the constraints.
The great thing about a schema is that it can function as the documentation for instances based on it. The schema could be all that is needed to understand how topic maps based on it are to be created.
Several other interesting possibilities can be derived from this, some of which are:
What's there today?
As mentioned earlier, the standard has very little to say about constraints on user semantics. There is almost nothing in the topic maps standard that helps the editor say how the objects in a topic map are to be interpreted, still less what constitutes valid or invalid use of them.
It is actually a good thing that no user semantic constraints are defined in the standard, since the number of possible semantics is in practice unlimited. The user semantics of topic maps depend on the universe of discourse.
In contrast to the SGML standard, the topic maps standard has no mechanism for applying constraints on ontologies. SGML has a special language for defining constraints on documents: DTDs (Document Type Definitions). There is no such thing defined for topic maps.
A topic map can be serialized in SGML format, but that doesn't help much. The constraints needed for topic maps are quite different from the ones that are definable for SGML documents.
The next section discusses the constraint mechanisms that are described in the topic maps standard.
What does the standard have to say about constraints?
The standard is explicit about the fact that it does not constrain the uses to which topic maps can be put. The following note is taken from the section about conformance:
NOTE 50 This International Standard constrains neither the uses to which topic maps can be put, nor the character of the processing that may be applied by a conforming application. This conformance clause is intended to guarantee that conforming topic maps can be understood to whatever degree conforming read-only applications are intended to understand them, and that the topic mapping information expressed using the topic map syntax will be preserved by conforming read/write applications (except to the extent that the users of read/write applications deliberately alter that information).
The constraints mentioned by the standard apply to topic maps and topic map objects in general. There are no constraints that apply to the semantics defined by the map designer.
The standard defines mostly syntactic constraints. A few semantic constraints are defined as well, but they are on a different level than those that are the focus of this article.
Following is a list of the types of constraints defined by the standard:
The list of constraints mentioned here is most likely not complete, but the intention is to show the types of constraints described by the standard.
It should also be noted that the standard sometimes is very explicit about the fact that it does not limit its uses. An example of this is information resources and their relations to topic map objects:
This International Standard imposes no constraints on the nature of information objects that can be specified as occurrences of topics, nor on the addressing notations used to reference such occurrences.
Which objects are subjects for the constraint mechanism?
Some topic map objects are more suitable subjects for constraints than others. The most important one is definitely the association.
The list of constraint types below is not complete. It is primarily intended as an introduction to the kinds of constraints that can be defined. The listed constraints are atomic and are meant to exist in combinations with operators and other constraints.
The examples are chosen arbitrarily, but they can hopefully make things clearer.
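How atomic constraints might be combined with operators can be sketched as follows. Everything here is an assumption made for illustration: associations are modelled as plain dictionaries, and the operator names all_of, any_of, negate and implies are hypothetical. The rule built at the end is the one mentioned in the introduction: associations of type contains must have a container role and a containee role.

```python
from typing import Any, Callable, Dict

# Hypothetical model: an association is a dict, a constraint is a predicate over it.
Association = Dict[str, Any]
Constraint = Callable[[Association], bool]


def all_of(*constraints: Constraint) -> Constraint:
    return lambda obj: all(c(obj) for c in constraints)


def any_of(*constraints: Constraint) -> Constraint:
    return lambda obj: any(c(obj) for c in constraints)


def negate(constraint: Constraint) -> Constraint:
    return lambda obj: not constraint(obj)


def implies(condition: Constraint, consequence: Constraint) -> Constraint:
    # "if condition then consequence" is the same as "not condition, or consequence"
    return any_of(negate(condition), consequence)


# Two atomic constraints.
def has_type(type_name: str) -> Constraint:
    return lambda obj: obj.get("type") == type_name


def has_role(role_name: str) -> Constraint:
    return lambda obj: role_name in obj.get("roles", {})


contains_rule = implies(
    has_type("contains"),
    all_of(has_role("container"), has_role("containee")),
)

good = {"type": "contains", "roles": {"container": "norway", "containee": "oslo"}}
bad = {"type": "contains", "roles": {"container": "norway"}}
other = {"type": "born-in", "roles": {}}
print(contains_rule(good), contains_rule(bad), contains_rule(other))
```

The rule says nothing about associations of other types, so the born-in association passes even though it has no roles at all.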
Associations
The association type is the primary starting point for describing a constraint for a set of associations. Associations of a given type are very likely to have strict rules for how they should be structured, especially for how the roles are to be combined.
Associations can be constrained using the following criteria:
Note that the ordering of association roles is not significant.
The goal must be that the association type, association role type and participating topics are combined so that they form a meaningful combination.
Examples
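As a minimal sketch, assuming a hypothetical rule table and hypothetical topic data, the following Python code checks that association type, role types and the types of the participating topics form an allowed combination. Roles are kept in a dictionary, so their ordering plays no part in the check.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Association:
    type: str
    roles: Dict[str, str]  # role type -> id of the topic playing that role


# Hypothetical data: the type of each topic, and the rule for one association type.
TOPIC_TYPES = {"norway": "country", "oslo": "city", "ibsen": "person"}
RULES = {"contains": {"container": "country", "containee": "city"}}


def check_association(assoc: Association) -> List[str]:
    """Return the problems found in one association; an empty list means valid."""
    rule = RULES.get(assoc.type)
    if rule is None:
        return []  # no constraint registered for this association type
    problems = []
    for role in sorted(rule.keys() - assoc.roles.keys()):
        problems.append(f"missing role '{role}'")
    for role in sorted(assoc.roles.keys() - rule.keys()):
        problems.append(f"unexpected role '{role}'")
    for role, player in sorted(assoc.roles.items()):
        expected = rule.get(role)
        if expected is not None and TOPIC_TYPES.get(player) != expected:
            problems.append(f"role '{role}' must be played by a {expected}, not {player}")
    return problems


valid = Association("contains", {"container": "norway", "containee": "oslo"})
wrong_player = Association("contains", {"container": "norway", "containee": "ibsen"})
incomplete = Association("contains", {"container": "norway"})
for assoc in (valid, wrong_player, incomplete):
    print(check_association(assoc))
```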
Topics
The topic type is the primary starting point for describing a constraint for a set of topics. Topics of a given type are very likely to have strict rules for how the characteristics of those topics are put together.
Topics can be constrained using the following criteria:
Note that the ordering of characteristic assignments is not significant.
Examples
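One way to picture a topic constraint is as a set of cardinality rules on the characteristics of topics of a given type. The sketch below assumes a hypothetical Topic class and a hypothetical rule for topics of type person: at least one name, at most one biography occurrence, and between one and three portrait occurrences. None of these names or limits come from the standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Topic:
    id: str
    type: str
    names: List[str] = field(default_factory=list)
    # occurrence type -> list of resource locators
    occurrences: Dict[str, List[str]] = field(default_factory=dict)


# Hypothetical rule table: topic type -> (minimum names, {occurrence type: (min, max)}).
RULES: Dict[str, Tuple[int, Dict[str, Tuple[int, int]]]] = {
    "person": (1, {"biography": (0, 1), "portrait": (1, 3)}),
}


def check_topic(topic: Topic) -> List[str]:
    """Return the problems found in one topic; an empty list means valid."""
    if topic.type not in RULES:
        return []
    min_names, occurrence_rules = RULES[topic.type]
    problems = []
    if len(topic.names) < min_names:
        problems.append(f"needs at least {min_names} name(s)")
    for occ_type, (low, high) in occurrence_rules.items():
        count = len(topic.occurrences.get(occ_type, []))
        if not low <= count <= high:
            problems.append(f"'{occ_type}' occurrences: {count}, expected {low} to {high}")
    return problems


ibsen = Topic("ibsen", "person", ["Henrik Ibsen"], {"portrait": ["ibsen.jpg"]})
anonymous = Topic("anon", "person")
print(check_topic(ibsen))
print(check_topic(anonymous))
```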
Names
Topic names are containers for sets of base names, display names and sort names.
The base, display and sort names are basically strings, so they can be constrained in the same way as any other string.
Names can therefore be constrained using the following criteria:
Examples
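Since names are basically strings, ordinary string checks apply. The sketch below shows two such checks, a length limit and a pattern. The limit of 40 characters and the pattern for sort names are assumptions chosen only for illustration.

```python
import re
from typing import List

# Hypothetical limits, chosen only for illustration.
MAX_LENGTH = 40
SORT_NAME_PATTERN = re.compile(r"[a-z0-9 ,]+")  # lowercase letters, digits, spaces, commas


def check_base_name(name: str) -> List[str]:
    """Length-based checks on a base name."""
    problems = []
    if not name.strip():
        problems.append("base name must not be empty")
    if len(name) > MAX_LENGTH:
        problems.append(f"base name exceeds {MAX_LENGTH} characters")
    return problems


def check_sort_name(name: str) -> List[str]:
    """Pattern-based check: the whole sort name must match the pattern."""
    if SORT_NAME_PATTERN.fullmatch(name):
        return []
    return ["sort name may only contain lowercase letters, digits, spaces and commas"]


print(check_base_name("Henrik Ibsen"))
print(check_base_name("   "))
print(check_sort_name("ibsen, henrik"))
print(check_sort_name("Ibsen, Henrik"))
```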
Occurrences
Occurrences could be restricted by what kind of information resources they are locating and how those resources are addressed.
Examples
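A minimal sketch of such a restriction is shown below, assuming that resources are addressed by URL. The allowed schemes and the table of file extensions per occurrence type are hypothetical; a real system might inspect the media type of the resource instead of its extension.

```python
import posixpath
from typing import Dict, List, Set
from urllib.parse import urlparse

# Hypothetical restrictions on addressing and on the kind of resource.
ALLOWED_SCHEMES: Set[str] = {"http", "https"}
ALLOWED_EXTENSIONS: Dict[str, Set[str]] = {
    "portrait": {".jpg", ".png"},
    "biography": {".html", ".xml"},
}


def check_occurrence(occurrence_type: str, locator: str) -> List[str]:
    """Return the problems found in one occurrence; an empty list means valid."""
    problems = []
    parsed = urlparse(locator)
    if parsed.scheme not in ALLOWED_SCHEMES:
        problems.append(f"scheme '{parsed.scheme}' is not allowed")
    extension = posixpath.splitext(parsed.path)[1].lower()
    allowed = ALLOWED_EXTENSIONS.get(occurrence_type)
    if allowed is not None and extension not in allowed:
        problems.append(f"'{occurrence_type}' resources must end in one of {sorted(allowed)}")
    return problems


print(check_occurrence("portrait", "http://example.org/ibsen.jpg"))
print(check_occurrence("portrait", "ftp://example.org/ibsen.gif"))
```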
Scopes
All topic characteristics (associations, names and occurrences) can be constrained by their scope.
Scopeable objects can be constrained using the following criteria:
Examples
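A scope can be treated as a set of themes, which makes set operations a natural way to express scope constraints. The sketch below assumes two hypothetical rules: every theme in a scope must be declared in the ontology, and a name must be scoped by exactly one language theme. The theme names are invented for illustration.

```python
from typing import FrozenSet, List

# Hypothetical themes declared in the ontology.
LANGUAGE_THEMES: FrozenSet[str] = frozenset({"english", "norwegian"})
KNOWN_THEMES: FrozenSet[str] = LANGUAGE_THEMES | {"biography", "formal"}


def check_name_scope(scope: FrozenSet[str]) -> List[str]:
    """Return the problems found in the scope of one name."""
    problems = []
    unknown = scope - KNOWN_THEMES
    if unknown:
        problems.append(f"unknown themes: {sorted(unknown)}")
    if len(scope & LANGUAGE_THEMES) != 1:
        problems.append("a name must be scoped by exactly one language theme")
    return problems


print(check_name_scope(frozenset({"english", "formal"})))
print(check_name_scope(frozenset({"english", "norwegian"})))
print(check_name_scope(frozenset({"latin"})))
```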
Other types of constraints
Other constraint mechanisms
The following constraint mechanisms are some of the contestants for becoming a mechanism for specifying constraints on topic maps:
Note that XML Schema is not on this list, since it does not support the kinds of constraint mechanisms needed for topic maps. This criticism is mainly directed at its lack of inter-object constraints.
How to apply constraints?
The validation procedure normally consists of two steps. These are:
How to best describe constraints?
The author's experience indicates that a complete programming language is needed to express every kind of constraint that could be required on topic maps.
The most flexible solution is to use an existing programming language and describe constraints in terms of the object model API of the topic map system.
Even though this would mean that a programming language was involved, user interfaces could easily be created on top of it, so that anybody, not just programmers, would be able to describe constraints.
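To give an impression of what a constraint written against an object model could look like, here is a sketch in Python. The TopicMap, Topic and Association classes are hypothetical stand-ins, not the API of any existing topic map system. The rule spans several objects at once: every topic of type city must play the containee role in at least one contains association.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Topic:
    id: str
    type: str


@dataclass
class Association:
    type: str
    roles: Dict[str, str]  # role type -> id of the topic playing that role


@dataclass
class TopicMap:
    topics: List[Topic]
    associations: List[Association]

    def players(self, association_type: str, role: str) -> Set[str]:
        """Ids of all topics playing the given role in associations of the given type."""
        return {
            a.roles[role]
            for a in self.associations
            if a.type == association_type and role in a.roles
        }


def uncontained_cities(tm: TopicMap) -> List[str]:
    """Inter-object rule: each 'city' must be the containee of some 'contains' association."""
    contained = tm.players("contains", "containee")
    return [t.id for t in tm.topics if t.type == "city" and t.id not in contained]


tm = TopicMap(
    topics=[Topic("norway", "country"), Topic("oslo", "city"), Topic("bergen", "city")],
    associations=[Association("contains", {"container": "norway", "containee": "oslo"})],
)
print(uncontained_cities(tm))
```

Because the rule has to look at topics and associations together, it is hard to state as a restriction on a single object, which is the kind of requirement that pushes towards a full programming language.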
80/20 solutions can be created by removing the most complex constraint requirements. This would make the language much simpler, and it would no longer have to be a complete programming language.
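As a sketch of what such a reduced language might look like, the code below parses rules of the invented form "association type requires role, role" and checks associations against them. The syntax is purely hypothetical and covers only one kind of constraint, which is exactly the trade-off of an 80/20 solution.

```python
from typing import Dict, Iterable, List, Set


def parse_rules(text: str) -> Dict[str, Set[str]]:
    """Parse lines of the hypothetical form '<association type> requires <role>, <role>'."""
    rules: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        assoc_type, separator, roles = line.partition(" requires ")
        if not separator:
            raise ValueError(f"cannot parse rule: {line!r}")
        rules[assoc_type.strip()] = {r.strip() for r in roles.split(",") if r.strip()}
    return rules


def missing_roles(rules: Dict[str, Set[str]], assoc_type: str, roles_present: Iterable[str]) -> List[str]:
    """Roles required by the rule for assoc_type that are absent from roles_present."""
    return sorted(rules.get(assoc_type, set()) - set(roles_present))


RULE_TEXT = """
# hypothetical rule file
contains requires container, containee
born-in requires person, place
"""

rules = parse_rules(RULE_TEXT)
print(missing_roles(rules, "contains", ["container"]))
print(missing_roles(rules, "born-in", ["person", "place"]))
```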
Early experiments show that topic maps themselves can be used for creating a constraint mechanism. This mechanism is based on a templating technique that constrains the instances of those templates.
Conclusion
A constraint system that is able to define constraints on topic maps is sorely needed. Before deploying real-world topic maps it is very useful, if not necessary, to be able to make sure that their semantics really are the ones you intended.
Let's hope that the topic map community is able to come together to agree upon a language for defining constraints. Without it we would end up with a lot of different and incompatible languages. That would be very unfortunate and probably limit the interchangeability of topic maps.