
Work Part 3: Modelling the content of multimedia data

Work Part 3 - ``Modelling the content of multimedia data'' is organized into six main tasks.

A short description of WP3 follows.

WP3 - Objectives
The objective of WP3 ``Modelling the content of multimedia data'' is to provide a model of the semantic content of multimedia documents that is consistent with the theories developed in Work Parts 1 and 2. These investigations will be limited to three media which are considered of primary importance, and paradigmatic for multimediality as a whole: text, images and graphics. The investigation will mainly focus on the specificities of each of these media, i.e. on the kind of information that a document conveys because of the nature of the medium in which it is expressed, because of the way users tend to retrieve documents expressed in this medium, or because of the way users tend to behave while retrieving them. In other words, Work Part 3 intends to provide an explicit model of the otherwise implicit knowledge embedded in documents expressed by means of various media, with a particular focus on the knowledge that users specifically rely on for retrieving such documents. A common feature of all these studies will be the consideration of uncertain information, due to the fact that indexing techniques may deliver uncertain recognition of patterns, and hence of semantic data. Each medium has its own peculiarities in the way raw information is structured, and hence its underlying semantics is itself peculiar. This structure is part of the expression of the semantic content of documents, and its investigation is therefore an integral part of this Work Part. In order to be properly integrated in the overall, logic-based retrieval model, all these ``partial'' models have to be integrated into a unified one that fits with the theories developed in Work Parts 1 and 2.

WP3 - Approach
This Work Part is tightly related to Work Part 1; a close collaboration will therefore be established concerning the design of the specific logic used for describing the semantic content and structural aspects of documents. The first step will be to fully investigate the peculiarities of each medium and to consider their impact as possible clues to be used for retrieval purposes. In order to gain general knowledge about users' needs and retrieval problems related to specific media (such as images), information will be collected from real users. The general formalism of conceptual graphs will also be adapted to the particular needs of representing the semantic content and the structural aspects of these media. This will provide a first model of multimedia documents that will greatly help in providing a global model based on logic and consistent with the theories developed in Work Parts 1 and 2. This intermediate model will also have to account for the notion of uncertain information (basically the probability that a given pattern has been recognized). It should also be clear that this activity of modelling multimedia data cannot be fully undertaken without considering the indexing techniques that could generate this modelling from raw data. Though the effort will not be devoted to the design of such techniques, ignoring this problem would undoubtedly lead to poor applicability of the proposed data model.

WP3 - Expected results
The main result that is expected from Work Part 3 is a model that combines the various aspects mentioned above for the three basic media that will be studied.

We now give a concise description of the objectives, approaches taken, and results expected from the six tasks which constitute this Work Part.

T31 - Objectives
The objective of T31 ``Modelling textual information'' is to identify and formalize the various features of textual documents which are relevant for retrieval purposes. Rather than the representation of the semantic content of a textual document, the objective here is to consider the text as part of a multimedia document, and hence to investigate the relationships between textual and non-textual information as indicated by the text itself (e.g. references, captions, etc.).

T31 - Approach
The researchers involved in T31 have accumulated considerable experience in text retrieval over the years, and this will be exploited here. Given that the fundamental problem here is to provide a knowledge representation of facts described by natural language sentences, the modelling activity will be based on noun phrase interpretations. This is a commonly accepted compromise between retrieval needs and state-of-the-art technology of automatic indexing of texts. Noun phrases provide precise expressions of complex concepts occurring in documents, and have a much greater expressive power than classical keywords. They are also linguistically manageable (though far from simple, considering syntactic and semantic ambiguities). As stated before, plans are to base this modelling activity on conceptual graphs, which are extremely apt for this purpose. Textual references to non-textual components will also be modelled in this way.

T31 - Expected results
The result expected from T31 is a conceptual-graph-based model for the semantic content of textual documents, based on knowledge embedded within noun phrases and links denoted by textual references.

T32 - Objectives
The objectives of T32 ``Modelling images'' are to identify classes of knowledge that are most relevant for retrieving images, to identify from what features of the images this knowledge may be inferred, and to provide a proper formalization for that knowledge. Because indexing methods for images may provide uncertain identification of their semantic content, the notion of uncertain information has to be included in this model.

T32 - Approach
Much less experience has been accumulated in the literature on image retrieval than on text retrieval. We plan to gain knowledge about users' needs and users' behaviour while retrieving images, by interacting with users of large image retrieval applications (such as ESA). An important research issue is to capture the kind of knowledge which may be considered as ``image specific'', hence useful for retrieving this kind of data. In other words, we think that users do not ``think'' about images they want to retrieve in the same way they would think about e.g. texts. It is this difference that has to be identified as precisely as possible. As stated before, we plan to base this modelling activity on extended conceptual graphs, as they seem particularly apt for this purpose. The general model of conceptual graphs will have to be extended so as to cope with uncertain information.

T32 - Expected results
The result expected from T32 is a conceptual-graph-based model for the semantic content of images.

T33 - Objectives
The objective of T33 ``Modelling graphics'' is to identify and formalize the various features of graphics which are relevant for retrieval purposes.

T33 - Approach
As for images, an important part of the activity of T33 will be to gain knowledge about users' needs in retrieving graphical information. Considering, for example, charts and arrays, one may notice that this information is often semantically associated with textual information which defines or completes the semantics of the document. This relationship has to be properly modelled. Classes of graphical objects will have to be defined and their peculiarities identified and semantically characterized. Similarly to the case of images, we will have to capture the kind of knowledge which may be considered as peculiar to the case of graphics, distinguishing it from general domain knowledge. Given that graphics is highly structured information, graphics modelling will rely heavily on available knowledge about standards, the problem here being to investigate to what extent a semantics can be assigned to these structural properties and used for retrieval purposes. Again, we plan to base this modelling on conceptual graphs, which are particularly apt for this purpose.

T33 - Expected results
The result expected from T33 is a conceptual-graph-based model for the semantic content of graphics.

T34 - Objectives
The objective of T34 ``Modelling structures'' is to accomplish a first synthesis of all the medium-specific structural aspects investigated in T31, T32 and T33, in order to provide a unifying view of this important aspect.

T34 - Approach
The approach taken in T34 will be based on an extended conceptual-graph-based model which integrates all the features identified before. The associated theory, based on the definition of concepts, classes, relations, lattices for concepts and relations, and operators, will be designed. As mentioned before, this model will have to include representational primitives for uncertain information, as uncertainty may also affect the structure itself. On the other hand, structures will most probably be viewed as semantic relations, with concepts as the arguments of these relations. A consequence is that this task will involve the design of a complete theory of extended conceptual graphs. This activity will be coordinated with WP1, as it will provide a specification for aspects related to the modelling of document content. It will also be coordinated with WP2 for aspects dealing with the notion of uncertain information.
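As an illustration only, the ingredients named above (concepts, a type hierarchy, relations over concepts, and uncertainty attached to recognized patterns) could be sketched as follows. This is a minimal sketch under stated assumptions, not the model the Work Part will deliver: the type names, the relation names, the `confidence` field, and the scoring rule (a product of confidences, which assumes independent recognitions) are all hypothetical. The hierarchy is a simple tree rather than a full lattice, and the matching checks each query relation separately instead of performing a true graph projection with consistent bindings.

```python
from dataclasses import dataclass, field

# Hypothetical concept-type hierarchy, given as child -> parent links.
# A tree is used here for brevity; the text calls for a lattice.
TYPE_PARENTS = {
    "Entity": None,
    "Document": "Entity",
    "Text": "Document",
    "Image": "Document",
    "Graphic": "Document",
    "Object": "Entity",
    "Person": "Object",
}


def is_subtype(t, ancestor):
    """Return True if t equals ancestor or lies below it in the hierarchy."""
    while t is not None:
        if t == ancestor:
            return True
        t = TYPE_PARENTS[t]
    return False


@dataclass(frozen=True)
class Concept:
    type: str
    referent: str = "*"  # "*" marks a generic (unnamed) individual


@dataclass(frozen=True)
class Relation:
    name: str
    args: tuple  # tuple of Concept instances
    confidence: float = 1.0  # assumed: probability the pattern was recognized


@dataclass
class Graph:
    relations: list = field(default_factory=list)


def concept_matches(query, doc):
    """A document concept matches if it specializes the query concept."""
    return is_subtype(doc.type, query.type) and query.referent in ("*", doc.referent)


def match_score(query, doc):
    """Score a query graph against a document graph.

    Every query relation must be matched by some document relation;
    the score is the product of the best matching confidences
    (an independence assumption made for this sketch only).
    """
    score = 1.0
    for q in query.relations:
        candidates = [
            d.confidence
            for d in doc.relations
            if d.name == q.name
            and len(d.args) == len(q.args)
            and all(concept_matches(qa, da) for qa, da in zip(q.args, d.args))
        ]
        if not candidates:
            return 0.0
        score *= max(candidates)
    return score


# A document whose image was indexed as containing a person with 0.8 confidence,
# and whose caption link (a structural relation) is certain.
doc = Graph([
    Relation("contains", (Concept("Image", "img1"), Concept("Person")), 0.8),
    Relation("caption_of", (Concept("Text", "t1"), Concept("Image", "img1")), 1.0),
])

# A more general query: any document containing any object.
query = Graph([Relation("contains", (Concept("Document"), Concept("Object")))])

print(match_score(query, doc))
```

In this sketch the structural link (`caption_of`) is represented in the same way as a content relation, which mirrors the idea that structures may be viewed as semantic relations over concepts.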

T34 - Expected results
The result expected from T34 is a model of the structural aspects of multimedia documents, based on the theory of extended conceptual graphs, that will encompass in a unifying view the models developed within T31, T32 and T33.

T35 - Objectives
The objective of T35 ``Integrated Multimedia Model'' is to design a logic that will encompass in a unifying view the models developed within Tasks T31-T34, and that is consistent with the theories developed in Work Parts 1 and 2.

T35 - Approach
This task will use the preliminary model designed in T34 as a formal specification, and will be undertaken in close collaboration with WP1, which aims in particular at modelling those aspects of the structure and semantic content of documents that are not specific to the multimedia case. The semantics and the syntax of the sought logic will have to reflect all the semantic properties stated as a specification by the intermediate model developed in T34. The approaches based on terminological logics that are proposed in WP1 seem a good starting point, though extensions will have to be made, at least to deal with the notion of uncertain information. As stated before, this is also related to the work of WP2.

T35 - Expected results
The result expected from T35 is a model of the semantic content of multimedia documents that is consistent with the theories developed within Work Parts 1 and 2.

T36 - Objectives
The objective of T36 ``Prototyping'' is to build a prototype implementation of the multimedia data model developed in T35.

T36 - Approach
This task will again have to be closely tied to the experiments planned for WP1. What we can foresee at the moment is that the underlying model of conceptual graphs for knowledge representation and manipulation will be of great use for supporting both the effective representation of multimedia documents and the inference processes that will be designed in WP1.

T36 - Expected results
The result expected from T36 is a prototype of the logic resulting from T35 that will be the subject of evaluation (in Work Part 5) against a multimedia document base of realistic size.

The participating (P) consortium members for each of the Tasks in Work Part 3 are listed in the following table.


