
Work Part 2: A theory of uncertainty for information retrieval

Work Part 2 - ``A theory of uncertainty for information retrieval'' is organised as follows:

A short description of WP2 follows.

WP2 - Objectives
The goal of WP2 ``A theory of uncertainty for information retrieval'' is to develop an appropriate theory of uncertainty that accounts for the imprecision inherent in the information retrieval process, and that is consistent with the theories developed in Work Parts 1 and 3.

WP2 - Approach
In order to achieve the objective described above it is possible to follow two different but converging approaches: 1) start from a probability theory apt for information retrieval, and investigate its induced logic; 2) extend a non-probabilistic logic in a probabilistic sense. These two approaches, which will be investigated respectively in Tasks T21-T22 and T23, are hoped to converge to a unique logical theory for dealing with uncertainty in multimedia information retrieval.

The underlying motivation for the former approach is that in IR we now have a fairly good idea of how to model retrieval probabilistically; from this we should be able to derive a logic that supports probabilistic inference in IR. Conditionalisation plays a central part in probabilistic inference. It is a form of belief revision, where beliefs are assumed to be represented by probability functions. The present use of Bayesian conditionalisation in IR induces a weak logic, the C2 conditional logic, upon which many probabilistic IR models are based. We intend to investigate the use of other forms of conditionalisation, such as Jeffrey conditionalisation or imaging. However, it is not known what the nature of the induced logic will turn out to be under these last two forms of conditionalisation.
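To make the difference between Bayesian and Jeffrey conditionalisation concrete, the following minimal sketch applies both to a toy probability function over three possible worlds. The world names, the prior and the function names are illustrative assumptions only; they are not part of the theory to be developed in WP2.

```python
from fractions import Fraction


def bayes_conditionalise(prior, evidence):
    """Bayesian conditionalisation: P'(w) = P(w) / P(E) if w is in E, else 0."""
    mass = sum(p for w, p in prior.items() if w in evidence)
    if mass == 0:
        raise ValueError("evidence has zero prior probability")
    return {w: (p / mass if w in evidence else 0) for w, p in prior.items()}


def jeffrey_conditionalise(prior, partition, new_weights):
    """Jeffrey conditionalisation: P'(w) = sum_i q_i * P(w | E_i).

    `partition` is a list of disjoint sets E_i covering all worlds and
    `new_weights` holds the revised probabilities q_i of those sets.
    """
    posterior = {w: 0 for w in prior}
    for cell, q in zip(partition, new_weights):
        mass = sum(prior[w] for w in cell)
        if mass == 0:
            if q != 0:
                raise ValueError("cannot move probability onto a zero-mass cell")
            continue
        for w in cell:
            posterior[w] += q * prior[w] / mass
    return posterior


# Hypothetical prior over three worlds (an assumption for illustration).
prior = {"w1": Fraction(1, 2), "w2": Fraction(1, 4), "w3": Fraction(1, 4)}
partition = [{"w1", "w2"}, {"w3"}]

# Certain evidence: the true world lies in {w1, w2}.
bayes_posterior = bayes_conditionalise(prior, {"w1", "w2"})
# w1 -> 2/3, w2 -> 1/3, w3 -> 0

# Uncertain evidence: {w1, w2} is now believed only to degree 4/5.
jeffrey_posterior = jeffrey_conditionalise(
    prior, partition, [Fraction(4, 5), Fraction(1, 5)]
)
# w1 -> 8/15, w2 -> 4/15, w3 -> 1/5
```

When the revised weight of the evidence is 1, Jeffrey conditionalisation reduces to the Bayesian case, which is why it is usually regarded as a generalisation suited to uncertain evidence.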

The latter approach relies instead on the fact that the denotational semantics with which the logics investigated in Work Part 1 are endowed allows them to be extended so as to accept a probability measure, with the practical result that the various kinds of conditionalisation should easily be expressed in the resulting logic.

WP2 - Expected results
WP2 is expected to yield a theory of uncertainty for information retrieval. We will investigate the difference between the two logics resulting from Tasks T21-T22 and T23, and strive to produce a unified logic. The result is likely to be a logic which will combine suitably with probability theory (henceforth, we will refer to it as Probability Logic).

WP2 is further structured into Tasks T21 to T24. We now give a concise description of the objectives, approaches taken, and results expected from each of these tasks.

T21 - Objectives
The objective of T21 ``User-oriented relevance probability'' is to develop a theory for dealing with the imprecision introduced in the process by the user's presumably imprecise representation of his information need, and by the presumably imprecise evidence provided by user relevance feedback. The user-generated evidence is generally non-propositional, and comes about through interaction between a user and the information retrieval system.

T21 - Approach
The research will concentrate on investigating various forms of conditionalisation and belief revision. In particular the research will consist in investigating the use of non-Bayesian conditionalisation in order to develop a model for the treatment of uncertainty present in the revision of the user-perceived relevance of the document as resulting from the ``passage of experience''. Jeffrey's conditionalisation and the Dempster-Shafer theory of evidence seem to be two powerful tools to consider.
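As an illustration of how the Dempster-Shafer theory of evidence pools two bodies of uncertain evidence, the sketch below implements Dempster's rule of combination over a two-element frame. The frame (relevant / non-relevant), the two mass functions and all names are hypothetical, chosen only to show the mechanics; how user evidence would actually be encoded is a question for T21 itself.

```python
from fractions import Fraction
from itertools import product


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses of intersecting focal sets,
    discard the conflicting (empty-intersection) mass and renormalise."""
    combined = {}
    conflict = 0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0) + x * y
        else:
            conflict += x * y
    if conflict == 1:
        raise ValueError("the two bodies of evidence are totally conflicting")
    return {s: v / (1 - conflict) for s, v in combined.items()}


# Frame of discernment (assumed for illustration): relevant or non-relevant.
R = frozenset({"relevant"})
N = frozenset({"non-relevant"})
THETA = R | N  # mass on THETA expresses ignorance

# Hypothetical evidence from the initial query formulation ...
m_query = {R: Fraction(3, 5), THETA: Fraction(2, 5)}
# ... and from one round of relevance feedback.
m_feedback = {R: Fraction(1, 2), N: Fraction(1, 4), THETA: Fraction(1, 4)}

m_combined = dempster_combine(m_query, m_feedback)
# R -> 13/17, N -> 2/17, THETA -> 2/17
```

Unlike a single probability function, a mass function can leave part of the belief uncommitted (the mass on the whole frame), which is one reason the theory is a candidate for modelling imprecise user evidence.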

T21 - Expected results
The result of T21 is expected to be a probabilistic theory for handling the evidence provided by a user by means of an initial query formulation and/or by means of a process of evidence revision in the light of the response of the multimedia information retrieval system.

T22 - Objectives
The objective of T22 ``System-oriented relevance probability'' is to develop a theory of uncertainty that accounts for the system-perceived relevance of a document with respect to a query. This measure of relevance corresponds to the extent to which the entailment relation in the logic holds between a document and a query.

T22 - Approach
The research will start by considering the results already achieved by earlier probabilistic approaches to document indexing and matching. An attempt will then be made to extend these approaches so as to be able to handle multiple sources of evidence and to take into consideration dependency between documents or between representational primitives such as those identified by T11 and T12. Another direction in the investigation will be the use of conditionalisation by ``imaging'', which will enable the transfer of probability estimates among representational features according to their evaluated similarity.
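The transfer of probability performed by imaging can be sketched as follows. In this toy version each representational feature is an index term, and every term absent from the document passes its whole probability to the most similar term that is present. The terms, the prior and the similarity values are made-up assumptions, and the sketch shows plain imaging only; it is not the model to be developed in T22.

```python
from fractions import Fraction


def image_on(prior, present, similarity):
    """Imaging: each feature outside `present` transfers all of its
    probability to its most similar feature inside `present`."""
    if not present:
        raise ValueError("cannot image on an empty set of features")
    posterior = {t: 0 for t in prior}
    for t, p in prior.items():
        if t in present:
            target = t
        else:
            # sorted() makes the choice deterministic when similarities tie
            target = max(sorted(present), key=lambda v: similarity[(t, v)])
        posterior[target] += p
    return posterior


# Hypothetical prior probabilities of four index terms.
prior = {
    "image": Fraction(2, 5),
    "picture": Fraction(1, 5),
    "audio": Fraction(3, 10),
    "sound": Fraction(1, 10),
}

# Hypothetical similarity estimates between absent and present terms.
similarity = {
    ("picture", "image"): 0.9,
    ("picture", "sound"): 0.1,
    ("audio", "image"): 0.2,
    ("audio", "sound"): 0.8,
}

# The document is indexed by "image" and "sound" only.
posterior = image_on(prior, {"image", "sound"}, similarity)
# image -> 3/5, picture -> 0, audio -> 0, sound -> 2/5
```

In contrast with Bayesian conditionalisation, which rescales the probabilities of the surviving features proportionally, imaging moves probability along similarity links, so the outcome depends on how similarity is evaluated.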

T22 - Expected results
Task T22 is expected to provide a probabilistic theory which will enable an information retrieval system to evaluate, under uncertainty conditions, the relevance of a document with respect to a query.

T23 - Objectives
The objective of T23 ``Probabilizing a non-probabilistic logic'' is to develop an appropriate theory of uncertainty that accounts for the imprecision inherent in the information retrieval process, and to do so by an approach alternative to that adopted in Tasks T21 and T22.

T23 - Approach
The approach taken in T23 will be to start from the non-probabilistic logic identified in Work Part 1 and extend it by allowing the expression of a probability measure. Earlier work suggests that at least two different views of probability can be embedded into a non-probabilistic logic, i.e. ``probability as statistical information'' and ``probability as degree of certainty''. Together, these are deemed sufficient to express the various kinds of conditionalisation with which experiments in information retrieval modelling should be conducted.

T23 - Expected results
The result expected from T23 is a logic that combines the representational features identified in Work Part 1 for the representation of the structure and content of documents and queries, and the representational features needed to express both ``probability as statistical information'' and ``probability as degree of certainty''.

T24 - Objectives
The objective of T24 ``Prototyping'' is to build a prototypical implementation of algorithms for reasoning in the logics resulting from Tasks T21 to T23.

T24 - Approach
This task directly concerns the relationship between the theory developed within the three previously examined tasks and the efficiency of an information retrieval system based on this theory. The basic question addressed by this computational and prototyping task is whether a given theory of information retrieval possesses sufficiently good computational properties, such that it can be used as the formal basis of an information retrieval system. This will be a first evaluation of our theory, based on formal tools such as computability and complexity theory, which is to be understood as a prerequisite for any further consideration of the theory by other Work Parts. The probability logic developed through T21, T22 and T23 should take into account both forms of evidence described in T21 and T22 and combine them to produce a revised probability of relevance. In other words, the probability of relevance of a document to a user should be a logical function of the uncertainty associated with a document entailing a query and the uncertainty associated with the user's opinion of the evidence. The computational properties of this evaluation will be investigated, with the aim of classifying the probability logic into a computational class. A probabilistic algorithm will be derived from the logic and its implementation will be tested in a small IR prototype system.
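One possible way to write down the dependency described in words above is given below. The notation is purely illustrative and every symbol is an assumption: R stands for relevance, d for a document, q for a query, u for the user, e for the evidence supplied by the user, and f for the combining function, whose form is left open here and is precisely what the probability logic is meant to supply.

```latex
P(R \mid d, q, u) \;=\; f\bigl(\, P(d \rightarrow q),\; P_u(e) \,\bigr)
```

Here P(d \rightarrow q) denotes the system-oriented uncertainty of the document entailing the query (T22), and P_u(e) the user-oriented uncertainty attached to the evidence (T21).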

T24 - Expected results
The result expected from T24 is a prototypical implementation of algorithms for reasoning in the logics resulting from Tasks T21 to T23, which will be evaluated (in WP4) against a multimedia document base of realistic size.

The participating (P) consortium members for each of the Tasks in Work Part 2 are listed in the following table.


