Universität Duisburg-Essen
Home page of the Information Systems working group

MIND

Resource Selection and Data Fusion for Multimedia International Digital Libraries


Duration:
From 01.02.2001 until 31.12.2003
Contact Persons:
Involved Persons:
Sponsored by:
  • EU FP5
Reference number:
IST-2000-26061, 0415053 (Dortmund), 15311571 (Duisburg)
Participating Institutions:

This research addresses problems associated with the emergence of thousands of heterogeneous multimedia digital libraries distributed internationally on multiple platforms. Users have problems with resource selection because they are unaware of the contents of each individual library in terms of quantity, quality, information type, provenance and likely relevance. Once a set of relevant libraries has been selected, the user must organise and interpret the information in a common format and environment. Typically this is done through visual evaluation and ad hoc integration, which forces users to restrict their attention to a small subset of the information retrieved.

MIND will help users know where to search, how to query different media, and how to combine information from diverse sources.

The University of Dortmund is responsible for three subtasks:

  1. Resource selection: The basis was the decision-theoretic framework [Fuhr:99b] developed by Dortmund. Each database is assigned costs covering retrieval quality, communication time and monetary costs. Given a query that includes the number of documents to retrieve, the task is to compute, for every database, the number of documents to retrieve from that database; for efficiency, this number should be zero for most databases. The per-database numbers must sum to the user-specified number of documents, and the overall costs should be minimised.

    This model was extended in MIND [Nottelmann/Fuhr:03a]. Major achievements are:

    • two new methods for estimating retrieval quality (simulated retrieval on a sample; assuming a normal distribution for the indexing weights)
    • modelling the relationship between the probability of inference (RSV) and the probability of relevance by a logistic (instead of a linear) function [Nottelmann/Fuhr:03e]
    • a first evaluation, showing quality comparable to CORI, the state-of-the-art resource selection method
    • extension towards different data and media types besides text [Nottelmann/Fuhr:03c]
    • integration of CORI into the decision-theoretic framework
  2. Heterogeneity: The existing databases differ in the content and structure (schema [Fuhr:99]) of their documents (e.g., some distinguish between "editor" and "author"). Thus, the user query, which is specified against a global schema, must be translated for every database into a query that fits that database's schema.

    This basic idea was extended and implemented within MIND [Nottelmann/Fuhr:03b]. Major achievements are:

    • modelling MIND queries and documents in DAML+OIL
    • defining uncertain schema mapping rules in Probabilistic Datalog
    • transforming these rules into XSLT stylesheets
    • implementation of this approach
    • a first approach for learning the uncertain logical rules from examples [Nottelmann/Fuhr:01]
  3. Media type "facts": The project MIND covered four media types: text, images, facts (e.g. author names, numbers) and the transcripts of speech recognition. Dortmund was responsible for "facts".

    In most areas, handling facts is the same as handling "ordinary" text. The significant differences lie in resource selection. Thus, we extended our decision-theoretic framework so that it can also estimate costs for several factual datatypes [Nottelmann/Fuhr:03c].
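
The resource selection task from item 1 can be illustrated with a small sketch. It is not the MIND implementation: the function name `select_resources`, the database names and the cost functions are hypothetical, and the only assumption is that each database exposes a cost function of the number of documents retrieved from it (with zero cost when nothing is retrieved). Under that assumption, a simple dynamic programme finds the split of the requested number of documents that minimises the summed costs.

```python
from typing import Callable, Dict, List, Tuple


def select_resources(cost_fns: Dict[str, Callable[[int], float]], n: int) -> Dict[str, int]:
    """Split n documents across databases so that the summed cost is minimal.

    cost_fns maps a (hypothetical) database name to a function giving the
    cost of retrieving s documents from that database.
    """
    infinity = float("inf")
    # best[k] = (minimal cost, allocation) for k documents over the databases seen so far
    best: List[Tuple[float, Dict[str, int]]] = [(0.0, {})] + [(infinity, {})] * n
    for name, cost in cost_fns.items():
        new_best = []
        for total in range(n + 1):
            candidates = []
            for s in range(total + 1):  # s documents come from the current database
                prev_cost, prev_alloc = best[total - s]
                if prev_cost == infinity:
                    continue  # the remaining documents cannot be covered yet
                candidates.append((prev_cost + cost(s), {**prev_alloc, name: s}))
            new_best.append(min(candidates, key=lambda c: c[0]))
        best = new_best
    return best[n][1]


# Illustrative cost functions only: a fixed cost for contacting a database
# plus a per-document cost, and no cost if the database is not queried.
example_costs = {
    "A": lambda s: 0.0 if s == 0 else 1.0 + 0.1 * s,
    "B": lambda s: 0.0 if s == 0 else 0.2 + 0.5 * s,
    "C": lambda s: 0.0 if s == 0 else 2.0 + 0.05 * s,
}

if __name__ == "__main__":
    print(select_resources(example_costs, 10))
```

Because contacting a database carries a fixed cost in this example, the minimal-cost allocation leaves most databases with zero documents, which matches the efficiency remark in item 1.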

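The query translation from item 2 can be sketched in a similar way. MIND expressed its uncertain mapping rules in Probabilistic Datalog and transformed them into XSLT stylesheets; the Python below only mimics the idea. The rule table, the attribute names, the probabilities and the choice to multiply a condition's weight by the rule probability are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A query condition: (attribute, search value, weight)
Condition = Tuple[str, str, float]

# Hypothetical uncertain mapping rules for one database:
# global attribute -> [(local attribute, probability that the mapping holds)]
Rules = Dict[str, List[Tuple[str, float]]]


def translate_query(query: List[Condition], rules: Rules) -> List[Condition]:
    """Rewrite conditions on the global schema into conditions on a local schema."""
    translated = []
    for attribute, value, weight in query:
        # Conditions without a mapping rule are dropped for this database.
        for local_attribute, probability in rules.get(attribute, []):
            translated.append((local_attribute, value, weight * probability))
    return translated


# Illustrative rules for a database that distinguishes "author" and "editor",
# while the global schema only knows "creator".
example_rules: Rules = {
    "creator": [("author", 0.7), ("editor", 0.3)],
    "title": [("title", 1.0)],
}

example_query: List[Condition] = [
    ("creator", "Fuhr", 1.0),
    ("title", "retrieval", 0.5),
]

if __name__ == "__main__":
    for condition in translate_query(example_query, example_rules):
        print(condition)
```

A single global condition can thus fan out into several weighted local conditions, one per applicable rule.
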
You can find the MIND publications of our group below. The official MIND web site also contains the publications of all project partners.


Publications

J. Callan; F. Crestani; H. Nottelmann; P. Pala; X. M. Shou (2003).
Resource Selection and Data Fusion in Multimedia Distributed Digital Libraries (poster). In SIGIR:03

H. Nottelmann; N. Fuhr (2003).
From uncertain inference to probability of relevance for advanced IR applications. In ECIR:03

H. Nottelmann; N. Fuhr (2003).
Evaluating different methods of estimating retrieval quality for resource selection. In SIGIR:03

H. Nottelmann; N. Fuhr (2003).
Combining DAML+OIL, XSLT and probabilistic logics for uncertain schema mappings in MIND. In ECDL:03

H. Nottelmann; N. Fuhr (2003).
Decision-theoretic resource selection for different data types in MIND. In SIGIR-DIR:03

H. Nottelmann; N. Fuhr (2003).
The MIND Architecture for Heterogeneous Multimedia Federated Digital Libraries. In SIGIR-DIR:03

H. Nottelmann; N. Fuhr (2003).
From Retrieval Status Values to Probabilities of Relevance for Advanced IR Applications. Information Retrieval 6(4)

H. Nottelmann; P. Pala (2003).
MIND: A Graphical User Interface for Presenting Fused Results from Multi-Media Distributed Digital Libraries (poster). In ECDL:03

N. Fuhr; C.-P. Klas (2001).
Combining RDF and Agent-Based Architectures for Semantic Interoperability in Digital Libraries. In DELOS-Interoperability:01

H. Nottelmann; N. Fuhr (2001).
Learning probabilistic Datalog rules for information classification and transformation. In CIKM:01

H. Nottelmann; N. Fuhr (2001).
MIND: An architecture for multimedia information retrieval in federated digital libraries. In DELOS-Interoperability:01


Talks

Norbert Fuhr (2003).
Multimedia Information Retrieval in Networked Digital Libraries. Talk at the Perspectives Seminar "Multimedia Retrieval", Dagstuhl

Henrik Nottelmann (2003).
Probabilistic logics for defining and using P2P service descriptions. QMIR Seminar, London


Diploma, master and bachelor theses

Available in German only.

Semiautomatisches Pflegen von Wrappern (Semi-automatic maintenance of wrappers)
Finished diploma thesis
Lernen unsicherer Regeln in HySpirit (Learning uncertain rules in HySpirit)
Finished diploma thesis

Related projects

DAFFODIL
Distributed Agents for User-Friendly Access of Digital Libraries
Pepper
Peer-to-Peer Architectures for Federated Search of Complex Digital Libraries

Notes

Our deliverables