MIND

Resource Selection and Data Fusion for Multimedia International
Digital Libraries
- Duration:
- From 01.02.2001 until 31.12.2003
- Contact Persons:
- Involved Persons:
- Sponsored by:
- Reference number:
- IST-2000-26061, 0415053 (Dortmund), 15311571 (Duisburg)
- Participating Institutions:
This research addresses problems associated with the emergence
of thousands of heterogeneous multimedia digital libraries
distributed internationally on multiple platforms. Users have
difficulty selecting resources because they are unaware of the
contents of each individual library in terms of quantity,
quality, information type, provenance and likely
relevance. Once a set of relevant libraries has been selected,
the user must organise and interpret the information in a
common format and environment. Typically this is done
through visual evaluation and ad hoc integration, which forces
users to restrict their attention to a small subset of the
information retrieved.
MIND will assist users to know where to search, how to query
different media, and how to combine information from diverse
sources.
The University of Dortmund is responsible for three
subtasks:
Resource selection:
The basis was the decision-theoretic framework [Fuhr:99b]
(developed by Dortmund). Each database is assigned costs
(covering retrieval quality, communication time and monetary
costs). Given a query (which includes the number of documents
to retrieve), the task is to compute, for every database, the
number of documents to retrieve from that database. For
efficiency, this number should be zero for most databases.
The per-database numbers must sum to the user-specified
number of documents to retrieve, and the overall costs
should be minimised.
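The allocation problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MIND implementation: the names `select_resources` and `make_cost`, the shape of the cost functions (a fixed cost plus a per-document cost) and all numbers are hypothetical. The sketch uses a plain dynamic program, which works for arbitrary cost functions; the cited papers may use a different optimisation procedure.

```python
from typing import Callable, List, Sequence, Tuple

CostFn = Callable[[int], float]  # hypothetical: cost of retrieving s documents


def select_resources(cost_fns: Sequence[CostFn], n: int) -> Tuple[float, List[int]]:
    """Split n documents over the databases so that the summed cost is minimal."""
    inf = float("inf")
    # best[k]: minimal cost of retrieving k documents from the databases seen so far
    best = [0.0] + [inf] * n
    choices: List[List[int]] = []
    for cost in cost_fns:
        new_best = [inf] * (n + 1)
        choice = [0] * (n + 1)
        for k in range(n + 1):
            for s in range(k + 1):  # s documents from the current database
                if best[k - s] == inf:
                    continue
                total = best[k - s] + cost(s)
                if total < new_best[k]:
                    new_best[k], choice[k] = total, s
        best = new_best
        choices.append(choice)
    # Walk backwards to recover how many documents each database contributes.
    allocation = [0] * len(cost_fns)
    remaining = n
    for i in range(len(cost_fns) - 1, -1, -1):
        allocation[i] = choices[i][remaining]
        remaining -= allocation[i]
    return best[n], allocation


def make_cost(fixed: float, per_doc: float) -> CostFn:
    """Illustrative cost: nothing if the database is skipped, else fixed + per-document."""
    return lambda s: 0.0 if s == 0 else fixed + per_doc * s


if __name__ == "__main__":
    # Three made-up databases; the user asks for 4 documents in total.
    databases = [make_cost(1.0, 0.1), make_cost(0.5, 0.5), make_cost(5.0, 0.01)]
    print(select_resources(databases, 4))
```

With a fixed cost per contacted database, as assumed here, the minimum-cost allocation tends to assign zero documents to most databases, which matches the efficiency goal stated above.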
This model was extended in MIND [Nottelmann/Fuhr:03a].
Major achievements are:
- two new methods for estimating retrieval quality
(simulated retrieval on a sample; assuming a normal
distribution for the indexing weights)
- modelling the relationship between probability of
inference (RSV) and probability of relevance by a logistic
(instead of a linear) function [Nottelmann/Fuhr:03e]
- a first evaluation, showing quality comparable to
CORI, the state-of-the-art resource selection method
- extension to data and media types other than text
[Nottelmann/Fuhr:03c]
- integration of CORI into the decision-theoretic
framework
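To illustrate the difference between a linear and a logistic mapping from retrieval status values (RSVs) to probabilities of relevance, here is a minimal Python sketch. The parameter names `b0`, `b1`, `c0`, `c1` and the example values are assumptions made for illustration; in practice such parameters would be fitted on relevance data. Clipping the linear variant to [0, 1] is also an assumption of this sketch.

```python
import math


def linear_mapping(rsv: float, c0: float, c1: float) -> float:
    """Linear mapping, clipped so the result stays a valid probability (assumption)."""
    return min(1.0, max(0.0, c0 + c1 * rsv))


def logistic_mapping(rsv: float, b0: float, b1: float) -> float:
    """Logistic mapping: always strictly between 0 and 1, monotonic in the RSV for b1 > 0."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * rsv)))


if __name__ == "__main__":
    # Hypothetical parameters, chosen only to show the shape of both curves.
    for rsv in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(rsv, linear_mapping(rsv, 0.0, 1.0), logistic_mapping(rsv, -4.0, 6.0))
```

Unlike the linear function, the logistic function never needs clipping, so small and large RSVs still map to distinct probabilities.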
Heterogeneity:
The existing databases differ in the content and
structure (schema [Fuhr:99]) of their documents (e.g., some
distinguish between "editor" and "author"). Thus, the user
query (specified against a global schema) must be translated,
for every database, into a query that fits that database's
schema.
This basic idea was extended and implemented within MIND
[Nottelmann/Fuhr:03b]. Major achievements are:
- modelling MIND queries and documents in DAML+OIL
- defining uncertain schema mapping rules in
Probabilistic Datalog
- transforming the rules into XSLT stylesheets
- implementation of this approach
- a first approach for learning the uncertain logical
rules from examples
[Nottelmann/Fuhr:01]
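The idea of translating a query against a global schema by means of uncertain mapping rules can be sketched as follows. MIND expresses such rules in Probabilistic Datalog and XSLT; the Python dictionary below is only a stand-in. The rule probabilities, the attribute names, the function name `translate_query` and the choice to multiply the condition weight by the rule probability are all assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical rules: global attribute -> [(local attribute, mapping probability)]
MappingRules = Dict[str, List[Tuple[str, float]]]
Condition = Tuple[str, str, float]  # (attribute, value, weight)


def translate_query(query: List[Condition], rules: MappingRules) -> List[Condition]:
    """Rewrite each condition on a global attribute into conditions on local attributes."""
    translated: List[Condition] = []
    for attr, value, weight in query:
        # Conditions on attributes without a mapping rule are dropped.
        for local_attr, prob in rules.get(attr, []):
            # Assumption: the uncertainty of the rule scales the condition weight.
            translated.append((local_attr, value, weight * prob))
    return translated


if __name__ == "__main__":
    # A database that only knows "creator", while the global schema
    # distinguishes "author" and "editor" (made-up probabilities).
    rules: MappingRules = {"author": [("creator", 0.7)], "editor": [("creator", 0.3)]}
    print(translate_query([("author", "fuhr", 1.0)], rules))
```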
Media type "facts":
The MIND project covered four media types: text, images,
facts (e.g. author names, numbers) and speech recognition
transcripts. Dortmund was responsible for "facts".
In most areas, handling facts is the same as handling
"ordinary" text. The significant differences lie in
resource selection. Thus, we extended our
decision-theoretic framework so that it can also estimate
costs for several factual datatypes [Nottelmann/Fuhr:03c].
You can find the MIND publications of our group below. The
official MIND web site also contains the publications of all
project partners.
Publications
- J. Callan; F. Crestani; H. Nottelmann; P. Pala; X. M. Shou (2003).
- Resource Selection and Data Fusion in Multimedia Distributed Digital Libraries (poster). In SIGIR:03
- H. Nottelmann; N. Fuhr (2003).
- From uncertain inference to probability of relevance for advanced IR applications. In ECIR:03
- H. Nottelmann; N. Fuhr (2003).
- Evaluating different methods of estimating retrieval quality for resource selection. In SIGIR:03
- H. Nottelmann; N. Fuhr (2003).
- Combining DAML+OIL, XSLT and probabilistic logics for uncertain schema mappings in MIND. In ECDL:03
- H. Nottelmann; N. Fuhr (2003).
- Decision-theoretic resource selection for different data types in MIND. In SIGIR-DIR:03
- H. Nottelmann; N. Fuhr (2003).
- The MIND Architecture for Heterogeneous Multimedia Federated Digital Libraries. In SIGIR-DIR:03
- H. Nottelmann; N. Fuhr (2003).
- From Retrieval Status Values to Probabilities of Relevance for Advanced IR Applications. Information Retrieval 6(4)
- H. Nottelmann; P. Pala (2003).
- MIND: A Graphical User Interface for Presenting Fused Results from Multi-Media Distributed Digital Libraries (poster). In ECDL:03
- N. Fuhr; C.-P. Klas (2001).
- Combining RDF and Agent-Based Architectures for Semantic Interoperability in Digital Libraries. In DELOS-Interoperability:01
- H. Nottelmann; N. Fuhr (2001).
- Learning probabilistic Datalog rules for information classification and transformation. In CIKM:01
- H. Nottelmann; N. Fuhr (2001).
- MIND: An architecture for multimedia information retrieval in federated digital libraries. In DELOS-Interoperability:01
Talks
- Norbert Fuhr (2003).
- Multimedia Information Retrieval in Networked Digital Libraries. Talk at the Perspectives Seminar ``Multimedia Retrieval'', Dagstuhl
- Henrik Nottelmann (2003).
- Probabilistic logics for defining and using P2P service descriptions. QMIR Seminar, London
Diploma, master and bachelor theses
Only in German!
- Semiautomatisches Pflegen von Wrappern
- Finished diploma thesis
- Lernen unsicherer Regeln in HySpirit
- Finished diploma thesis
Related projects
- DAFFODIL
- Distributed Agents for User-Friendly Access of Digital Libraries
- Pepper
- Peer-to-Peer Architectures for Federated Search of Complex Digital Libraries
Notes
Our deliverables