History and research topics
The IR group started in 1991, when Norbert Fuhr was appointed
Professor at the Computer Science Department of the University
of Dortmund. In 2002, the group moved to the University of
Duisburg-Essen.
The charter of the specialist IR group of the German
Informatics Society (GI) defines IR as a discipline that
deals with uncertainty and vagueness in all kinds of
information systems. Following this broad concept, our group is mainly interested in extending IR models
and methods for dealing with problems beyond the classical
text retrieval task. In particular, the combination of
concepts from IR and database systems is an ongoing theme of
our work, with applications such as relational
databases, multimedia information systems, distributed digital
libraries and XML documents.
As a theoretical background for these new types of applications, our
group combined Norbert Fuhr's earlier work on probabilistic IR
models with logic-based approaches. A major result of this
work was the development of probabilistic Datalog during the
ESPRIT project FERMI (1994-97), which
focused on retrieval methods for multimedia documents. Based
on this model, the retrieval engine HySpirit was implemented,
which offers flexible and efficient retrieval mechanisms even
for large data sets. (Thomas Rölleke, a former member of our
group, used HySpirit to found a startup company with the
same name in 1999.)
Besides multimedia systems, digital libraries (DLs) became an
important new application area for IR methods (with new
funding opportunities) in the mid-90s. Our group has been
active in this area since 1995, and today, IR methods for DLs
are the major focus of this group. Past and present work in
this area centers around four major themes:
-
Networked IR
-
In the projects Medoc
(1995-97), Interdoc (1998)
and MIND (2001-02), the group
worked on the development of new probabilistic models for
resource selection and result fusion, addressed the issue of
heterogeneity with respect to database schemas and retrieval methods,
and extended these approaches for retrieving multimedia
data. An alternative scheme was investigated in the CYCLADES project
(2001-03), where metadata from Open Archives are gathered in
a central server, which offers searching and browsing for
content-oriented subsets of the records collected. The current Pepper project (2003-06)
investigates the application of the various concepts in
peer-to-peer networks.
-
XML retrieval
-
Starting with the CARMEN project
(1999-2001) and continuing in the CLASSIX
project (2002-04), we are developing IR methods for XML documents. A major result
is the development of the query language XIRQL and its
implementation within the new retrieval engine HyREX. Current work
in this area focuses on interactive retrieval and clustering
of XML documents.
-
User-oriented retrieval methods
-
Based on the ideas of Bates et al., the DAFFODIL project (2000-04)
develops a new frontend for federated digital libraries that
supports high-level search activities in an adaptive and
proactive way.
-
Evaluation of DLs
-
Within the DELOS Network of
Excellence (2000-03, 2004-07), Norbert
Fuhr leads the evaluation work package aiming at the
development of evaluation methods and testbeds for digital
libraries. Part of these activities is the INEX
initiative for the evaluation of XML retrieval, which provides a
testbed for fulltext retrieval of XML documents.
See also the feature on our group
in the InformeR
1/2002 of the BCS IR Specialist
Group.