Universität Duisburg-Essen

HyREX

Hyper-media Retrieval Engine for XML



XML is an emerging standard for representing knowledge in almost arbitrary applications, and nearly every kind of knowledge can be represented in it. Exploring such knowledge requires a search engine that lets users benefit from all of the concepts XML provides.

HyREX is the Hyper-media Retrieval Engine for XML [Abolhassani/etal:02]. The HyREX project is an ongoing effort (funded as part of other projects, e.g. CARMEN, CYCLADES, and CLASSIX) to develop an information retrieval engine for XML documents. HyREX's main characteristics can be derived from the constituents of its name:

hyper
HyREX offers explicit and implicit links to the user. Explicit links are specified within the documents, usually by means of XML linking standards such as XLink and XPointer. Implicit links are intrinsic to information structures which HyREX derives from XML document collections.
media
HyREX offers search facilities for text and, at least conceptually, for other media as well.
retrieval engine
HyREX allows users to explore all kinds of information structures available through XML. Besides retrieval in XML documents, it supports browsing and searching the domains of attributes of XML documents, as well as schema information given, for example, by the DTD of a document collection.
XML
HyREX allows retrieval under consideration of content and structure inherent in XML documents.
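
As an illustration of two of the information structures named above, the following minimal Python sketch collects explicit XLink references and the value domains of attributes from a small XML document. It is not HyREX code: the sample document and the function names explicit_links and attribute_domains are assumptions made for this example, and only the Python standard library is used.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical sample collection, invented for this sketch.
SAMPLE = """<collection xmlns:xlink="http://www.w3.org/1999/xlink">
  <article id="a1" lang="en">
    <title>XML retrieval</title>
    <ref xlink:type="simple" xlink:href="a2.xml#intro"/>
  </article>
  <article id="a2" lang="de">
    <title>Anfragesprachen</title>
  </article>
</collection>"""


def explicit_links(root):
    """Return (source tag, target) pairs for every xlink:href attribute."""
    href = f"{{{XLINK_NS}}}href"
    return [(el.tag, el.attrib[href]) for el in root.iter() if href in el.attrib]


def attribute_domains(root):
    """Map each (element tag, attribute name) pair to the set of values seen."""
    domains = defaultdict(set)
    for el in root.iter():
        for name, value in el.attrib.items():
            domains[(el.tag, name)].add(value)
    return dict(domains)


root = ET.fromstring(SAMPLE)
links = explicit_links(root)
domains = attribute_domains(root)
```

A browsing interface could present domains[("article", "lang")] as the list of selectable values for a language condition. Implicit links and DTD-derived schema information are outside the scope of this sketch.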

Architecture

HyREX's architecture is similar to that of database management systems. Thus, there is a clear separation between the logical and the physical level. The physical layer HyPath deals with efficient access paths for retrieval, while the logical layer deals with the XIRQL query language. On top of these layers is HyGate, the user interface to HyREX applications.
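
The separation of logical and physical levels can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the HyPath implementation: the classes ScanPath and InvertedIndexPath and the function conjunctive_query are hypothetical names, and documents are reduced to plain strings.

```python
class ScanPath:
    """Physical level: answers term lookups by scanning every document."""

    def __init__(self, docs):
        self.docs = docs

    def lookup(self, term):
        return {i for i, text in enumerate(self.docs)
                if term in text.lower().split()}


class InvertedIndexPath:
    """Physical level: answers term lookups from a precomputed inverted index."""

    def __init__(self, docs):
        self.index = {}
        for i, text in enumerate(docs):
            for word in text.lower().split():
                self.index.setdefault(word, set()).add(i)

    def lookup(self, term):
        return self.index.get(term, set())


def conjunctive_query(path, terms):
    """Logical level: an AND query that does not depend on how lookups are done."""
    results = [path.lookup(t) for t in terms]
    return set.intersection(*results) if results else set()


# Hypothetical three-document collection.
DOCS = ["XML retrieval engine", "query language for XML", "inverted files"]
by_scan = conjunctive_query(ScanPath(DOCS), ["xml", "retrieval"])
by_index = conjunctive_query(InvertedIndexPath(DOCS), ["xml", "retrieval"])
```

Both access paths return the same result for the same logical query, so the choice between them can be made per application without changing the query.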

In the following we give a brief outline of the characteristics of the levels.

HyGate
  • User interface for searching and browsing
  • Query formulation assistant
  • Presentation of retrieval results
XIRQL
  • XML Information Retrieval Query Language
  • Extends XPath with IR capabilities
  • Weighted document content and query conditions
  • Ranking for search results
  • Powerful searching for any type of information
  • Relevance-oriented search
HyPath
  • Efficient access paths for content and structure
  • Application-specific selection of access paths
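
The idea of combining a structural path condition with weighted query terms and a ranked result list can be sketched as follows. This is a minimal Python illustration, not XIRQL syntax and not the HyREX weighting scheme: it uses the limited path support in the standard library's ElementTree, relative term frequency as an assumed weighting, and a hypothetical sample document and function names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document, invented for this sketch.
SAMPLE = """<book>
  <chapter><title>XML query languages</title>
    <para>XPath selects XML nodes; retrieval ranks them.</para></chapter>
  <chapter><title>Index structures</title>
    <para>Inverted files support efficient retrieval of XML.</para></chapter>
  <chapter><title>Appendix</title>
    <para>Acknowledgements.</para></chapter>
</book>"""


def term_weights(element):
    """Relative term frequencies over the element's full text content."""
    words = [w.strip(".,;:").lower() for w in "".join(element.itertext()).split()]
    counts = Counter(w for w in words if w)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def ranked_search(root, path, query):
    """Select elements by path (structure), then rank them by weighted terms (content)."""
    results = []
    for el in root.findall(path):
        weights = term_weights(el)
        score = sum(qw * weights.get(term, 0.0) for term, qw in query.items())
        if score > 0:
            results.append((score, el.findtext("title")))
    return sorted(results, reverse=True)


root = ET.fromstring(SAMPLE)
# Query terms carry weights, so "xml" counts more than "retrieval".
hits = ranked_search(root, ".//chapter", {"xml": 0.6, "retrieval": 0.4})
titles = [title for _, title in hits]
```

Elements that match the path but contain none of the query terms receive a score of zero and are dropped, so the result is a ranked list rather than a Boolean answer set.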

Publications

Mohammad Abolhassani; Norbert Fuhr; Saadia Malik (2004).
HyREX at INEX 2003. In INEX:04

M. Abolhassani; N. Fuhr (2004).
Applying the Divergence From Randomness Approach for Content-Only Search in XML Documents. In ECIR:04

N. Fuhr; K. Großjohann (2004).
XIRQL: An XML Query Language Based on Information Retrieval Concepts. ACM Transactions on Information Systems 22

N. Fuhr; K. Großjohann; S. Kriewel (2003).
A Query Language and User Interface for XML Information Retrieval.

Norbert Gövert; Norbert Fuhr; Mohammad Abolhassani; Kai Großjohann (2003).
Content-oriented XML retrieval with HyREX. In INEX:03

Mohammad Abolhassani; Norbert Fuhr; Norbert Gövert; Kai Großjohann (2002).
HyREX: Hypermedia Retrieval Engine for XML. Research Report, University of Dortmund, Department of Computer Science, Dortmund, Germany

Norbert Fuhr; Norbert Gövert; Kai Großjohann (2002).
HyREX: Hyper-media Retrieval Engine for XML. In SIGIR:02

Norbert Fuhr; Norbert Gövert (2002).
Index Compression vs. Retrieval Time of Inverted Files for XML Documents. In CIKM:02

Norbert Fuhr; Norbert Gövert (2002).
Index Compression vs. Retrieval Time of Inverted Files for XML Documents. Technical Report, University of Dortmund

N. Fuhr; K. Großjohann (2002).
XIRQL: An XML Query Language Based on Information Retrieval Concepts. (Submitted for publication)

K. Großjohann; N. Fuhr; D. Effing; S. Kriewel (2002).
Query Formulation and Result Visualization for XML Retrieval. In: Proceedings ACM SIGIR 2002 Workshop on XML and Information Retrieval, ACM

K. Großjohann; N. Fuhr; D. Effing; S. Kriewel (2002).
A User Interface for XML Document Retrieval. In: Informatik 2002

N. Fuhr; K. Großjohann (2001).
XIRQL: A Query Language for Information Retrieval in XML Documents. In SIGIR:01

Norbert Gövert (2001).
Bilingual Information Retrieval with HyREX and Internet Translation Services. In CLEF:01

K. Großjohann (2001).
Physical Algebra. Research Report, University of Dortmund

N. Fuhr; K. Großjohann; S. Kokkelink (2000).
CAP7 -- Searching and Browsing in Distributed Document Collections. In ECDL:00

N. Fuhr; K. Großjohann (2000).
XIRQL -- An Extension of XQL for Information Retrieval. In SIGIR/XML:00

N. Fuhr; N. Gövert; Th. Rölleke (1998).
DOLORES: A System for Logic-Based Retrieval of Multimedia Objects. In SIGIR:98

Norbert Fuhr (2002).
HyREX: A Hyper-media Retrieval Engine for XML. Talk at the workshop "Carmen--Next Steps" in Osnabrück, January 16-18 (in German).
[ PDF | PPT ]
Norbert Gövert, Kai Großjohann (2002):
HyREX Manual
Norbert Fuhr (2002):
XIRQL: A Query Language for Information Retrieval in XML Documents. Talk at the University of Freiburg, July 2002 (partly in German).
[ PPT ]
Norbert Fuhr (2002):
XIRQL: Eine Anfragesprache für Information Retrieval in XML-Dokumenten. Talk at Humboldt University Berlin, 2002.
[ PPT PDF ]

Talks

Mohammad Abolhassani; Norbert Fuhr; Saadia Malik (2003).
HyREX at INEX 2003. Talk at the INEX Workshop, Dagstuhl


Diploma, master and bachelor theses

Available only in German.

Optimierung der Prozessierung von XIRQL-Anfragen
Finished diploma thesis
Entwicklung von Suchprädikaten für technische Texte
Finished diploma thesis
Unterstützung von Nutzern bei der Erstellung von XIRQL-Anfragen
Finished diploma thesis
Effiziente Wahrscheinlichkeitsberechnung für Ereignisausdrücke
Finished diploma thesis
Entwicklung und Implementierung von Retrievalmethoden für strukturierte Dokumente
Finished diploma thesis
Visualisierung für Retrieval von XML-Dokumenten
Finished diploma thesis
Entwicklung einer Benutzerschnittstelle zu einem Hypertext-Information-Retrieval-System
Finished diploma thesis
Entwicklung einer Audio-Retrieval-Komponente für ein multimediales IR-System
Finished diploma thesis
Integration einer XML-Retrievalengine in Hyperwave
Finished diploma thesis
Entwurf, Realisierung und Evaluierung von linguistischen Suchprädikaten für HyREX
Finished diploma thesis
Effektive und effiziente Updates in invertierten Dateien
Finished diploma thesis
Realisierung physikalischer Unabhängigkeit in einem IR-System
Finished diploma thesis

Related projects

CARMEN WP 7
Content Analysis, Retrieval and Metadata: Effective Networking
Work Package 7: A Document Referencing and Linking System
CLASSIX
Classification and Intelligent Search on Information in XML
FOCUS
Focussed retrieval of structured documents
INEX
Initiative for the Evaluation of XML retrieval

Software

Latest release of the HyREX source code.

Testbeds

Test collections and prototype applications.