Citation-Key:
Ernst/Fuhr:06
Title:
Generating Search Term Variants for Text Collections with Historic Spellings
Author(s):
Andrea Ernst-Gerlach
Norbert Fuhr
In:
Citation-Key:
ECIR:06
Title:
28th European Conference on Information Retrieval Research (ECIR 2006)
Editor(s):
Mounia Lalmas
Andy MacFarlane
Stefan M. Rüger
Anastasios Tombros
Theodora Tsikrika
Alexei Yavlinsky
Publisher:
Springer
In:
ECIR
Year:
2006

Year:
2006

Abstract:
In this paper, we describe a new approach for retrieval in texts with non-standard spelling, which is important for historic texts in English or German. For this purpose, we present a new algorithm for generating search term variants in historic orthography. By applying a spell checker to a corpus of historic texts, we generate a list of candidate terms, to which the contemporary spellings have to be assigned manually. From these assignments, our algorithm produces a set of probabilistic rules. The rule probabilities can then be used for ranking in the retrieval stage. An experimental comparison shows that our approach outperforms competing methods.
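
The variant generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the rule format (contemporary substring, historic substring, probability), the two example rules, their probabilities, and the scoring by multiplying rule probabilities are all assumptions made for illustration, as are the names `RULES`, `find_matches` and `generate_variants`. The sketch applies rules only at positions in the contemporary search term, so a replacement is never rewritten again.

```python
from itertools import combinations

# Hypothetical rules: (contemporary substring, historic substring, probability).
# Both the rules and the probabilities are invented for this example.
RULES = [
    ("t", "th", 0.3),
    ("ei", "ey", 0.2),
]


def find_matches(term, rules):
    """List every position in `term` where a rule's left-hand side occurs."""
    matches = []
    for modern, historic, prob in rules:
        start = term.find(modern)
        while start != -1:
            matches.append((start, start + len(modern), historic, prob))
            start = term.find(modern, start + 1)
    return matches


def generate_variants(term, rules, max_rules=2):
    """Return (variant, score) pairs, best score first.

    At most `max_rules` non-overlapping rule applications are combined.
    The score is the product of the applied rule probabilities, which is
    an assumption of this sketch, not a claim about the paper.
    """
    matches = find_matches(term, rules)
    variants = {}
    for k in range(1, max_rules + 1):
        for combo in combinations(matches, k):
            combo = sorted(combo)
            # Skip combinations whose matched spans overlap.
            if any(a[1] > b[0] for a, b in zip(combo, combo[1:])):
                continue
            pieces, pos, score = [], 0, 1.0
            for start, end, historic, prob in combo:
                pieces.append(term[pos:start])
                pieces.append(historic)
                pos = end
                score *= prob
            pieces.append(term[pos:])
            variant = "".join(pieces)
            # Keep the best score if several rule sets yield the same variant.
            variants[variant] = max(score, variants.get(variant, 0.0))
    return sorted(variants.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for variant, score in generate_variants("teil", RULES):
        print(variant, score)
```

For the query term `teil`, this yields the variants `theil`, `teyl` and `theyl`, in that order. A retrieval system could then search for each variant and weight its matches by the associated score. Case handling and learning the rules from the manually assigned spelling pairs are left out of the sketch.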
