Citation key:
Abdulmutalib/Fuhr:08
Title:
Language Models and Smoothing Methods for Collections with Large Variation in Document Length
Author(s):
Najeeb Abdulmutalib
Norbert Fuhr
In:
Citation key:
Tjoa/Wagner:08
Title:
19th International Workshop on Database and Expert Systems Applications (DEXA 2008), 1-5 September 2008, Turin, Italy
Editor(s):
A M. Tjoa
R. R. Wagner
Publisher:
IEEE Computer Society
In:
DEXA Workshops
Year:
2008

Page(s):
9-14

Abstract:
In this paper we present a new language model based on an odds formula, which explicitly incorporates document length as a parameter. Furthermore, a new smoothing method called exponential smoothing is introduced, which can be combined with most language models. We present experimental results for various language models and smoothing methods on a collection with large document length variation, and show that our new methods compare favorably with the best approaches known so far.
