Document clustering based on non-negative matrix factorization

Citation key:
Xu/etal:03
Title:
Document clustering based on non-negative matrix factorization
Author(s):
Wei Xu
Xin Liu
Yihong Gong
Publisher:
ACM Press
In:
SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval
In:
Citation key:
SIGIR:03
Title:
Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Editor(s):
Jamie Callan
Gordon Cormack
Charles Clarke
David Hawking
Alan Smeaton
Publisher:
ACM
In:
Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Year:
2003

Page(s):
267--273
Year:
2003

Abstract:
In this paper, we propose a novel document clustering method based on the non-negative factorization of the term-document matrix of the given document corpus. In the latent semantic space derived by the non-negative matrix factorization (NMF), each axis captures the base topic of a particular document cluster, and each document is represented as an additive combination of the base topics. The cluster membership of each document can be easily determined by finding the base topic (the axis) with which the document has the largest projection value. Our experimental evaluations show that the proposed document clustering method surpasses the latent semantic indexing and the spectral clustering methods not only in the easy and reliable derivation of document clustering results, but also in document clustering accuracies.
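
The clustering rule described in the abstract can be illustrated with a minimal sketch. It assumes a term-document matrix X with terms as rows and documents as columns. The abstract does not say how the factorization is computed, so the multiplicative-update rule of Lee and Seung for the squared-error objective is swapped in here as one common choice. The names `nmf` and `cluster_documents`, the toy matrix, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def nmf(X, k, n_iter=500, seed=0, eps=1e-9):
    """Approximate a non-negative matrix X (terms x docs) as U @ V.T.

    Assumption: multiplicative updates for the squared Frobenius error;
    the abstract does not specify the solver.
    """
    rng = np.random.default_rng(seed)
    n_terms, n_docs = X.shape
    U = rng.random((n_terms, k)) + eps  # columns are the base topics (axes)
    V = rng.random((n_docs, k)) + eps   # rows are per-document topic weights
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U) / (V @ (U.T @ U) + eps)
    return U, V

def cluster_documents(X, k, **kwargs):
    """Assign each document to the topic axis with the largest weight."""
    U, V = nmf(X, k, **kwargs)
    # Rescale so every topic axis has unit length; the product U @ V.T is
    # unchanged, and weights on different axes become comparable.
    norms = np.linalg.norm(U, axis=0) + 1e-12
    V = V * norms
    return V.argmax(axis=1)

# Hypothetical toy corpus: terms 0-2 occur only in documents 0-2,
# terms 3-5 only in documents 3-5.
X = np.array([
    [3, 2, 4, 0, 0, 0],
    [2, 3, 1, 0, 0, 0],
    [4, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 3, 4],
    [0, 0, 0, 3, 1, 2],
    [0, 0, 0, 1, 4, 3],
], dtype=float)

labels = cluster_documents(X, k=2)
print(labels)
```

On this toy matrix the first three documents should receive one label and the last three the other. Which numeric label each group gets depends on the random initialization, so only the grouping is meaningful.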