Semantic Cluster Analysis in Information Retrieval

From 01.07.2009 until 30.09.2014
Contact Persons:
Involved Persons:
Sponsored by:
  • DFG
Reference number:
  • DFG: FU 205/22-1
  • UDE: ka00043i
Participating Institutions:

Clustering methods combine an object model, a similarity metric and a fusion principle; current research focuses on the last of these.
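To make the three components concrete, here is a minimal sketch in Python. The specific choices are assumptions made for illustration only and are not taken from the project: a bag-of-words term-frequency vector as the object model, cosine similarity as the similarity metric, and single-link agglomerative merging as the fusion principle. All function names and the example documents are hypothetical.

```python
import math
from collections import Counter


def object_model(text):
    """Object model (assumed): bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Similarity metric (assumed): cosine of the angle between two vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_link_clustering(vectors, threshold):
    """Fusion principle (assumed): single-link agglomerative merging.

    Repeatedly merges the two clusters whose most similar pair of members
    has the highest similarity, until that similarity drops below threshold.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                sim = max(
                    cosine_similarity(vectors[i], vectors[j])
                    for i in clusters[x]
                    for j in clusters[y]
                )
                if best is None or sim > best[0]:
                    best = (sim, x, y)
        sim, x, y = best
        if sim < threshold:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical example documents.
docs = [
    "cluster analysis of search results",
    "search results cluster labeling",
    "football match report",
    "football league match",
]
vectors = [object_model(d) for d in docs]
clusters = single_link_clustering(vectors, threshold=0.3)
print(clusters)
```

Swapping any one of the three functions (for example, a different merging rule in place of single-link) changes the clustering method without touching the other two, which is the sense in which the components are combined.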

For more advanced problems, clustering can only be successful when the three elements are combined in a meaningful way and knowledge about both the analysis task and the user is taken into account. This principle of 'semantic clustering' should allow clustering problems in IR to be solved more efficiently and effectively than with current methods.

This project aims to investigate the theoretical, methodological and experimental aspects of this problem. Here, 'semantics' will play multiple roles:

  1. in the form of specialized retrieval models which consider knowledge about the IR task at hand,
  2. by integrating domain knowledge,
  3. as ensemble clustering, i.e. combining fusion methods,
  4. from the user when performing interactive or multi-clustering.
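The ensemble clustering role in the list above can be illustrated with a small sketch. It assumes one common way of combining clusterings, a co-association (pairwise agreement) scheme, which is not necessarily the approach taken in the project. Each input partition is imagined to come from a different fusion method; the function names, the agreement threshold and the example partitions are all hypothetical.

```python
from itertools import combinations


def co_association(partitions, n):
    """Fraction of partitions in which each pair of objects shares a cluster."""
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in partitions:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(partitions) for pair, c in counts.items()}


def consensus_clusters(partitions, n, min_agreement=0.5):
    """Link objects that co-occur in more than min_agreement of the partitions
    and return the connected components as consensus clusters."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (i, j), share in co_association(partitions, n).items():
        if share > min_agreement:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Three hypothetical partitions of five documents; cluster labels are
# arbitrary, so the third partition agrees with the first despite
# using swapped label values.
partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
consensus = consensus_clusters(partitions, n=5)
print(consensus)
```

Working on pairwise agreement rather than on raw labels avoids having to align cluster labels across the individual clusterings.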

Finally, semantics will form the basis for cluster labeling, which is currently the biggest challenge in document clustering.

Additional information:




Diploma, Master and Bachelor theses

Only in German!

Related projects

ezDL is a framework for interactive search systems


To conduct and evaluate user studies, an experimental environment has been developed. It is based on ezDL, the framework for interactive search systems, and uses AItools to cluster the documents of a retrieval result and present them to the user.