Target audience
  • AI Master's students
  • Ability to read and understand papers written in English.
  • Ability to perform academic writing.
  • Strong programming skills (e.g. Java; essential)
  • Information Mining lecture (essential)

Task description

On shopping sites like Amazon, users can write reviews about products. The reviews are usually short and can focus on various aspects of the product. For example, camera reviews might focus on aspects such as the battery life of the camera, the display, etc. For a handful of reviews it is straightforward to read them and get an understanding of which aspects they talk about. For each aspect, the reader also forms an opinion on whether the reviews are positive or negative, and finally decides whether to purchase the product based on all these opinions. However, when the number of reviews is huge, reading them all and getting the overall picture is time-consuming. It is therefore desirable to automate this procedure.

An automated process would have the following structure. First, all reviews are clustered by aspect, meaning that all reviews talking about the same aspect are grouped together. Next, each group is labeled with its aspect: the reviews in the group are analysed automatically to determine the aspect they talk about, and the result (the name of the aspect) is used to label the group. Finally, the opinions (positive or negative) are extracted for each aspect group and the results are presented to the user.

The focus of this project is on group/cluster labeling, i.e. generating the aspect by analysing the reviews within the cluster. It is assumed that the clusters of reviews already exist. The student should develop an approach that generates or extracts a good label for each cluster. This label will serve as the aspect of the cluster, e.g. the battery life of a camera.
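
To make the cluster labeling step more concrete, the following is a minimal baseline sketch in Java. It is not the approach the project prescribes, only one simple possibility: each cluster is labeled with the single term that occurs in the most of its reviews, weighted down when the term also appears in other clusters (a TF-IDF-like score). The class name `ClusterLabeler`, its methods, the stop-word list and the scoring formula are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical baseline: label each review cluster with its most distinctive term. */
final class ClusterLabeler {

    // Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
    private static final Set<String> STOP_WORDS = Set.of(
        "the", "a", "an", "is", "are", "was", "it", "and", "but", "of", "to", "in",
        "for", "on", "very", "too", "this", "that", "with", "after", "my", "i");

    /** Lower-cases a review and splits it into alphabetic tokens, dropping stop words. */
    static List<String> tokenize(String review) {
        List<String> tokens = new ArrayList<>();
        for (String t : review.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!t.isEmpty() && !STOP_WORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    /** Counts, for one cluster, how many of its reviews contain each term. */
    static Map<String, Integer> reviewFrequency(List<String> cluster) {
        Map<String, Integer> counts = new HashMap<>();
        for (String review : cluster) {
            // A set ensures each term is counted at most once per review.
            for (String term : new HashSet<>(tokenize(review))) {
                counts.merge(term, 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Returns one label per cluster, in the same order as the input clusters. */
    static List<String> label(List<List<String>> clusters) {
        List<Map<String, Integer>> perCluster = new ArrayList<>();
        Map<String, Integer> clustersContaining = new HashMap<>();
        for (List<String> cluster : clusters) {
            Map<String, Integer> counts = reviewFrequency(cluster);
            perCluster.add(counts);
            for (String term : counts.keySet()) {
                clustersContaining.merge(term, 1, Integer::sum);
            }
        }

        List<String> labels = new ArrayList<>();
        for (Map<String, Integer> counts : perCluster) {
            String best = "";          // an empty cluster keeps the empty label
            double bestScore = -1.0;
            for (Map.Entry<String, Integer> e : counts.entrySet()) {
                // Assumed score: in-cluster review count times a smoothed inverse cluster frequency.
                double icf = Math.log(1.0 + (double) clusters.size() / clustersContaining.get(e.getKey()));
                double score = e.getValue() * icf;
                // Ties are broken alphabetically so the result is deterministic.
                if (score > bestScore || (score == bestScore && e.getKey().compareTo(best) < 0)) {
                    best = e.getKey();
                    bestScore = score;
                }
            }
            labels.add(best);
        }
        return labels;
    }
}
```

For two toy clusters, one containing "The battery lasts two days.", "Battery drains too fast." and "Great battery life for a camera.", the other containing "The display is bright and sharp.", "Display scratches easily." and "Love the large display on this camera.", `ClusterLabeler.label` returns the labels `battery` and `display`. The sketch only produces single-word labels; multi-word aspects such as "battery life" would need phrase extraction or a similar extension, which is the kind of question the project is meant to address.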