- Jens Landmann
- Target audience
- AI Master
- Ability to read and understand papers written in English.
- Ability to perform academic writing.
- Strong programming skills (e.g. Java, essential)
- Attendance of the lectures Information Retrieval or Information Mining, and experience with tools such as RapidMiner (essential)
Social media platforms allow millions of internet users to easily create and share multimedia content. This generates a continuously increasing volume of big data that harbours the precious knowledge of the crowd. Much of this crowd wisdom is bundled up in arguments, i.e. claims that are supported or refuted by evidence. This evidential data could be used to answer questions, understand complex phenomena or evaluate services and products, if it were easily accessible. Current analytic tools, however, can only tell what users report in big data, not why.
This master project will contribute towards developing tools for the automatic extraction of relevant and reliable arguments from big data (e.g. news articles, comments and Wikipedia articles).
This project will involve the acquisition of data. Once the data is collected, it should be manually annotated for arguments, i.e. claims and the evidence for those claims should be marked. Finally, supervised machine learning should be used to extract arguments automatically. Here the student should investigate different features (attributes) as well as different machine learning algorithms. Each combination should be evaluated against the gold standard data (the manually annotated data).
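To make the notion of features (attributes) more concrete, the following is a minimal sketch of how a single sentence could be turned into a feature vector before it is passed to a learner. The class name `SentenceFeatures`, the cue word lists and the four feature names are assumptions made only for illustration; the project does not prescribe any particular feature set, and useful features would have to be identified from the literature and the annotated data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

class SentenceFeatures {
    // Hypothetical cue words, chosen only for illustration.
    static final Set<String> CLAIM_CUES =
            new HashSet<>(Arrays.asList("should", "must", "believe", "clearly"));
    static final Set<String> EVIDENCE_CUES =
            new HashSet<>(Arrays.asList("because", "since", "according", "study"));

    // Lower-cases the sentence and splits it on anything that is not an
    // ASCII letter or digit (a simplification that only suits English text).
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Maps one sentence to a small, named feature vector.
    static Map<String, Double> extract(String sentence) {
        List<String> tokens = tokenize(sentence);
        double claimCues = 0;
        double evidenceCues = 0;
        double hasNumber = 0;
        for (String t : tokens) {
            if (CLAIM_CUES.contains(t)) {
                claimCues++;
            }
            if (EVIDENCE_CUES.contains(t)) {
                evidenceCues++;
            }
            if (t.chars().anyMatch(Character::isDigit)) {
                hasNumber = 1; // binary flag: the sentence mentions a number
            }
        }
        Map<String, Double> features = new LinkedHashMap<>();
        features.put("tokenCount", (double) tokens.size());
        features.put("claimCueCount", claimCues);
        features.put("evidenceCueCount", evidenceCues);
        features.put("hasNumber", hasNumber);
        return features;
    }

    public static void main(String[] args) {
        System.out.println(
                extract("We should ban plastic bags because a 2015 study found harm."));
    }
}
```

For the example sentence in `main`, the sketch counts 11 tokens, one claim cue ("should") and two evidence cues ("because", "study"), and sets the number flag. Feature vectors of this kind, paired with the manual labels, form the training input for a supervised learner.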
- Literature scan. This should be done before the actual project starts. The student will be given some initial papers. Based on these, the student should collect further papers, review all of them and prepare a 30-minute oral presentation that introduces the field. This should take 2-3 weeks.
Actual work:
- Acquisition of data. This could be done either automatically or manually.
- Preprocessing of data. This can be done automatically using Natural Language Processing techniques.
- Manual annotation of the data. This means that the data collected above needs to be annotated for arguments.
- Performing the argument mining. The student should perform automatic feature extraction and apply machine learning to perform the argument extraction automatically. Results of the automatic system should be evaluated using precision and recall.
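The evaluation step above can be sketched as follows. The sketch assumes that each extracted argument component is identified by a string ID (for example a sentence ID) and that a prediction counts as correct only if the same ID occurs in the gold standard; the class name `ArgumentEvaluation` and the exact-match criterion are illustrative assumptions, not part of the project specification. Precision is the share of predicted items that are correct, and recall is the share of gold standard items that were found.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ArgumentEvaluation {
    // Number of predicted items that also occur in the gold standard.
    static int truePositives(Set<String> predicted, Set<String> gold) {
        Set<String> overlap = new HashSet<>(predicted);
        overlap.retainAll(gold);
        return overlap.size();
    }

    // Precision = TP / |predicted|; defined as 0.0 here if nothing was predicted.
    static double precision(Set<String> predicted, Set<String> gold) {
        if (predicted.isEmpty()) {
            return 0.0;
        }
        return (double) truePositives(predicted, gold) / predicted.size();
    }

    // Recall = TP / |gold|; defined as 0.0 here if the gold standard is empty.
    static double recall(Set<String> predicted, Set<String> gold) {
        if (gold.isEmpty()) {
            return 0.0;
        }
        return (double) truePositives(predicted, gold) / gold.size();
    }

    public static void main(String[] args) {
        // Hypothetical sentence IDs that the system labelled as claims.
        Set<String> predicted = new HashSet<>(Arrays.asList("s1", "s2", "s3", "s4"));
        // Hypothetical sentence IDs that the annotators labelled as claims.
        Set<String> gold = new HashSet<>(Arrays.asList("s2", "s3", "s5"));

        System.out.println("precision = " + precision(predicted, gold));
        System.out.println("recall    = " + recall(predicted, gold));
    }
}
```

In this made-up example two of the four predictions are correct, so precision is 0.5, and two of the three gold items are found, so recall is 2/3. Returning 0.0 for an empty set is one possible convention; the report should state which convention is used.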