Minutes FOCUS Kickoff meeting



Related projects / theses

QMW
UNIDO

Evaluation

The first thing to do is to establish a test collection of reasonable size. This makes it easier to define criteria for a (user-centred) evaluation. Then the evaluation criteria and the corresponding measures need to be defined.

Test collection

Unfortunately there are no standard test collections for SDR. In order to establish our own collection, the following candidates have been proposed:
IEEE CDROM
IEEE used to sell the contents of its DL on CDROM, with full texts in SGML (Todo 2000-11-15: KG evaluates)
Springer (scientific publisher)
Springer has some material available in SGML. (Todo 2000-11-15: NF contacts Springer)
Dissertation Online projects: Networked Digital Library of Theses and Dissertations (Ed Fox), Digitale Dissertationen der Humboldt-Universität zu Berlin
Both projects aim to provide full texts of dissertations on the web. At least the Humboldt project provides them in XML (about 100 dissertations are available at this time). (Todo 2000-11-15: NG evaluates)
Provides distance learning Materials. (Todo 2000-11-15: ML evaluates)
Smaller documents/collections
There are some smaller collections of XML/SGML structured documents: The Book of Mormon, The New Testament (KJV), The Old Testament (KJV), The Quran (trans. by M. H. Shakir), Plays of William Shakespeare, Linux Documentation Project. Further information on these collections (~ 10 MBytes) is available.

Evaluation criteria

With regard to information retrieval, users should benefit from structure being included in documents in at least three ways:
retrieval of most specific document portions
Instead of viewing documents as atomic units, structured document retrieval should aim at retrieving those portion(s) of a document which satisfy the user's information need most precisely. This can be a single node (subtree) in the document structure, several nodes or the whole document.
specify query conditions w. r. t. structure
In addition to specifying his/her information need in terms of conditions w. r. t. content, the user should also be able to specify conditions w. r. t. document structure.
appropriate presentation of results
Not only whole documents but also nodes (subtrees) in the document structure are valid responses to a retrieval query. More than one node of a given document might be relevant. Such dependencies between different parts of a retrieval result should be visualized to the user appropriately. While in document-oriented information retrieval
Evaluation of our SDR approaches should aim at rating how far they are able to contribute to these objectives.
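To make the first criterion concrete, the following is a minimal sketch of "retrieval of most specific document portions", not one of the approaches discussed at the meeting. It assumes plain Boolean matching (a node matches if its subtree contains all query terms) and a made-up toy document; the element names and the helper names `words` and `most_specific` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document; element names are illustrative only.
DOC = """
<article>
  <title>Structured document retrieval</title>
  <section>
    <title>Evaluation</title>
    <para>Test collections for XML retrieval are rare.</para>
    <para>Recall and precision assume atomic documents.</para>
  </section>
  <section>
    <title>Indexing</title>
    <para>Index nodes group parts of the XML tree.</para>
  </section>
</article>
""".strip()


def words(node):
    """Lower-cased set of words found anywhere in the subtree of node."""
    text = "".join(node.itertext()).lower().replace(".", " ")
    return set(text.split())


def most_specific(node, terms, path):
    """Paths of the deepest nodes whose subtree contains all query terms.

    Assumption: Boolean matching only; a real SDR system would rank nodes.
    """
    if not terms <= words(node):
        return []  # this subtree cannot satisfy the query
    hits = []
    counts = {}
    for child in node:
        # Position among same-tag siblings, to keep paths unambiguous.
        counts[child.tag] = counts.get(child.tag, 0) + 1
        child_path = f"{path}/{child.tag}[{counts[child.tag]}]"
        hits += most_specific(child, terms, child_path)
    # If no child matches on its own, this node is the most specific answer.
    return hits or [path]


root = ET.fromstring(DOC)
single_para = most_specific(root, {"xml", "retrieval"}, "/article")
whole_section = most_specific(root, {"recall", "xml"}, "/article")
whole_document = most_specific(root, {"indexing", "evaluation"}, "/article")
```

Depending on the query, the answer is a single paragraph, a whole section, or the whole document, which matches the three cases named in the first criterion.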

Evaluation measures

The evaluation criteria defined so far raise questions about appropriate relevance judgement processes and measures for these objectives.
appropriate relevance judgement processes
In traditional IR, the relevance of a document w. r. t. an information need is judged on a binary scale, i. e. relevant vs non-relevant. This does not look appropriate when a document node can be retrieved as a result alongside subnodes contained in it. In addition, the relevance (or non-relevance) of a document is traditionally judged independently of the relevance of other documents in the retrieval result. This seems problematic for document nodes which are related to each other in that they are parts of the same document. While complete documents can be assumed to be self-contained, this is not true for parts of them.
appropriate measurements
Typically the notions of recall and precision are used to rate retrieval systems and methods. These measures depend heavily on partitioning the collection into a class of relevant documents and a class of non-relevant documents. Dependencies between documents are not considered. In particular, only linear retrieval results can be measured, while a proper visualization of an SDR result will be at least two-dimensional.
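For reference, the traditional measures mentioned above can be sketched as follows. This is a minimal illustration with binary judgements over a linear ranking; the document identifiers and the function name `precision_recall` are made up for the example.

```python
def precision_recall(ranking, relevant, k=None):
    """Precision and recall of the top-k items of a linear ranking.

    ranking  -- list of retrieved item ids, best first
    relevant -- set of item ids judged relevant (binary judgements)
    k        -- cut-off; None means the whole ranking
    """
    retrieved = ranking if k is None else ranking[:k]
    hits = sum(1 for item in retrieved if item in relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical ranking and judgements.
ranking = ["d3", "d1", "d7", "d2", "d5"]
relevant = {"d1", "d2", "d4"}

full = precision_recall(ranking, relevant)          # 2 hits out of 5 retrieved
top2 = precision_recall(ranking, relevant, k=2)     # 1 hit out of 2 retrieved
```

Each item is counted on its own here. If the ranking contained a section and one of its paragraphs, both would be treated as independent items, which is exactly the dependency problem described above.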
These considerations lead to the following issues to be solved:

TREC Track on SDR?

Donna Harman asked NG to help establish a new TREC track on SDR. So far nothing seems to be fixed on the NIST side. Their first issue is to locate some reasonable data to work with. Later they might put a committee together to design a track.

Our (Focus') aims w. r. t. a TREC track are:

(Todo: ML, NG: contact Donna Harman and Ellen Voorhees and tell them about our ideas.)

SDR research

The discussion around a TREC track on SDR raised the question whether other groups are currently doing SDR. Here is the beginning of an attempt to list those people and groups: (Todo: all: send further pointers)

Next steps

The following steps are to be taken:

Deliverables

Refinement of the list of deliverables to be produced. Each deliverable should be in the form of a published paper.
Phase 1: Test collection
D 1.1 Test collection and user criteria (JR leads)
Phase 2: Approaches to SDR
D 2.1 The focussed retrieval approach (ML leads)
D 2.2 The index node approach (KG leads)
D 2.3 Visualisation and user interface (JR leads)
Phase 3: Implementation
D 3.1 Implementation and pilot study (with batch-oriented evaluation measures) (NG leads)
Phase 4: User-centred and system evaluation
D 4.1 User-centred Evaluation (JR leads, publication in HCI area)
D 4.2 Final report: System Evaluation (ML leads)

Visits

2000-11-06 - 2000-11-10
NG -> London (Meeting with RISO, UI discussion)
2000-12-?? - 2000-12-??
ML, GK -> Dortmund (Xmas market ;-)

Norbert Gövert,