Filtering and clustering XML retrieval results

J. Kamps; M. Koolen; B. Sigurbjörnsson

doi:https://doi.org/10.1007/978-3-540-73888-6_13

Authors	J. Kamps; M. Koolen; B. Sigurbjörnsson
Publication date	2007
Host editors	N. Fuhr; M. Lalmas; A. Trotman
Book title	Comparative Evaluation of XML Information Retrieval Systems
Book subtitle	5th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2006, Dagstuhl Castle, Germany, December 17-20, 2006: revised and selected papers
ISBN	9783540738879
ISBN (electronic)	9783540738886
Series	Lecture Notes in Computer Science
Event	Comparative evaluation of XML information retrieval systems: 5th international workshop of the initiative for the evaluation of XML retrieval, INEX 2006
Pages (from-to)	121-136
Publisher	Berlin: Springer
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI); Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract	As part of the INEX 2006 Adhoc Track, we conducted a range of experiments with filtering and clustering XML element retrieval results. Our basic retrieval engine retrieves arbitrary elements from the collection (corresponding to the Thorough Task). These runs are filtered to remove textual overlap between elements (corresponding to the Focused Task). The resulting runs can be clustered per article (corresponding to the All in Context Task). Finally, we select the “best” element for each article (corresponding to the Best in Context Task). Our main findings are the following. First, a complete element index outperforms a restricted index based on section structure, although the differences are small. Second, grouping non-overlapping elements per article does not degrade performance and may improve scores. Third, all restrictions of the “pure” element runs (removing overlap, grouping elements per article, or selecting a single element per article) lead to some, but only moderate, loss of precision.
Document type	Conference contribution
Language	English
Published at	https://doi.org/10.1007/978-3-540-73888-6_13 (Final published version)
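
The four stages named in the abstract (Thorough, Focused, All in Context, Best in Context) can be illustrated with a minimal sketch. It is not the authors' implementation: the Hit record, the path-prefix overlap test, the greedy highest-score-first filtering, and the function names are all assumptions made for illustration, and the input is assumed to be a list already sorted by descending score.

```python
from __future__ import annotations

from typing import NamedTuple


class Hit(NamedTuple):
    """One retrieved XML element (hypothetical record layout)."""
    article: str   # article identifier
    path: str      # element path, e.g. "/article[1]/sec[2]/p[1]"
    score: float   # retrieval score, higher is better


def overlaps(a: Hit, b: Hit) -> bool:
    """True if both hits are in the same article and one element
    is the same as, or an ancestor of, the other."""
    if a.article != b.article:
        return False
    return (a.path == b.path
            or a.path.startswith(b.path + "/")
            or b.path.startswith(a.path + "/"))


def remove_overlap(ranked: list[Hit]) -> list[Hit]:
    """Focused-style run: walk the ranking top-down and keep a hit
    only if it does not overlap a hit that was already kept."""
    kept: list[Hit] = []
    for hit in ranked:
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return kept


def cluster_per_article(ranked: list[Hit]) -> list[Hit]:
    """All-in-Context-style run: group hits per article. Articles are
    ordered by their first (best) hit; order within an article is kept."""
    groups: dict[str, list[Hit]] = {}
    for hit in ranked:
        groups.setdefault(hit.article, []).append(hit)
    return [hit for hits in groups.values() for hit in hits]


def best_per_article(ranked: list[Hit]) -> list[Hit]:
    """Best-in-Context-style run: keep only the top hit of each article."""
    seen: set[str] = set()
    best: list[Hit] = []
    for hit in ranked:
        if hit.article not in seen:
            seen.add(hit.article)
            best.append(hit)
    return best
```

Under these assumptions, a Thorough-style ranking would be passed through remove_overlap and then cluster_per_article, while best_per_article can be applied directly to the ranking. Appending "/" before the prefix test keeps "sec[1]" from being treated as an ancestor of "sec[10]".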