Exploring topic-based language models for effective web information retrieval

Open Access
Authors
Publication date 2008
Book title Proceedings of the 8th Dutch-Belgian Information Retrieval Workshop (DIR 2008)
ISBN
  • 9789056812829
Event 8th Dutch-Belgian Information Retrieval Workshop (DIR 2008), Maastricht, the Netherlands
Pages (from-to) 65-71
Publisher Maastricht: University of Maastricht
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
The main obstacle to providing focused search is the relative opaqueness of search requests: searchers tend to express their complex information needs in only a couple of keywords. Our overall aim is to find out if, and how, topic-based language models can lead to more effective web information retrieval. In this paper we explore the retrieval performance of a topic-based model that combines topical models with other language models based on cross-entropy. We first define our topical categories and train our topical models on the .GOV2 corpus by building parsimonious language models. We then test the topic-based model on the TREC8 small Web data collection for ad hoc search. Our experimental results show that the topic-based model outperforms both the standard language model and the parsimonious model.
Document type Conference contribution
Published at http://riannekaptein.woelmuis.nl/2008/li-expl08.pdf