Using parsimonious language models on web data

Authors
Publication date 2008
Host editors
  • S.H. Myaeng
  • D.W. Oard
  • F. Sebastiani
  • T.S. Chua
  • M.K. Leong
Book title ACM SIGIR 2008: 31st annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 20-24, 2008, Singapore: Proceedings
ISBN
  • 9781605581644
Event 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2008), Singapore
Pages (from-to) 763-764
Publisher New York, NY: Association for Computing Machinery (ACM)
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract In this paper we explore the use of parsimonious language models for web retrieval. These models are smaller, and thus more efficient, than standard language models, which makes them well suited for large-scale web retrieval. We conducted experiments on four TREC topic sets and found that the parsimonious language model improves retrieval effectiveness over the standard language model for all data sets and measures. In all cases the improvement is significant, and more substantial than in earlier experiments on newspaper/newswire data.
Document type Conference contribution
Published at http://doi.acm.org/10.1145/1390334.1390491