Are Topically Diverse Documents Also Interesting?

Open Access
Authors
Publication date 2015
Host editors
  • J. Mothe
  • J. Savoy
  • J. Kamps
  • K. Pinel-Sauvagnat
  • G.J.F. Jones
  • E. SanJuan
  • L. Cappellato
  • N. Ferro
Book title Experimental IR Meets Multilinguality, Multimodality, and Interaction
Book subtitle 6th International Conference of the CLEF Association, CLEF'15, Toulouse, France, September 8–11, 2015 : proceedings
ISBN
  • 9783319240268
ISBN (electronic)
  • 9783319240275
Series Lecture Notes in Computer Science
Pages (from-to) 215-221
Publisher Cham: Springer
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Text interestingness is a measure of document quality from the user's perspective: it reflects how willing users are to read a document. Several approaches have been proposed for measuring the interestingness of texts. Most of them assume that interesting texts are also topically diverse and estimate interestingness from topical diversity. In this paper, we investigate the relation between interestingness and topical diversity, using the Dutch and Canadian parliamentary proceedings. We apply an existing measure of interestingness that is based on structural properties of the proceedings (e.g., how much interaction there is between speakers in a debate). We then compute the correlation between this measure of interestingness and topical diversity.

Our main findings are that, in general, there is a relatively low correlation between interestingness and topical diversity, and that there are two extreme categories of documents: highly interesting but hardly diverse (focused interesting documents), and highly diverse but not interesting. When we remove these two extreme types of documents, there is a positive correlation between interestingness and diversity.
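The analysis described in the abstract can be illustrated with a minimal sketch. It makes several assumptions that do not come from the paper: topical diversity is approximated by the Shannon entropy of a document's topic distribution, the correlation is Pearson's r, and the thresholds that define the two extreme categories are arbitrary. The toy documents are invented purely to show the mechanics and do not reproduce the paper's data or results.

```python
import math

def topical_diversity(topic_probs):
    """Shannon entropy (in nats) of a topic distribution.
    Assumption: entropy stands in for the paper's diversity measure."""
    return -sum(p * math.log(p) for p in topic_probs if p > 0)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def is_extreme(interest, diversity):
    """Hypothetical thresholds for the two extreme categories."""
    focused_interesting = interest >= 0.8 and diversity <= 0.3
    diverse_uninteresting = interest <= 0.2 and diversity >= 1.3
    return focused_interesting or diverse_uninteresting

# Invented toy documents: (interestingness score, topic distribution).
docs = [
    (0.9, [0.97, 0.01, 0.01, 0.01]),  # highly interesting, hardly diverse
    (0.1, [0.25, 0.25, 0.25, 0.25]),  # highly diverse, not interesting
    (0.3, [0.85, 0.05, 0.05, 0.05]),
    (0.5, [0.70, 0.10, 0.10, 0.10]),
    (0.7, [0.50, 0.20, 0.20, 0.10]),
]

scored = [(interest, topical_diversity(topics)) for interest, topics in docs]
core = [(i, d) for i, d in scored if not is_extreme(i, d)]

# Correlation over all documents, then with the extremes removed.
r_all = pearson([i for i, _ in scored], [d for _, d in scored])
r_core = pearson([i for i, _ in core], [d for _, d in core])

print(f"all documents:       r = {r_all:.2f}")
print(f"extremes removed:    r = {r_core:.2f}")
```

On this invented data the two extreme documents pull the overall correlation down, and the correlation over the remaining documents is positive, which mirrors the qualitative pattern the abstract reports.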
Document type Conference contribution
Language English
Published at https://doi.org/10.1007/978-3-319-24027-5_19