Advanced information access to parliamentary debates

Open Access
Authors
Publication date 2009
Journal JoDI, Journal of Digital Information
Volume | Issue number 10 | 6
Number of pages 11
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Parliamentary debates are highly structured transcripts of meetings of politicians in parliament. These debates are an important part of the cultural heritage of many countries: they are often free of copyright, citizens often have a legal right to inspect them, and several countries are making great efforts to digitize their entire historical collections and make them available to the general public. This provides many opportunities for the Information Retrieval community.
In this paper, we analyze the structure of parliamentary proceedings and sketch a widely applicable DTD. We show how proceedings in PDF format can be transformed into deeply nested XML.
Having the proceedings in XML makes a wide range of applications possible. We elaborate on five applications: entry point retrieval; advanced content and structure search; automatic creation of tables of contents and hyperlinked navigation menus; graphical result aggregation; and large savings on storage space and bandwidth for scanned documents.
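As a rough illustration of why deeply nested XML helps, the sketch below builds a toy debate and derives a table of contents and a combined content and structure query from it. The element and attribute names (meeting, topic, speech, speaker, party, p) and the sample text are assumptions made for this example only; they are not the DTD or the data from the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these element names are illustrative assumptions,
# not the DTD described in the article.
SAMPLE = """
<meeting date="2009-01-15">
  <topic title="Budget">
    <speech speaker="Smith" party="A"><p>We must reduce the deficit.</p></speech>
    <speech speaker="Jones" party="B"><p>Education needs more funding.</p></speech>
  </topic>
  <topic title="Health care">
    <speech speaker="Smith" party="A"><p>Hospitals need funding too.</p></speech>
  </topic>
</meeting>
"""


def table_of_contents(root):
    """List the topic titles in document order."""
    return [topic.get("title") for topic in root.findall("topic")]


def search(root, speaker, word):
    """Return (topic title, speech text) pairs for speeches by `speaker`
    whose text contains `word` (case-insensitive)."""
    hits = []
    for topic in root.findall("topic"):
        # Structural constraint: only speeches by the given speaker.
        for speech in topic.findall(f"speech[@speaker='{speaker}']"):
            text = "".join(speech.itertext()).strip()
            # Content constraint: the speech must mention the word.
            if word.lower() in text.lower():
                hits.append((topic.get("title"), text))
    return hits


root = ET.fromstring(SAMPLE.strip())
toc = table_of_contents(root)
results = search(root, "Smith", "funding")
```

Because each speech is nested inside its topic, a hit can be returned at the level of the individual speech rather than the whole meeting, which is the idea behind entry point retrieval as described above.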
Document type Article
Note http://journals.tdl.org/jodi/article/view/668
Language English
Published at http://ilps.science.uva.nl/sites/default/files/marx-adva09.pdf
Downloads
320599.pdf (Final published version)