From documents to data: linked data at the Dutch Parliament

Open Access
Authors
Publication date 2010
Book title Online Information 2010: Proceedings. Discover new ways of working in the linked and social web
ISBN
  • 1900239914
  • 9781900239912
Event Online Information 2010, London, UK
Pages (from-to) 17-22
Publisher Incisive Media
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract
Parliamentary debates are important for the general public and for scientific research in numerous fields, such as political science, historical science, linguistics and communication; they are also an interesting domain in which to apply state-of-the-art information retrieval technology.
Parliamentary debates are highly structured transcripts of meetings of politicians in parliament. These debates are an important part of the cultural heritage of countries; they are often free of copyright; citizens often have a legal right to inspect them; and several countries make great efforts to digitise their entire historical collection and open it up to the general public.
In this paper, we analyse the structure of the parliamentary proceedings, show how proceedings in PDF format can be transformed into XML and describe the use of permanent identifiers for entities in parliamentary texts. Having the proceedings in XML makes a wide range of applications possible. We elaborate on four of these: entry point retrieval; advanced content and structure search; automatic creation of tables of contents and hyperlinked navigation menus; and large savings on storage space and bandwidth for scanned documents. We also describe the benefits of this approach for the so-called transparency of the parliamentary process for citizens and stakeholders.
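As a rough illustration of two of the applications named in the abstract (automatic tables of contents and entry point retrieval), the following minimal Python sketch works on a small, made-up XML fragment. The element names (`proceedings`, `topic`, `speech`), the attribute names, and the identifier scheme are assumptions chosen for this example; they are not the schema or the identifiers used in the paper.

```python
import xml.etree.ElementTree as ET

# Hypothetical proceedings fragment: element names, attributes and
# identifiers are invented for illustration, not taken from the paper.
SAMPLE = """\
<proceedings id="nl.proc.example.1">
  <topic id="nl.proc.example.1.1" title="Opening">
    <speech id="nl.proc.example.1.1.1" speaker="Speaker A">I open the meeting.</speech>
  </topic>
  <topic id="nl.proc.example.1.2" title="Budget debate">
    <speech id="nl.proc.example.1.2.1" speaker="Speaker B">The budget needs revision.</speech>
    <speech id="nl.proc.example.1.2.2" speaker="Speaker A">I disagree with that view.</speech>
  </topic>
</proceedings>
"""


def table_of_contents(root):
    """Return (identifier, title) pairs for every topic, in document order."""
    return [(t.get("id"), t.get("title")) for t in root.iter("topic")]


def entry_points(root, term):
    """Return identifiers of the speeches whose text contains the term.

    Returning the speech identifier rather than the whole document lets a
    reader jump straight to the relevant point in the debate.
    """
    term = term.lower()
    return [
        s.get("id")
        for s in root.iter("speech")
        if term in (s.text or "").lower()
    ]


root = ET.fromstring(SAMPLE)
toc = table_of_contents(root)
hits = entry_points(root, "budget")

for ident, title in toc:
    print(ident, title)
print(hits)
```

Because every topic and speech carries its own identifier, both the table of contents and the search results can link directly to a position inside the proceedings instead of to the document as a whole.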
Document type Conference contribution
Language English
Published at http://www.online-information.co.uk/online2010/conference/conference_presentation_2010.html?presentation_id=1181
Downloads
332672.pdf (Final published version)