But How Do We Store It? (Big) Data Architecture in the Social-Scientific Research Process

Open Access
Authors
Publication date 2018
Host editors
  • C.M. Stuetzer
  • M. Welker
  • M. Egger
Book title Computational Social Science in the Age of Big Data
Book subtitle Concepts, Methodologies, Tools, and Applications
ISBN
  • 9783869622675
ISBN (electronic)
  • 9783869622682
Series Neue Schriften Zur Online-Forschung
Pages (from-to) 161-187
Publisher Köln: Herbert von Halem Verlag
Organisations
  • Faculty of Science (FNWI) - Informatics Institute (IVI)
  • Faculty of Social and Behavioural Sciences (FMG) - Amsterdam School of Communication Research (ASCoR)
Abstract

The social-scientific research process is usually considered to consist of reviewing literature and theory, followed by the generation of research questions or hypotheses, the collection of data, their analysis, and writing up the findings. In this chapter, we argue that in the age of Big Data, social scientists increasingly have to consider a step that can be located between the collection and analysis of data: the storage of the data. Based on the notion of data architecture, we discuss how the choices made at this stage impact the ways the data can be used and the research questions that can be answered. In particular, we compare file dumps, relational databases, document stores, and graph databases. We develop a scheme for choosing among these approaches based on the following criteria: the need for preprocessing, the properties of the data, the research design, the available infrastructure, and the available expertise. We conclude by summarizing the strengths and weaknesses of the four approaches along two dimensions: ease-of-storage versus reliability-of-retrieval and ease-of-use versus power-to-explore.
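
To make the four storage approaches named in the abstract more concrete, the following is a minimal sketch and is not taken from the chapter. The example records, field names, and variable names are all hypothetical, and the document store and graph database are only mimicked with plain Python structures rather than real database systems.

```python
import json
import sqlite3

# Hypothetical example data: two short posts, the second replying to the first.
posts = [
    {"id": 1, "author": "alice", "text": "Hello", "reply_to": None},
    {"id": 2, "author": "bob", "text": "Hi Alice", "reply_to": 1},
]

# 1. File dump: one JSON object per line. Easy to write,
#    but every later question requires scanning the whole file.
dump = "\n".join(json.dumps(p) for p in posts)

# 2. Relational database: a fixed schema, queried with SQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, author TEXT, text TEXT, reply_to INTEGER)"
)
conn.executemany("INSERT INTO posts VALUES (:id, :author, :text, :reply_to)", posts)
bob_posts = conn.execute(
    "SELECT text FROM posts WHERE author = ?", ("bob",)
).fetchall()

# 3. Document store (mimicked by a dict keyed by id): records need not
#    share a schema, so one record can carry an extra field.
documents = {p["id"]: dict(p) for p in posts}
documents[2]["hashtags"] = ["greeting"]

# 4. Graph (mimicked by an edge list): the relations between authors,
#    here "who replied to whom", are the primary object of storage.
reply_edges = [
    (p["author"], documents[p["reply_to"]]["author"])
    for p in posts
    if p["reply_to"] is not None
]
```

In this sketch the file dump is the cheapest to produce, while the relational table and the edge list each make one kind of question (filtering by attribute, following relations) easier to answer.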

Document type Chapter
Language English
Other links https://www.halem-verlag.de/computational-social-science-in-the-age-of-big-data/