Data for The Generally Curious: Thematically Distinct Datasets of 4chan's /pol/ Discussion Forum's 'General Threads'

Creators
Publication date 14-01-2020
Description
Over the second half of the 2010s, the /pol/ (‘politically incorrect’) forum on the 4chan image board has emerged as a space in which various extreme political ideologies are discussed and cultivated, occasionally informing off-site acts of political extremism. While previous research has often studied this space as a unified whole, it is also relevant to demarcate the different publics within 4chan’s /pol/ board rather than treating it as an ‘amorphous blob’. This paper focuses on ‘generals’: recurring threads with a specific thematic focus, identified by a particular vernacular phrase or tag. Identifying them makes it possible to subset the board’s archive into multiple distinct datasets, each comprising discussions about a particular topic, such as Donald Trump, the Syria war, or British politics. We provide a dataset containing 58,841 opening posts and 13,697,738 replies to them, divided over 329 thematically distinct ‘general thread’ collections. In this paper we outline our data collection and query protocol, the structure of the data and its rationale, and a number of suggested research uses for this new data.
Publisher Zenodo
Organisations
  • Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR) - Amsterdam School for Cultural Analysis (ASCA)
  • Faculty of Social and Behavioural Sciences (FMG) - Amsterdam Institute for Social Science Research (AISSR)
Document type Dataset
Related dataset Data for The Generally Curious: Thematically Distinct Datasets of 4chan's /pol/ Discussion Forum's 'General Threads'
Related publication Generally Curious: Thematically Distinct Datasets of General Threads on 4chan/pol/
DOI https://doi.org/10.5281/zenodo.3603291
Other links https://zenodo.org/record/3603292