Contextualized word embeddings expose ethnic biases in news
| Authors | |
|---|---|
| Publication date | 2024 |
| Book title | WEBSCI '24 : Reflecting on the Web, AI, and Society |
| Book subtitle | Proceedings of the 16th ACM Web Science Conference 2024 : May 21-24, 2024 : University of Stuttgart, Germany |
| ISBN (electronic) | |
| Event | 16th ACM Web Science Conference 2024 |
| Pages (from-to) | 290-295 |
| Publisher | New York, New York: Association for Computing Machinery |
| Organisations | |
| Abstract | The web is a major source for news and information. Yet, news can perpetuate and amplify biases and stereotypes. Prior work has shown that training static word embeddings can expose such biases. In this short paper, we apply both a conventional Word2Vec approach as well as a more modern BERT-based approach to a large corpus of Dutch news. We demonstrate that both methods expose ethnic biases in the news corpus. We also show that the biases in the news corpus are considerably stronger than the biases in the transformer model itself. |
| Document type | Conference contribution |
| Note | With supplemental material |
| Language | English |
| Published at | https://doi.org/10.1145/3614419.3643994 |
| Downloads | 3614419.3643994 (Final published version) |
| Supplementary materials | |
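The abstract describes exposing ethnic biases by measuring associations between word groups in an embedding space. A minimal sketch of one common way to quantify such associations (a WEAT-style effect size over cosine similarities); the word lists and random vectors below are hypothetical placeholders for illustration only, not the paper's Dutch corpus or its actual target/attribute sets:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    # Mean similarity of word w to attribute set A minus attribute set B.
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect(X, Y, A, B, emb):
    # WEAT-style effect size: difference of mean associations of the two
    # target sets, normalized by the pooled standard deviation.
    sx = [association(w, A, B, emb) for w in X]
    sy = [association(w, A, B, emb) for w in Y]
    return float((np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1))

# Toy 8-dimensional embeddings (hypothetical placeholders).
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=8) for w in
       ["groupA1", "groupA2", "groupB1", "groupB2",
        "pleasant1", "pleasant2", "unpleasant1", "unpleasant2"]}

effect = weat_effect(["groupA1", "groupA2"], ["groupB1", "groupB2"],
                     ["pleasant1", "pleasant2"],
                     ["unpleasant1", "unpleasant2"], emb)
print(round(effect, 3))
```

In practice the embeddings would come from a Word2Vec model trained on the news corpus (or from contextualized BERT representations averaged per word), and a larger effect size indicates a stronger differential association between the two target groups and the attributes.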