Contextualized word embeddings expose ethnic biases in news
| Authors | |
|---|---|
| Publication date | 2024 |
| Book title | WEBSCI '24 : Reflecting on the Web, AI, and Society |
| Book subtitle | Proceedings of the 16th ACM Web Science Conference 2024 : May 21-24, 2024 : University of Stuttgart, Germany |
| ISBN (electronic) | |
| Event | 16th ACM Web Science Conference 2024 |
| Pages (from-to) | 290-295 |
| Publisher | New York, New York: Association for Computing Machinery |
| Organisations | |
| Abstract | The web is a major source for news and information. Yet, news can perpetuate and amplify biases and stereotypes. Prior work has shown that training static word embeddings can expose such biases. In this short paper, we apply both a conventional Word2Vec approach as well as a more modern BERT-based approach to a large corpus of Dutch news. We demonstrate that both methods expose ethnic biases in the news corpus. We also show that the biases in the news corpus are considerably stronger than the biases in the transformer model itself. |
| Document type | Conference contribution |
| Note | With supplemental material |
| Language | English |
| Published at | https://doi.org/10.1145/3614419.3643994 |
| Downloads | 3614419.3643994 (Final published version) |
| Supplementary materials | |
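The abstract describes exposing ethnic biases by measuring associations between word groups in an embedding space. A minimal sketch of one common way to quantify such associations (a WEAT-style effect size over cosine similarities); the word lists and random vectors below are hypothetical placeholders for illustration only, not the paper's Dutch corpus or its actual target/attribute sets:

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B, emb):
    # Mean similarity of word w to attribute set A minus attribute set B.
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect(X, Y, A, B, emb):
    # WEAT-style effect size: difference of mean associations of the two
    # target sets, normalized by the pooled standard deviation.
    sx = [association(w, A, B, emb) for w in X]
    sy = [association(w, A, B, emb) for w in Y]
    return float((np.mean(sx) - np.mean(sy)) / np.std(sx + sy, ddof=1))

# Toy 8-dimensional embeddings (hypothetical placeholders).
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=8) for w in
       ["groupA1", "groupA2", "groupB1", "groupB2",
        "pleasant1", "pleasant2", "unpleasant1", "unpleasant2"]}

effect = weat_effect(["groupA1", "groupA2"], ["groupB1", "groupB2"],
                     ["pleasant1", "pleasant2"],
                     ["unpleasant1", "unpleasant2"], emb)
print(round(effect, 3))
```

In practice the embeddings would come from a Word2Vec model trained on the news corpus (or from contextualized BERT representations averaged per word), and a larger effect size indicates a stronger differential association between the two target groups and the attributes.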