AnthroSet: a Challenge Dataset for Anthropomorphic Language Detection

Open Access
Authors
  • Dorielle Lonke
  • J. Bloem
  • Pia Sommerauer
Publication date 2025
Host editors
  • P. Przybyła
  • M. Shardlow
  • C. Colombatto
  • N. Inie
Book title Proceedings of the First Interdisciplinary Workshop on Observations of Misunderstood, Misguided and Malicious Use of Language Models
Book subtitle associated with The 15th International Conference on Recent Advances in Natural Language Processing 2025 : OMMM 2025 : September 11th, 2025, Varna, Bulgaria
ISBN (electronic)
  • 9789544521011
Event Interdisciplinary Workshop on Observations of Misunderstood, Misguided and Malicious Use of Language Models
Pages (from-to) 27-39
Publisher Shoumen: INCOMA Ltd.
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
  • Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR)
Abstract This paper addresses the challenge of detecting anthropomorphic language in AI research. We introduce AnthroSet, a novel dataset of 600 manually annotated utterances covering various linguistic structures. Through the evaluation of two current approaches for anthropomorphism and atypical animacy detection, we highlight the limitations of a masked language model approach, arising from masking constraints as well as increasingly anthropomorphizing AI-related terminology. Our findings underscore the need for more targeted methods and a robust definition of anthropomorphism.
Document type Conference contribution
Language English
Published at https://doi.org/10.26615/978-954-452-101-1-003
Published at https://acl-bg.org/proceedings/2025/OMMM%202025/pdf/2025.ranlpommm-1.3.pdf
Other links https://acl-bg.org/proceedings/2025/OMMM%202025/index.html
Downloads
2025.ranlpommm-1.3 (Final published version)