Applying automatically parsed corpora to the study of language variation

J. Bloem; A. Versloot; F. Weerman

Authors	J. Bloem; A. Versloot; F. Weerman
Publication date	2014
Host editors	J. Tsujii; J. Hajic
Book title	COLING 2014: the 25th International Conference on Computational Linguistics
Book subtitle	proceedings of COLING 2014: technical papers: August 23-29, 2014, Dublin, Ireland
ISBN	9781941643266
Event	COLING 2014
Pages (from-to)	1974-1984
Publisher	Stroudsburg, PA: Association for Computational Linguistics
Organisations	Faculty of Humanities (FGw) - Amsterdam Institute for Humanities Research (AIHR) - Amsterdam Center for Language and Communication (ACLC)
Abstract	In this work, we discuss the benefits of using automatically parsed corpora to study language variation. The study of language variation is an area of linguistics in which quantitative methods have been particularly successful. We argue that the large datasets that can be obtained using automatic annotation can help drive further research in this direction, providing sufficient data for the increasingly complex models used to describe variation. We demonstrate this by replicating and extending a previous quantitative variation study that used manually and semi-automatically annotated data. We show that while the study cannot be replicated completely due to limitations of the existing automatic annotation, we can draw at least the same conclusions as the original study. In addition, we demonstrate the flexibility of this method by extending the findings to related linguistic constructions and to another domain of text, using additional data.
Document type	Conference contribution
Language	English
Published at	http://www.aclweb.org/anthology/C14-1186 (Final published version)
Downloads	C14-1186 (Final published version)