Evaluating the Consistency of Word Embeddings from Small Data

Open Access
Authors
Publication date 2019
Host editors
  • G. Angelova
  • R. Mitkov
  • I. Nikolova
  • I. Temnikova
Book title International Conference Recent Advances in Natural Language Processing : RANLP 2019
Book subtitle Natural Language Processing in a Deep Learning World : Proceedings : Varna, Bulgaria, 2-4 September, 2019
ISBN
  • 9789544520557
ISBN (electronic)
  • 9789544520564
Event Recent Advances in Natural Language Processing (RANLP) 2019
Pages (from-to) 132-141
Publisher Shoumen: INCOMA Ltd.
Organisations
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
In this work, we address the evaluation of distributional semantic models trained on smaller, domain-specific texts, in this case philosophical text. Specifically, we inspect the behaviour of models that use a pre-trained background space during learning. We propose a measure of consistency that can be used as an evaluation metric when no in-domain gold-standard data is available. This measure simply quantifies a model's ability to learn similar embeddings from different parts of some homogeneous data. We show that, despite being a simple evaluation, consistency actually depends on various combinations of factors, including the nature of the data itself, the model used to train the semantic space, and the frequency of the learnt terms, both in the background space and in the in-domain data of interest.
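The consistency measure described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's exact implementation: it assumes embeddings learned from two halves of the same corpus live in a shared (pre-trained background) space, so per-word cosine similarity is directly comparable, and averages that similarity over the shared vocabulary.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def consistency(emb_a, emb_b):
    """Mean cosine similarity between embeddings of the shared vocabulary
    learned from two parts of the same homogeneous data (illustrative sketch)."""
    shared = sorted(set(emb_a) & set(emb_b))
    if not shared:
        return 0.0
    return sum(cosine(emb_a[w], emb_b[w]) for w in shared) / len(shared)

# Toy example with hypothetical 3-dimensional vectors for two corpus halves:
emb_a = {"being": np.array([1.0, 0.0, 0.0]), "essence": np.array([0.0, 1.0, 0.0])}
emb_b = {"being": np.array([1.0, 0.1, 0.0]), "essence": np.array([0.0, 1.0, 0.2])}
score = consistency(emb_a, emb_b)  # close to 1.0 when the halves agree
```

A nearest-neighbour overlap between the two spaces would be an equally plausible instantiation; the averaged cosine is used here only because it keeps the sketch self-contained.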
Document type Conference contribution
Language English
DOI https://doi.org/10.26615/978-954-452-056-4_016
ACL Anthology https://aclanthology.org/R19-1016/