Statistical machine translation with local language models

Authors	C. Monz
Publication date	2011
Book title	EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing
Event	EMNLP '11, Conference on Empirical Methods in Natural Language Processing
Pages (from-to)	869-879
Publisher	Association for Computational Linguistics
Organisations	Faculty of Science (FNWI) - Informatics Institute (IVI)
Abstract	Part-of-speech language modeling is commonly used as a component in statistical machine translation systems, but there is mixed evidence that its usage leads to significant improvements. We argue that its limited effectiveness is due to the lack of lexicalization. We introduce a new approach that builds a separate local language model for each word and part-of-speech pair. The resulting models lead to more context-sensitive probability distributions and we also exploit the fact that different local models are used to estimate the language model probability of each word during decoding. Our approach is evaluated for Arabic- and Chinese-to-English translation. We show that it leads to statistically significant improvements for multiple test sets and also across different genres, when compared against a competitive baseline and a system using a part-of-speech model.
Document type	Conference contribution
Language	English
Published at	http://delivery.acm.org/10.1145/2150000/2145528/p869-monz.pdf?ip=145.18.109.227&acc=OPEN&CFID=196570483&CFTOKEN=48575281&__acm__=1352451699_dacf34ffc5e528639882c3c57cd14e61