Statistical machine translation with local language models
| Authors | |
|---|---|
| Publication date | 2011 |
| Book title | EMNLP '11 Proceedings of the Conference on Empirical Methods in Natural Language Processing |
| Event | EMNLP '11, Conference on Empirical Methods in Natural Language Processing |
| Pages (from-to) | 869-879 |
| Publisher | Association for Computational Linguistics |
| Organisations | |
| Abstract | Part-of-speech language modeling is commonly used as a component in statistical machine translation systems, but there is mixed evidence that its usage leads to significant improvements. We argue that its limited effectiveness is due to the lack of lexicalization. We introduce a new approach that builds a separate local language model for each word and part-of-speech pair. The resulting models lead to more context-sensitive probability distributions and we also exploit the fact that different local models are used to estimate the language model probability of each word during decoding. Our approach is evaluated for Arabic- and Chinese-to-English translation. We show that it leads to statistically significant improvements for multiple test sets and also across different genres, when compared against a competitive baseline and a system using a part-of-speech model. |
| Document type | Conference contribution |
| Language | English |
| Published at | http://delivery.acm.org/10.1145/2150000/2145528/p869-monz.pdf?ip=145.18.109.227&acc=OPEN&CFID=196570483&CFTOKEN=48575281&__acm__=1352451699_dacf34ffc5e528639882c3c57cd14e61 |