Learning Structural Dependencies of Words in the Zipfian Tail
| Authors | |
|---|---|
| Publication date | 2011 |
| Host editors | |
| Book title | Proceedings of the 12th International Conference on Parsing Technologies |
| Book subtitle | IWPT 2011 : October 5-7, 2011, Dublin City University |
| ISBN | |
| Event | IWPT 2011 |
| Pages (from-to) | 80-91 |
| Publisher | New Brunswick, NJ: Association for Computational Linguistics |
| Organisations | |
| Abstract | Using semi-supervised EM, we learn fine-grained but sparse lexical parameters of a generative parsing model (a PCFG) initially estimated over the Penn Treebank. Our lexical parameters employ supertags, which encode complex structural information at the pre-terminal level and are particularly sparse in labeled data; our goal is to learn these for words that are unseen or rare in the labeled data. To guide estimation from unlabeled data, we incorporate both structural and lexical priors from the labeled data. We get a large error reduction in parsing ambiguous structures associated with unseen verbs, the most important case of learning lexico-structural dependencies. We also obtain a statistically significant improvement in the labeled bracketing score of the treebank PCFG, the first successful improvement via semi-supervised EM of a generative structured model already trained over large labeled data. |
| Document type | Conference contribution |
| Language | English |
| Published at | http://www.aclweb.org/anthology/W/W11/W11-2911.pdf |
| Other links | http://www.aclweb.org/anthology/sigparse.html#2011_1 |
| Permalink to this page | |