Discontinuous Parsing with an Efficient and Accurate DOP Model

Open Access
Authors
Publication date 2013
Host editors
  • H. Bunt
  • K. Sima'an
  • L. Huang
Book title Proceedings of the 13th International Conference on Parsing Technologies: IWPT-2013
Book subtitle November 27-29, 2013, Nara, Japan
Event 13th International Conference on Parsing Technologies
Pages (from-to) 7-16
Publisher Association for Computational Linguistics
Organisations
  • Faculty of Science (FNWI)
  • Interfacultary Research - Institute for Logic, Language and Computation (ILLC)
Abstract
We present a discontinuous variant of tree-substitution grammar (TSG) based on Linear Context-Free Rewriting Systems. We use this formalism to instantiate a Data-Oriented Parsing model applied to discontinuous treebank parsing, and obtain a significant improvement over earlier results for this task. The model induces a TSG from the treebank by extracting fragments that occur at least twice. We give a direct comparison of a tree-substitution grammar implementation that implicitly represents all fragments from the treebank, versus one that explicitly operates with a significant subset. On the task of discontinuous parsing of German, the latter approach yields a 16% relative error reduction, requiring only a third of the parsing time and grammar size. Finally, we evaluate the model on several treebanks across three Germanic languages.
Document type Conference contribution
Language English
Published at https://pdfs.semanticscholar.org/c6be/75473c100967fd2291a3a08f80a16b5b7994.pdf https://www.aclweb.org/anthology/W13-5701/
Other links http://www.cs.cmu.edu/~sigparse/meetings.html
Downloads
9_pdfsam_IWPTproceedings (Final published version)