Gappy Phrasal Alignment by Agreement

Publication Type: Conference Paper
Year of Publication: 2011
Authors: Bansal, M., Quirk, C., & Moore, R. C.
Other Numbers: 3269

We propose a principled and efficient phrase-to-phrase alignment model, useful in machine translation as well as other related natural language processing problems. In a hidden semi-Markov model, word-to-phrase and phrase-to-word translations are modeled directly by the system. Agreement between two directional models encourages the selection of parsimonious phrasal alignments, avoiding the overfitting commonly encountered in unsupervised training with multi-word units. Expanding the state space to include “gappy phrases” (such as French ne pas) makes the alignment space more symmetric; thus, it allows agreement between discontinuous alignments. The resulting system shows substantial improvements in both alignment quality and translation quality over word-based Hidden Markov Models, while maintaining asymptotically equivalent runtime.


This work was partially supported by funding provided to ICSI by a gift from Microsoft Research.

Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics (ACL HLT 2011), pp. 1308-1317, Portland, Oregon

Article in conference proceedings