Unsupervised Translation Sense Clustering
Title | Unsupervised Translation Sense Clustering |
Publication Type | Conference Paper |
Year of Publication | 2012 |
Authors | Bansal, M., DeNero, J., & Lin, D. |
Page(s) | 773-782 |
Other Numbers | 3399 |
Abstract | We propose an unsupervised method for clustering the translations of a word, such that the translations in each cluster share a common semantic sense. Words are assigned to clusters based on their usage distribution in large monolingual and parallel corpora using the soft K-Means algorithm. In addition to describing our approach, we formalize the task of translation sense clustering and describe a procedure that leverages WordNet for evaluation. By comparing our induced clusters to reference clusters generated from WordNet, we demonstrate that our method effectively identifies sense-based translation clusters and benefits from both monolingual and parallel corpora. Finally, we describe a method for annotating clusters with usage examples. |
URL | https://www.icsi.berkeley.edu/pubs/ai/unsupervisedtranslation12.pdf |
Bibliographic Notes | Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL HLT 2012), Montreal, Canada, pp. 773-782 |
Abbreviated Authors | M. Bansal, J. DeNero, and D. Lin |
ICSI Research Group | AI |
ICSI Publication Type | Article in conference proceedings |
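
The abstract names soft K-Means as the clustering algorithm. As a rough illustration of that general technique only, the sketch below implements a generic textbook soft K-Means in plain Python. It is not the authors' implementation: the function name `soft_kmeans`, the stiffness parameter `beta`, the fixed initial centroids, and the two-dimensional toy vectors standing in for usage distributions are all assumptions made for this example.

```python
import math


def soft_kmeans(points, centroids, beta=1.0, iterations=20):
    """Generic soft K-Means: every point gets a weight for every cluster.

    points and centroids are sequences of equal-length numeric vectors.
    beta (an assumed stiffness parameter) controls how sharply the weights
    concentrate on the nearest centroid.
    """
    centroids = [list(c) for c in centroids]
    resp = []
    for _ in range(iterations):
        # Assignment step: the responsibility of cluster k for point x is
        # proportional to exp(-beta * squared_distance(x, centroid_k)).
        resp = []
        for x in points:
            d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centroids]
            nearest = min(d2)  # subtracting the minimum avoids underflow
            scores = [math.exp(-beta * (d - nearest)) for d in d2]
            total = sum(scores)
            resp.append([s / total for s in scores])
        # Update step: each centroid moves to the responsibility-weighted
        # mean of all points.
        for k in range(len(centroids)):
            weight = sum(r[k] for r in resp)
            if weight > 0:
                centroids[k] = [
                    sum(r[k] * x[j] for r, x in zip(resp, points)) / weight
                    for j in range(len(points[0]))
                ]
    return resp, centroids


# Hypothetical toy data: four "translations" represented by made-up
# 2-dimensional feature vectors, forming two well-separated groups.
toy_points = [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1), (5.1, 5.0)]
toy_init = [(0.0, 0.0), (5.0, 5.0)]
responsibilities, final_centroids = soft_kmeans(toy_points, toy_init)
```

Each row of `responsibilities` sums to 1 and gives the soft membership of one point in each cluster. In a translation sense clustering setting, a translation whose weight is split across clusters would be one that plausibly belongs to more than one sense.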