Unsupervised Translation Sense Clustering

Title: Unsupervised Translation Sense Clustering
Publication Type: Conference Paper
Year of Publication: 2012
Authors: Bansal, M., DeNero, J., & Lin, D.
Page(s): 773-782
Other Numbers: 3399
Abstract

We propose an unsupervised method for clustering the translations of a word, such that the translations in each cluster share a common semantic sense. Words are assigned to clusters based on their usage distribution in large monolingual and parallel corpora using the soft K-Means algorithm. In addition to describing our approach, we formalize the task of translation sense clustering and describe a procedure that leverages WordNet for evaluation. By comparing our induced clusters to reference clusters generated from WordNet, we demonstrate that our method effectively identifies sense-based translation clusters and benefits from both monolingual and parallel corpora. Finally, we describe a method for annotating clusters with usage examples.

URL: https://www.icsi.berkeley.edu/pubs/ai/unsupervisedtranslation12.pdf
Bibliographic Notes

Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL HLT 2012), Montreal, Canada, pp. 773-782

Abbreviated Authors

M. Bansal, J. DeNero, and D. Lin

ICSI Research Group

AI

ICSI Publication Type

Article in conference proceedings