Talks at the International Computer Science Institute

"Lexical Ambiguity and Information Retrieval"

Hinrich Schuetze
Xerox Palo Alto Research Center
schuetze@parc.xerox.com

Friday, November 6, 1998
ICSI, Rm 5A
3 - 4:30 pm

Abstract:

The problem of lexical ambiguity in information retrieval offers an opportunity for fruitful collaboration between linguistics, mathematical modeling and information retrieval research. In this talk, I will first take a fresh look at some linguistic questions about lexical ambiguity that are relevant in the context of information retrieval. How do we determine how many senses an ambiguous word has? Why is there so much inter-judge disagreement in disambiguation by humans? Is "co-activation", the simultaneous invocation of several senses, possible or even common? I will then sketch a probabilistic model of lexical ambiguity that addresses some of these questions. The parameters of the model are estimated using the EM algorithm. Finally, I will show how this model can be usefully employed in information retrieval with a significant improvement in performance and accuracy.

This talk will be held in Room 5A at ICSI.
1947 Center Street, Sixth Floor, Berkeley, CA 94704-1198
(on Center between Milvia and Martin Luther King Jr. Way)