"Lexical Ambiguity and Information Retrieval"
Hinrich Schuetze
Xerox Palo Alto Research Center
schuetze@parc.xerox.com
Friday, November 6, 1998
ICSI, Rm 5A
3 - 4:30 pm
The problem of lexical ambiguity in information retrieval offers an opportunity for fruitful collaboration between linguistics, mathematical modeling, and information retrieval research. In this talk, I will first take a fresh look at some linguistic questions about lexical ambiguity that are relevant in the context of information retrieval. How do we determine how many senses an ambiguous word has? Why is there so much inter-judge disagreement in disambiguation by humans? Is "co-activation", the simultaneous invocation of several senses, possible or even common? I will then sketch a probabilistic model of lexical ambiguity that addresses some of these questions. The parameters of the model are estimated using the EM algorithm. Finally, I will show how this model can be employed in information retrieval, yielding a significant improvement in performance and accuracy.