a fully automated query run (as with the first ad hoc submission), and a top 20 feedback run (as with the second ad hoc submission). For each topic, the best method was estimated from retrievals on a separate test corpus (not the corpus used for submissions). The winning method's list of 1000 retrievals was then selected as the retrievals for this topic.

3.3.4 Routing Run #2: Mix and Match

Our second routing run mixed retrievals from the same four sources, using a 'quality estimate' consisting of 11-point recall/precision scores determined from a run on a separate test corpus. Each document in each of the four approaches was given points proportional to the run's quality estimate and inversely proportional to its position number on an output list. Documents appearing on more than one list received points for each appearance. Mix and Match worked better than the previous Best Candidate approach. (A rough sketch of both combination schemes appears at the end of this section.)

4 Comments

We are generally pleased with the results from our one-year-old system. (For final figures, see the appendix of these proceedings.)

In examining the data, one interesting aspect is that MatchPlus does better when measured by 11-point averages than by the number of relevant documents retrieved. This means that a comparatively higher percentage of documents in early retrievals were judged relevant.⁴

We are now running initial experiments with word sense disambiguation using context vectors and clustering. It will be interesting to see whether word sense disambiguation can further improve retrieval performance.
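As a rough illustration of the two routing combination schemes described above (this is our sketch, not the MatchPlus code), the following Python assumes each source run is a ranked list of document IDs for a single topic and that each run's quality estimate is its 11-point score from the held-out test corpus. The function names, the example runs, and the quality numbers are illustrative choices of our own.

    # Sketch of the two routing combination strategies, under the assumptions above.
    def best_candidate(runs, quality):
        """Routing Run #1 (Best Candidate): pick the single run with the
        highest estimated quality and use its retrievals unchanged."""
        best = max(runs, key=lambda name: quality[name])
        return runs[best]

    def mix_and_match(runs, quality, depth=1000):
        """Routing Run #2 (Mix and Match): give each document points
        proportional to the run's quality estimate and inversely
        proportional to its rank, summing points over every list on
        which the document appears; return the top `depth` documents."""
        points = {}
        for name, ranked_docs in runs.items():
            for rank, doc in enumerate(ranked_docs, start=1):
                points[doc] = points.get(doc, 0.0) + quality[name] / rank
        merged = sorted(points, key=points.get, reverse=True)
        return merged[:depth]

    # Hypothetical example with two of the four sources (made-up data).
    runs = {
        "feedback_top20": ["d3", "d1", "d7", "d2"],
        "auto_query":     ["d1", "d9", "d3", "d4"],
    }
    quality = {"feedback_top20": 0.31, "auto_query": 0.24}
    print(best_candidate(runs, quality))
    print(mix_and_match(runs, quality))

In this toy example, d1 and d3 appear on both lists and end up at the top of the merged ranking, which is exactly the behaviour the point-summing scheme rewards.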
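The observation in Section 4 about 11-point averages versus total relevant documents retrieved turns on how that metric is computed. Below is a minimal sketch of the standard 11-point interpolated average precision for a single topic (assuming `relevant` is the non-empty set of judged-relevant document IDs); it is not taken from any MatchPlus or TREC evaluation code. Because every point is an interpolated precision value, a run that ranks its relevant documents early can score well on this measure even if it retrieves fewer relevant documents in total.

    def eleven_point_average(ranked_docs, relevant):
        """11-point interpolated average precision for one topic: the mean of
        interpolated precision at recall 0.0, 0.1, ..., 1.0, where interpolated
        precision at recall r is the maximum precision observed at any recall
        level >= r. Assumes `relevant` contains at least one document ID."""
        recalls, precisions = [], []
        hits = 0
        # Precision/recall only change in a way that matters at ranks where a
        # relevant document is retrieved, so record the curve at those points.
        for i, doc in enumerate(ranked_docs, start=1):
            if doc in relevant:
                hits += 1
                recalls.append(hits / len(relevant))
                precisions.append(hits / i)
        points = []
        for r in [i / 10 for i in range(11)]:
            attainable = [p for rec, p in zip(recalls, precisions) if rec >= r]
            points.append(max(attainable) if attainable else 0.0)
        return sum(points) / 11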
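For the word-sense experiments, the text says only that they use context vectors and clustering, so the following is a guess at the simplest form such an experiment could take: cluster the context vectors of individual occurrences of an ambiguous word, so that each cluster can stand in for one sense. K-means is our own choice of clustering algorithm, and the occurrence vectors are assumed to be computed elsewhere (for example, from the context vectors of surrounding words).

    import numpy as np
    from sklearn.cluster import KMeans

    def cluster_senses(occurrence_vectors, n_senses=2):
        """Group occurrence-level context vectors into presumed word senses.

        occurrence_vectors: array of shape (n_occurrences, n_dims), one
        context vector per occurrence of the ambiguous word.
        Returns a cluster label per occurrence and the cluster centroids,
        which could then serve as sense-specific context vectors."""
        km = KMeans(n_clusters=n_senses, n_init=10, random_state=0)
        labels = km.fit_predict(np.asarray(occurrence_vectors))
        return labels, km.cluster_centers_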
⁴ This also raises the question of whether comparatively fewer total documents were scored by the readers, perhaps due to the MatchPlus system's different approach. Computing the median number of documents scored per topic would resolve this question, as well as providing an indication of "novelty" for TREC participants.