NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Feedback and Mixing Experiments with MatchPlus
S. Gallant
W. Caid
J. Carleton
T. Gutschow
R. Hecht-Nielsen
K. Qing
D. Sudbeck
National Institute of Standards and Technology
D. K. Harman
a fully automated query run (as with the first ad hoc
submission), and a top 20 feedback run (as with the
second ad hoc submission). For each topic, the best
method was estimated from retrievals on a separate
test corpus (not the corpus used for submissions).
The winning method's list of 1000 retrievals was then
selected as the retrievals for this topic.
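As an illustration, this per-topic selection can be sketched as follows; the data layout (dictionaries keyed by topic and method) and the function name are assumptions for illustration, not details from the paper.

    # Sketch of per-topic "best candidate" selection: for each topic, keep the
    # ranked list of whichever method scored best on the separate test corpus.
    # (Data layout is hypothetical.)
    def best_candidate(test_scores, runs):
        # test_scores: {topic_id: {method: quality on the test corpus}}
        # runs:        {method: {topic_id: [doc_id, ...]}}  ranked, up to 1000 docs
        selected = {}
        for topic, scores in test_scores.items():
            winner = max(scores, key=scores.get)          # best-scoring method for this topic
            selected[topic] = runs[winner][topic][:1000]  # its full ranked list
        return selected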
3.3.4 Routing Run #2: Mix and Match
Our second routing run was to mix retrievals from the same four sources, using a "quality estimate" consisting of 11-point recall/precision scores determined from a run on a separate test corpus. Each document in each of the four approaches was given points proportional to the run's quality estimate and inversely proportional to its position number on the output list. Documents appearing on more than one list received points for each appearance.
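A minimal sketch of this point-assignment scheme for a single topic appears below; the exact proportionality constant and the data layout are assumptions for illustration rather than details from the paper.

    # Sketch of "Mix and Match" fusion for one topic: each document earns
    # quality / rank points per list it appears on, and points from multiple
    # lists are summed. (Function name and data layout are hypothetical.)
    def mix_and_match(ranked_lists, quality, top_k=1000):
        # ranked_lists: {method: [doc_id, ...]}  ranked output lists
        # quality:      {method: float}          11-point score on the test corpus
        points = {}
        for method, docs in ranked_lists.items():
            q = quality[method]
            for rank, doc in enumerate(docs, start=1):
                points[doc] = points.get(doc, 0.0) + q / rank
        return sorted(points, key=points.get, reverse=True)[:top_k]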
Mix and Match worked better than the previous
Best Candidate approach.
4 Comments
We are generally pleased with the performance of our one-year-old system. (For final figures, see the appendix of these proceedings.)
In examining the data, one interesting aspect is that MatchPlus does better when measured by 11-point averages than by number of relevant documents retrieved. This means a comparatively higher percentage of documents in early retrievals were judged relevant.4
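For reference, the 11-point figure is the standard interpolated average of precision at the recall levels 0.0, 0.1, ..., 1.0; the sketch below computes it for a single topic, with the relevance judgments assumed to be a simple set of document identifiers.

    # Sketch of 11-point interpolated average precision for one ranked list.
    # Interpolated precision at recall level r is the maximum precision
    # observed at any recall >= r.
    def eleven_point_average(ranked_docs, relevant):
        if not relevant:
            return 0.0
        hits, prec_at_recall = 0, []
        for i, doc in enumerate(ranked_docs, start=1):
            if doc in relevant:
                hits += 1
                prec_at_recall.append((hits / len(relevant), hits / i))
        points = []
        for level in [i / 10 for i in range(11)]:
            candidates = [p for r, p in prec_at_recall if r >= level]
            points.append(max(candidates) if candidates else 0.0)
        return sum(points) / 11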
We are now running initial experiments with word
sense disambiguation using context vectors and clus-
tering. It will be interesting to see whether word sense
disambiguation can further improve retrieval perfor-
mance.
4 This also raises the question as to whether comparatively fewer total documents were scored by the readers, perhaps due to the MatchPlus system's different approach. Computing the median number of documents scored per topic would resolve this question, as well as providing an indication of "novelty" for TREC participants.