SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Combining Evidence for Information Retrieval
chapter
N. Belkin
P. Kantor
C. Cool
R. Quatrain
National Institute of Standards and Technology
D. K. Harman
5. Conclusions
In general, we conclude that our initial research questions
with respect to query combination have been positively
answered. That is, if one has available several different
representations of a single information problem, then it
makes sense to use all of them, in combination, in order to
improve retrieval performance, rather than to try to identify
and use only the best one. In addition, it is reasonably clear
that progressive and continuous combination of query
formulations leads to continuing and progressive improvement
of performance. This may extend to progressive modification
of query formulations in the routing situation, for
instance, on the basis of each iteration of retrieval.
Nevertheless, some of our results appear anomalous, and in
particular we need to address more carefully the issue of
how best to combine query formulations.
As far as our data fusion questions are concerned, we have
clearly demonstrated that doing data fusion is better than
using only one query formulation. Although the performance
improvement in these experiments was rather small, for
operational settings in which there are multiple systems
with incompatible scores, a data fusion method that works
with the ranked outputs, rather than the scores, is
precisely the method that is needed. In the present study we
have shown how that method can be extended from the case of
binary (set) retrieval to the case of ranked lists. We have
shown that the results are, on average, better than the
results of the individual formulations. In some cases, they
are better than the best of the component formulations. This
lends support to a program of seeking optimal tunings for
fusion of any number of given systems, to achieve results
better than any of them alone could provide.
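To make the idea of fusing ranked outputs rather than scores concrete, the following is a minimal sketch under stated assumptions. It is not the fusion method used in these experiments: it simply sums each document's rank positions across lists, and a document missing from a list is given a rank one past the end of that list. The function name `fuse_by_rank` and the document ids are hypothetical.

```python
from collections import defaultdict


def fuse_by_rank(ranked_lists):
    """Fuse ranked lists of document ids using rank positions only.

    Illustrative assumption: the fused key of a document is the sum of its
    1-based ranks over all lists; a document absent from a list is treated
    as if it were ranked one position past the end of that list.
    """
    all_docs = set()
    for ranked in ranked_lists:
        all_docs.update(ranked)

    totals = defaultdict(int)
    for ranked in ranked_lists:
        position = {doc: i + 1 for i, doc in enumerate(ranked)}
        default_rank = len(ranked) + 1
        for doc in all_docs:
            totals[doc] += position.get(doc, default_rank)

    # A lower rank sum is better; ties are broken by document id.
    return sorted(all_docs, key=lambda doc: (totals[doc], doc))


# Three hypothetical systems whose scores are not comparable.
system_a = ["d1", "d2", "d3"]
system_b = ["d2", "d3", "d4"]
system_c = ["d2", "d1", "d4"]
print(fuse_by_rank([system_a, system_b, system_c]))
# ['d2', 'd1', 'd3', 'd4']
```

Because only rank positions enter the computation, the sketch applies unchanged to systems whose scores are on incompatible scales.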
Overall, we find strong support for adaptive weighting in
query combination. This is applicable both to routing, as
shown directly here, and to relevance feedback, which we
have simulated in our application to the ad hoc topics. We
also find strong support for enlarging the set of query
representations. This success raises many interesting
possibilities. For example, one might systematically explore
the k-way combinations to see how they compare to the
adaptive weighting scheme. Or, one might apply the notion of
adaptive weighting to the best of the k-way combinations.
The possibilities for combining these two concepts explode
(of course) combinatorially. We feel that the present
experiments point a way into the forest of possibilities.
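The combinatorial growth mentioned above can be sketched as follows. This is an illustration only: the function names are hypothetical, and `performance_weights` is an assumed stand-in for an adaptive weighting scheme (weights proportional to past performance), not the scheme used in these experiments.

```python
from itertools import combinations
from math import comb


def k_way_combinations(formulations, k):
    """Return every k-element subset of the query formulations."""
    return list(combinations(formulations, k))


def count_combinations(n):
    """Count all subsets of size 2..n of n formulations (2**n - n - 1)."""
    return sum(comb(n, k) for k in range(2, n + 1))


def performance_weights(past_performance):
    """Hypothetical adaptive weights: each formulation's share of the
    total past performance (for example, precision on earlier topics)."""
    total = sum(past_performance.values())
    if total == 0:
        # No evidence yet: fall back to equal weights.
        return {name: 1.0 / len(past_performance) for name in past_performance}
    return {name: value / total for name, value in past_performance.items()}


formulations = ["q1", "q2", "q3", "q4", "q5"]
print(len(k_way_combinations(formulations, 2)))  # 10
print(count_combinations(5))                     # 26
print(count_combinations(10))                    # 1013
```

Even with ten formulations there are over a thousand candidate combinations, before any weighting is applied to them, which is why exhaustive exploration quickly becomes impractical.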
6. Acknowledgments
We wish to thank the 75 searchers who so generously
donated their time and effort to this project. Without them
this research could not have been done. We also wish to
thank Audrey Gorman and Kathy Mrowka, who provided
invaluable assistance on the project in planning, data
gathering and input, and Dong Li, who helped with the data
analysis. We owe special thanks to Bruce Croft and Jamie
Callan, not only for permission to use the INQUERY system
for this investigation, but also for the unstinting support
they gave us in using it. This research was performed with
partial funding from a TREC support grant from ARPA.