SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Probabilistic Retrieval in the TIPSTER Collections: An Application of Staged Logistic Regression
chapter
W. Cooper
F. Gey
A. Chen
National Institute of Standards and Technology
Donna K. Harman
system must be designed for a collection for which no training data is available, and as a
practical matter none can be gathered. This objection has considerable force but it
overlooks the possibility of extrapolating the results of a regression analysis in one collection
for which training data exists into another for which it does not. As the experiment
developed, it cast some light on the question of whether such an extrapolation can be
effected without unacceptable loss of retrieval power.
A final objective of the SLR methodology is to produce estimates of relevance
probability that are reliable enough to present to the system users as part of the ranked
output they receive. Some IR research would appear to be premised on the notion that
the output ordering of the collection is all that matters -- that the only purpose of
generating retrieval status values (`similarity coefficients', `ranking scores', etc.) is to achieve as
effective a ranking as possible. We agree that imposing an effective order of presentation
on the documents is the most essential single role of the retrieval status values, but feel
that in addition the numeric scores are themselves a potentially important part of the
output. Their significance lies in their ability to tell the user, at each point in the search,
whether it is likely to be worthwhile to continue the search down
the ranking. Clearly, such scores will be most helpful if presented in a form that most
users find readily interpretable, and interpretable moreover in a sense that bears as
directly as possible on the decision of whether they should stop searching. Probability-
of-relevance estimates would appear to fit this prescription admirably.
For reasons that will become apparent, this final objective was not attained in the
present experiment. However, experiments in small collections indicate that SLR is
capable of producing well-calibrated probability estimates, and doing so remains one of
the general objectives of the methodology.
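To make the notion of a well-calibrated estimate concrete, the following minimal sketch groups probability estimates into equal-width bins and compares the mean estimate in each bin with the observed fraction of relevant documents. It is one common way of inspecting calibration, not necessarily the check used in the experiments mentioned above; the function name `calibration_table`, the bin count, and the sample values are all hypothetical.

```python
def calibration_table(predicted, relevant, n_bins=5):
    """Return (mean estimate, observed relevance rate, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, r in zip(predicted, relevant):
        # Equal-width bins over [0, 1]; an estimate of exactly 1.0 goes in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        bins[index].append((p, r))
    table = []
    for members in bins:
        if members:
            mean_p = sum(p for p, _ in members) / len(members)
            rate = sum(r for _, r in members) / len(members)
            table.append((mean_p, rate, len(members)))
    return table


# Made-up estimates and relevance judgments, for illustration only.
predicted = [0.9, 0.82, 0.85, 0.3, 0.22, 0.25, 0.35, 0.1]
relevant = [1, 1, 0, 0, 0, 1, 0, 0]

table = calibration_table(predicted, relevant)
for mean_p, rate, count in table:
    print(f"mean estimate {mean_p:.2f}  observed rate {rate:.2f}  n={count}")
```

Estimates are well calibrated to the extent that the two columns agree within each bin; with a sample this small the comparison is only illustrative.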
The SLR Methodology
The theoretical foundations of the SLR approach are presented in a recent paper by
Cooper, Dabney, & Gey (1992). A synthesis and extension of earlier approaches to
probabilistic retrieval, the SLR method combines the commonplace theoretical stratagem of
invoking statistical simplifying assumptions with the empirical technique of applying
statistical regression analysis to a learning sample. The use of statistical simplifying
assumptions in IR has been explored by Maron & Kuhns (1960), Robertson & Sparck
Jones (1976), Yu & Salton (1976), van Rijsbergen (1979) and others (surveyed by Maron
(1984), Bookstein (1985)). Examples of the use of regression analysis are to be found
in the work of Fox (1983), Fuhr (1989), and Fuhr & Buckley (1991).
A distinguishing characteristic of SLR is that it breaks the analysis of the retrieval
process down into two or more distinct steps or stages. For the present experiment a
simple two-stage procedure was adopted. In the first stage a learning sample was used to
develop a regression equation that combines elementary retrieval clues into composite
clues. In the second stage, the same empirical data was used to derive another regression
equation that combines these composite clues into an estimate of the
relevance probability for each query-document pair. Thus the evidence bearing on the
retrieval decision is organized first into sets of simple properties of particular descriptors,
and then into combinations of such sets as determined by the particular descriptors
common to the query and document under consideration.
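The two-stage organization described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the regression equations of the experiment: the elementary clues (two made-up numbers per matching term), the composite clues (the sum of per-term log odds and the number of matching terms), the toy learning sample, and all function names are hypothetical, and the logistic regressions are fitted by plain gradient ascent rather than by a statistical package.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_odds(weights, x):
    """Linear predictor: intercept plus weighted sum of the clues."""
    return weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))


def fit_logistic(rows, labels, steps=500):
    """Fit a logistic regression by batch gradient ascent on the log-likelihood."""
    n = len(rows)
    # Conservative step size, chosen so that the log-likelihood does not decrease.
    step = 4.0 * n / sum(1.0 + sum(xi * xi for xi in x) for x in rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - sigmoid(log_odds(weights, x))
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w + step * g / n for w, g in zip(weights, grad)]
    return weights


# Hypothetical learning sample: one entry per query-document pair, holding a
# relevance judgment and, for each term common to query and document, a tuple
# of two made-up elementary clues.
SAMPLE = [
    (1, [(2.0, 3.0), (1.0, 2.5)]),
    (1, [(1.0, 2.0), (2.0, 1.5), (1.0, 3.0)]),
    (0, [(1.0, 0.5)]),
    (0, [(1.0, 1.0), (1.0, 0.2)]),
    (1, [(2.0, 2.0)]),
    (0, [(1.0, 0.3)]),
]

# Stage 1: regress relevance on the elementary clues of each matching term.
term_rows = [clues for _, terms in SAMPLE for clues in terms]
term_labels = [rel for rel, terms in SAMPLE for _ in terms]
stage1 = fit_logistic(term_rows, term_labels)


def composite_clues(terms):
    """Summarize a pair by the sum of its per-term log odds and its match count."""
    return (sum(log_odds(stage1, t) for t in terms), float(len(terms)))


# Stage 2: regress relevance on the composite clues of each query-document pair.
pair_rows = [composite_clues(terms) for _, terms in SAMPLE]
pair_labels = [rel for rel, _ in SAMPLE]
stage2 = fit_logistic(pair_rows, pair_labels)


def relevance_probability(terms):
    """Estimated probability of relevance for one query-document pair."""
    return sigmoid(log_odds(stage2, composite_clues(terms)))


probabilities = [relevance_probability(terms) for _, terms in SAMPLE]
```

Both stages draw on the same learning sample, as in the procedure described above; ranking documents by `relevance_probability` then yields both an ordering and a numeric estimate for each pair.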