NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Description of the PRC CEO Algorithm for TREC-2
P. Thompson
National Institute of Standards and Technology
D. K. Harman
Bayesian formulation of the CEO problem
In the Bayesian formulation of the CEO
problem (Lindley 1983), a decision maker is
interested in some parameter or event for
which he/she has a prior, or initial,
distribution or probability. The decision
maker revises this distribution upon
consulting several experts, each with his/her
own distribution or probability for the
parameter or event. To effect this revision,
the decision maker must assess the relative
expertise of the experts and their
interdependence, both with each other and
with the decision maker. The experts'
distributions are treated as data by the
decision maker and are used to update the
prior distribution.
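The revision step can be sketched under strong simplifying assumptions. The function below is illustrative only, not the PRC implementation: it treats each expert's reported mean as an independent observation of the parameter and combines normal distributions by precision weighting, ignoring the interdependence and relative expertise that the full CEO treatment models.

```python
def combine_expert_opinions(prior_mean, prior_var,
                            expert_means, expert_vars):
    """Precision-weighted Bayesian update of a normal prior.

    Each expert's reported mean is treated as an independent
    observation of the parameter with that expert's variance.
    (A simplification: Lindley's treatment also models the
    experts' interdependence and relative expertise.)
    """
    precision = 1.0 / prior_var
    weighted_sum = prior_mean / prior_var
    for mean, var in zip(expert_means, expert_vars):
        precision += 1.0 / var
        weighted_sum += mean / var
    posterior_var = 1.0 / precision
    posterior_mean = weighted_sum * posterior_var
    return posterior_mean, posterior_var
```

For example, a vague prior (mean 0, variance 1) combined with a single equally uncertain expert reporting mean 1 yields a posterior mean of 0.5 with variance 0.5.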
For automatic document retrieval, the
retrieval system is the decision maker, and
different retrieval algorithms, or models, are
the experts (Thompson 1990a,b, 1991). This
is referred to as the upper level CEO. At
the lower level the probabilities of individual
features, e.g., terms, within a particular
retrieval model can be combined using CEO.
In lower level CEO the retrieval model is
the decision maker and the term probabilities
are viewed as lower level experts. The
probability distributions supplied by these
lower level experts can be updated,
according to Bayes' theorem, by user
relevance judgments for retrieved documents.
These same relevance judgments also give
the system a way to evaluate the
performance of each model, both in the
context of a single search of several
iterations and over all searches to date.
These results can be used in a statistically
sound way to weight the contributions of the
models in the combined probability
distribution used to rank the retrieved
documents. Since various algorithms, such
as p-norm, are expressed in terms of
correlations rather than probability
distributions, it was necessary to extend the
CEO algorithm to handle correlations. So
far this extension has been handled in a
heuristic fashion. If a retrieval method, e.g.,
one of the cosine methods, returned a value
between 0 and 1 as a retrieval status value,
the logistic transformation of this weight was
interpreted as an estimate of the mean of a
logistically transformed beta distribution
which was provided as evidence to the
decision maker. Since there was no basis
with which to assign a standard deviation to
this distribution, as called for by the CEO
methodology, an assumption was made that
all standard deviations were 0.4045, a value
corresponding to a standard deviation of 0.1
in terms of probabilities. The CEO code
was written in C++ and compiled with g++.
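As a rough sketch of the transformation just described (the function names are ours; the PRC code itself is not reproduced here), a retrieval status value in (0, 1) can be mapped to a logit-scale mean paired with the fixed standard deviation 0.4045:

```python
import math

# Fixed sd assumed for all such experts; roughly a sd of 0.1
# on the probability scale near p = 0.5.
LOGIT_SD = 0.4045

def logit(p):
    """Logistic (log-odds) transformation of a probability."""
    return math.log(p / (1.0 - p))

def rsv_to_evidence(rsv, sd=LOGIT_SD):
    """Interpret a retrieval status value in (0, 1) as the mean
    of a logistically transformed beta distribution, paired with
    the assumed fixed standard deviation."""
    return logit(rsv), sd
```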
For TREC-1 we used the CEO algorithm to
combine all of the VPI&SU retrieval
methods except for the Boolean, i.e.,
weighted and unweighted cosine and inner
product measures as well as p-norm
measures of 1.0, 1.5, and 2.0. For measures
that did not give a retrieval status value in
the 0 to 1 range, such as the inner product
and some of the p-norm measures, the result
was mapped to this interval by scaling the
mapped to this interval by scaling the
highest score of the method in question for
a given topic to the highest score given by
one of the cosine measures. Default scores
halfway between 0 and the lowest score
achieved by a particular method were used
for documents not retrieved in the top 200 in
response to a given topic, since the actual
score of these documents was unknown. For
TREC-2 we followed the same approach
except that only the results of methods with
better TREC-1 performance were combined.
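The rescaling and default-score rules described above can be sketched as follows (hypothetical helper names; a simplification of the actual procedure):

```python
def rescale_to_reference(scores, reference_max):
    """Map a method's scores for a topic into the 0 to 1 range
    by scaling its highest score to the highest score given by
    a reference (e.g., cosine) measure for the same topic."""
    factor = reference_max / max(scores.values())
    return {doc: s * factor for doc, s in scores.items()}

def default_score(scores):
    """Score assumed for documents not retrieved in a method's
    top 200: halfway between 0 and its lowest observed score."""
    return min(scores.values()) / 2.0
```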
Our first version used Cosine.atn and
Cosine.nnn, the two best VPI&SU methods
from TREC-1, weighted by their
performance on TREC-1. The second
version used these two methods and the next
best three, Inner.atn, Inner.nnn, and Pnorm
1.0, also weighted by their TREC-1
performance (see VPI&SU report for details
on these methods). Figures 1 and 2 show