NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Okapi at TREC
chapter
S. Robertson
S. Walker
M. Hancock-Beaulieu
A. Gull
M. Lau
National Institute of Standards and Technology
Donna K. Harman
feedback in citym2).
In accordance with the general philosophy of
Okapi, the TREC experiments were used to test
the use of simple statistical techniques, with
minimal linguistic processing, minimal searcher
knowledge of techniques of searching, and indeed
minimal effort generally. The routing test was
intended to address mainly the value of relevance
feedback (term selection and weighting) in a
routing context where relevance judgements are
accumulated from earlier runs. Automatic ad-hoc
queries tested the weighting scheme without
relevance information. Manual ad-hoc queries
tested the combination of human intelligence with
a simple weighting scheme, with and without
feedback.
5.1 Automatic processing of topics
The basic principle was to take specific section(s)
of the topic and parse them in standard Okapi
fashion, as if they had been typed in verbatim by a
searcher. Thus stopwords were removed; a few
phrases and/or members of synonym classes were
identified; remaining words were stemmed; all
search terms (stems or phrases or synonym
classes) were weighted using IDF (see section
2.1). No special account was taken of the
negative phrases which appear in some of the
TREC topics, so that negated words would have
been given positive weights by Okapi.
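The pipeline just described (stopword removal, stemming, IDF weighting of the surviving stems) can be sketched as follows. This is a minimal illustration, not the actual Okapi code: the stoplist and the suffix-stripping stemmer here are toy stand-ins for Okapi's own, and the IDF formula shown is one common point-5 form assumed to resemble that of section 2.1.

```python
import math
import re

# Tiny illustrative stoplist; the real Okapi stoplist is much larger.
STOPWORDS = {"the", "of", "and", "in", "on", "a", "to", "for", "with"}

def stem(word):
    # Crude suffix-stripping stand-in for the real Okapi stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def idf_weight(n, N):
    # One common inverse-document-frequency form (assumed, cf. section 2.1):
    # n = number of documents containing the term, N = collection size.
    return math.log((N - n + 0.5) / (n + 0.5))

def parse_topic(text, doc_freq, num_docs):
    """Parse a topic section as if typed verbatim by a searcher:
    drop stopwords, stem the remaining words, weight each stem by IDF."""
    tokens = re.findall(r"[a-z]+", text.lower())
    terms = {stem(t) for t in tokens if t not in STOPWORDS}
    return {t: idf_weight(doc_freq.get(t, 0), num_docs) for t in terms}

# Hypothetical document frequencies for illustration only.
weights = parse_topic("funding of AIDS research",
                      doc_freq={"fund": 5000, "aid": 2000, "research": 40000},
                      num_docs=500000)
```

Note that, as the text says, nothing here handles negated phrases: a term from a negative phrase would receive a positive weight like any other.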
The selection of topic sections was the subject of a small amount of initial experimentation using the training set. The differences between sections were small and not very consistent, and more testing would have been useful. However, Concepts only was marginally the best overall, and that was what we used for the returned results.
The results for the above automatic analysis of
ad-hoc queries are given in the official tables as
cityal.
5.2 Routing queries
The principle on the routing queries was to
assume that all the known relevant documents
from the first document set were already available
for a relevance feedback process. Thus any actual
searches conducted on the first document set, and
their actual outputs, played no direct part in the
formulation of the routing queries, with one
exception discussed below. However, the terms
extracted from the topics took part in the
relevance feedback process in the manner
indicated in section 3.4, with a bias equivalent to
10 supposed relevant documents in all of which
the topic terms were supposed to occur (10 out of
10 bias).
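One plausible reading of the "10 out of 10" bias can be sketched as follows. The exact formula of section 3.4 is not reproduced in this chunk; the sketch assumes a Robertson/Sparck Jones relevance weight, and treats the bias as 10 supposed relevant documents added to the relevant-set size for every term, all 10 of which are supposed to contain the topic terms. All names and statistics below are hypothetical.

```python
import math

def rsj_weight(r, R, n, N):
    # Robertson/Sparck Jones relevance weight (point-5 form), assumed to be
    # the general shape of the section 3.4 weighting:
    #   r = relevant documents containing the term, R = relevant documents,
    #   n = documents containing the term, N = collection size.
    return math.log(((r + 0.5) * (N - n - R + r + 0.5)) /
                    ((n - r + 0.5) * (R - r + 0.5)))

def biased_weight(r, R, n, N, is_topic_term, bias=10):
    # "10 out of 10" bias (one interpretation): every term sees bias extra
    # supposed relevant documents; topic terms are supposed to occur in all
    # of them, so their r count rises by the same amount.
    R_total = R + bias
    r_total = r + (bias if is_topic_term else 0)
    return rsj_weight(r_total, R_total, n, N)

# Hypothetical statistics: same observed counts, with and without topic status.
w_topic = biased_weight(r=25, R=40, n=3000, N=700000, is_topic_term=True)
w_plain = biased_weight(r=25, R=40, n=3000, N=700000, is_topic_term=False)
```

Under this reading, a term from the topic is pushed up relative to a feedback-only term with identical observed counts, which matches the stated intent of biasing towards the topic terms.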
The exception to the above statement was that for
some topics, some additional relevance
assessments were made (that is, additional to
those provided centrally). These were based on
the top ranked documents retrieved in automatic
searches on the first document set. (See section
6.1 for a discussion on the local relevance
judgements and on the reasons for this decision.)
The results for the above analysis of routing
queries are given in the official tables as cityri.
5.3 Manual searching and feedback
The central idea behind these experiments was to
approach as closely as possible the situation of a
naive or inexperienced user. In other words, we
wanted to gain some idea of how the system
would perform if searched by an end-user with
little or no knowledge of information retrieval.
This intention reflects the design principles of the
interactive Okapi, as discussed in section 2 above.
To some degree, however, both the design of the
TREC experiment in general and the constraints
of the distributed system described in section 3.1
forced deviations from that ideal.
5.3.1 Searchers
The first constraint is of course that we had no
access to end-users (and more particularly, no
access to end-users with the specific
characteristics of the TREC analysts). We used a
panel of searchers, mainly information science
students who could be said to have some
knowledge of searching in general, limited
domain knowledge (depending on the topic), and
no particular knowledge of the system. (For
reasons to do with the very limited time available
for these searches, it was necessary to use project
staff for a few searches; these staff obviously had
more knowledge of the system.) The somewhat
limited interface to the BSS which was used for
this experiment required some training of the
searchers.