SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
TREC-2 Routing and Ad-Hoc Retrieval Evaluation using the INQUERY System
W. Croft
J. Callan
J. Broglio
National Institute of Standards and Technology
D. K. Harman
4 The TREC Experiments
Four experiments were submitted to the TREC evaluation, two "ad-hoc" and two "routing".
In these experiments, we emphasized automatic query processing and automatic feedback
algorithms for routing. The following is a summary:
• Ad-hoc: topics 101-150 against TIPSTER volumes 1 and 2.
INQ001: Created automatically from TIPSTER topics. Contains phrases. Details of
the query processing used are described below.
INQ002: INQ001 queries, modified manually. Modifications were restricted to
eliminating words and phrases, and to adding paragraph-level operators around
existing words and phrases. This was done somewhat differently than at last
year's TREC conference, as discussed below.
• Routing: topics 51-100 against TIPSTER volume 3.
INQ003: Created automatically from TIPSTER topics and relevance judgements
from Volumes 1 and 2. Baseline queries (from a previous TIPSTER evaluation)
were modified by reweighting and adding single-word terms. The term weighting
and selection function used was df.idf, as described in [5]. Only the top 120
relevant documents found by INQUERY were used for feedback, and 30 terms were
added to each query.
INQ004: Formed by combining (using the #SUM operator) INQ001 queries and
INQRYP queries (used in the TIPSTER 18-month evaluation). The INQRYP queries
were produced automatically and then modified manually. Modifications were
restricted to eliminating words and phrases, and adding paragraph-level
operators around existing words and phrases.
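The routing feedback step described above (score candidate terms from known relevant documents, keep the top-ranked terms) can be sketched as follows. The exact df.idf formula from [5] is not reproduced here, so the scoring function and all names below are illustrative assumptions, not the INQUERY implementation:

```python
import math
from collections import Counter

def select_feedback_terms(relevant_docs, total_docs, collection_df, k=30):
    """Pick the top-k expansion terms from known relevant documents.

    Terms are scored with a df.idf-style weight (an assumption, not the
    exact function from [5]): df is the number of relevant documents
    containing the term, idf is log(N / collection document frequency).
    """
    df = Counter()
    for doc in relevant_docs:
        df.update(set(doc.split()))  # count each term once per document
    scores = {
        t: df[t] * math.log(total_docs / collection_df.get(t, 1))
        for t in df
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Toy example: two "relevant" documents and made-up collection statistics.
docs = ["oil spill cleanup crew", "oil spill tanker accident"]
cdf = {"oil": 50, "spill": 40, "cleanup": 10, "crew": 200,
       "tanker": 30, "accident": 120}
terms = select_feedback_terms(docs, total_docs=1000, collection_df=cdf, k=3)
print(terms)
```

In the actual experiment the analogue of `relevant_docs` was the top 120 relevant documents found by INQUERY, and k was 30.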
Query Type                      Average Precision
              5 Docs         30 Docs        100 Docs       11-Pt Avg
INQ001        .62            .57            .49            .36
INQ002        .60 (-2.6%)    .59 (+3.5%)    .51 (+4.1%)    .36 (0%)

Table 1: Results for ad-hoc queries
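The columns of Table 1 report precision after 5, 30, and 100 retrieved documents, and the 11-point interpolated average precision. A minimal sketch of how these standard measures are computed from a ranked list of relevance judgements (function names are ours, not INQUERY's):

```python
def precision_at(ranked_rel, k):
    """Precision after k retrieved documents: the fraction of the
    top k that are relevant. ranked_rel is a list of 0/1 flags in
    rank order."""
    return sum(ranked_rel[:k]) / k

def eleven_point_avg(ranked_rel, num_relevant):
    """11-point average precision: interpolated precision at recall
    levels 0.0, 0.1, ..., 1.0, averaged. Interpolated precision at
    recall r is the maximum precision at any recall >= r."""
    hits, points = 0, []
    for i, rel in enumerate(ranked_rel, start=1):
        hits += rel
        points.append((hits / num_relevant, hits / i))  # (recall, precision)
    interp = []
    for r in [x / 10 for x in range(11)]:
        ps = [p for rec, p in points if rec >= r]
        interp.append(max(ps) if ps else 0.0)
    return sum(interp) / 11

# Toy ranked list: relevant at ranks 1, 2, and 4; 3 relevant total.
ranked = [1, 1, 0, 1, 0]
p5 = precision_at(ranked, 5)
avg = eleven_point_avg(ranked, num_relevant=3)
```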
Table 1 gives the results for the ad-hoc queries. These show that there is little difference
in effectiveness between the automatically processed queries and the semi-automatically
processed queries. The query processing for the automatic queries was significantly
improved, as described in the previous section, but there is another effect. Compared
to the manual query run in the last TREC conference, paragraph-level concepts were
formed in a much more mechanistic way and were constrained to the language of the
description and the narrative. In the previous conference, the only constraint was the
vocabulary used in the queries, and the user's "world knowledge" was used to group concepts.