NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Overview of the First Text REtrieval Conference (TREC-1)
Donna K. Harman
National Institute of Standards and Technology
2. The Task
2.1 Introduction
TREC is designed to encourage research in information retrieval using large data collections. Two types of
retrieval are being examined -- retrieval using an "adhoc" query such as a researcher might use in a library
environment, and retrieval using a "routing" query such as a profile to filter some incoming document stream.
The TREC task is not tied to any given application, and is not concerned with interfaces or optimized response
time for searching. However, it is helpful to have some potential user in mind when designing or testing a
retrieval system. The model for a user in TREC is a dedicated searcher, not a novice searcher, and the model
for the application is one needing monitoring of data streams for information on specific topics (routing), and the
ability to do adhoc searches on archived data for new topics. It should be assumed that the users need the
ability to do both high precision and high recall searches, and are willing to look at many documents and repeatedly
modify queries in order to get high recall. Obviously they would like a system that makes this as easy as
possible, but this ease should be reflected in TREC as added intelligence in the system rather than as special
interfaces.
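The routing mode described above amounts to a standing profile applied to each document as it arrives. A minimal sketch of the idea follows; the profile terms, the term-overlap score, and the threshold are illustrative assumptions, not the method of any actual TREC system:

```python
# Sketch of a "routing" query: a standing profile filters an incoming
# document stream. The overlap score and threshold are invented for
# illustration and are not how TREC systems actually matched profiles.

def route(profile_terms, doc_stream, threshold=2):
    """Yield documents sharing at least `threshold` terms with the profile."""
    profile = {t.lower() for t in profile_terms}
    for doc in doc_stream:
        terms = set(doc.lower().split())
        if len(profile & terms) >= threshold:
            yield doc

stream = [
    "new retrieval algorithms for large text collections",
    "quarterly earnings report for the airline industry",
    "evaluating text retrieval with recall and precision",
]
profile = ["text", "retrieval", "evaluation"]
matched = list(route(profile, stream, threshold=2))
# The first and third documents each share two profile terms and pass.
```

A real routing system would rank rather than simply accept or reject, but the essential shape -- a fixed query run against changing data -- is the same.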
Since TREC has been designed to evaluate system performance both in a routing (filtering or profiling) mode,
and in an adhoc mode, both functions need to be tested. The test design was based on traditional information
retrieval models, and evaluation used traditional recall and precision measures. The following diagram of the
test design shows the various components of TREC (fig. 1).
[Figure 1 here diagrams the TREC test design: training topics yield training queries (Q1) and routing queries (Q2); test topics yield ad-hoc queries (Q3); the queries are run against two 1-gigabyte document sets, the training documents (D1) and the test documents (D2).]
Figure 1. The TREC Task.
This diagram reflects the four data sets (2 sets of topics and 2 sets of documents) that were provided to participants. These data sets (along with a set of sample relevance judgments for the 50 training topics) were used to
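The traditional recall and precision measures used to evaluate the runs have simple set-based definitions: recall is the fraction of all relevant documents that were retrieved, and precision is the fraction of retrieved documents that are relevant. A small sketch, with invented document identifiers for illustration:

```python
# Set-based recall and precision, the traditional IR evaluation measures.
# Document identifiers below are invented examples, not TREC data.

def recall_precision(retrieved, relevant):
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

retrieved = ["d1", "d2", "d3", "d4"]          # what the system returned
relevant = ["d2", "d4", "d7", "d9", "d11"]    # judged relevant for the topic
r, p = recall_precision(retrieved, relevant)
# hits = 2, so recall = 2/5 = 0.4 and precision = 2/4 = 0.5
```

A high-recall search pushes the first ratio toward 1 at the cost of retrieving many documents; a high-precision search does the reverse, which is why TREC's model user needs both.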