NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Overview of the Second Text REtrieval Conference (TREC-2)
D. K. Harman
National Institute of Standards and Technology
Figure 1. The TREC task. (Diagram: training topics (1-100) and test topics
(101-150); training documents (Disks 1 and 2) and test documents (Disk 3);
query sets Q1 and Q2.)
The test design was based on traditional information
retrieval models, and evaluation used traditional recall and
precision measures. The above diagram of the test design
shows the various components of TREC (fig. 1).
This diagram reflects the four data sets (2 sets of topics
and 2 sets of documents) that were provided to partici-
pants. These data sets (along with a set of sample rele-
vance judgments for the 100 training topics) were used to
construct three sets of queries. Q1 is the set of queries
(probably multiple sets) created to help in adjusting a sys-
tem to this task, to create better weighting algorithms, and
in general to train the system for testing. The results of
this research were used to create Q2, the routing queries
to be used against the test documents. Q3 is the set of
queries created from the test topics as adhoc queries for
searching against the training documents. The results
from searches using Q2 and Q3 were the official test
results sent to NIST.
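The pairings above can be summarized compactly in code. The following sketch
(Python is used purely for illustration; the dictionary layout and names are
ours, not part of any TREC specification) records which topics and documents
each query set pairs with and which runs produce the official results sent to
NIST.

# Minimal sketch of the TREC-2 test design described above (illustrative only).
# Each query set is mapped to the topics it is built from, the documents it is
# run against, and its role in the evaluation.
EXPERIMENT_DESIGN = {
    "Q1": ("training topics 1-100",
           "training documents (Disks 1 and 2)",
           "training / tuning runs"),
    "Q2": ("routing queries derived from the training topics",
           "test documents (Disk 3)",
           "official routing results"),
    "Q3": ("test topics, used as adhoc queries",
           "training documents (Disks 1 and 2)",
           "official adhoc results"),
}

OFFICIAL_RUNS = ("Q2", "Q3")  # only these results are submitted to NIST

for name, (topics, documents, purpose) in EXPERIMENT_DESIGN.items():
    status = "official" if name in OFFICIAL_RUNS else "training only"
    print(f"{name}: {topics} -> {documents} ({purpose}; {status})")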
2.1 Specific Task Guidelines
Because the TREC participants used a wide variety of
indexing/knowledge base building techniques, and a wide
variety of approaches to generate search queries, it was
important to establish clear guidelines for the evaluation
task. The guidelines deal with the methods of
indexing/knowledge base construction, and with the methods of
generating the queries from the supplied topics. In gen-
eral, they were constructed to reflect an actual operational
environment, and to allow as fair as possible a separation
among the diverse query construction approaches.
There were guidelines for constructing and manipulating
the system data structures. These structures were defined
to consist of the original documents, any new structures
built automatically from the documents (such as inverted
files, thesauri, conceptual networks, etc.), and any new
structures built manually from the documents (such as
thesauri, synonym lists, knowledge bases, rules, etc.).
The following guidelines were developed for the TREC
task.
1. System data structures should be built using the
initial training set (documents from disks 1 and 2,
training topics 1-100, and the relevance judg-
ments). They may be modified based on the test
documents from disk 3, but not based on the test
topics.
2. There are parts of the test collection, such as the
Wall Street Journal and the Ziff material, that con-
tain manually assigned controlled or uncontrolled
index terms. These fields are delimited by SGML
tags, as specified in the documentation files
included with the data. Since the primary focus is
on retrieval and routing of naturally occurring text,
these manually indexed terms should not be used.
3. Special care should be used in handling the rout-
ing task. In a true routing situation, a single docu-
ment would be indexed and compared against the
routing topics. Since the test documents are gen-
erally indexed as a complete set, routing should be
simulated by not using any information based on
the full set of test documents (such as weighting
based on the test collection, total frequency based
on the test collection, etc.) in the searching. It is
permissible to use training-set collection information,
however (see the sketch following this list).
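Guidelines 2 and 3 can be read together as constraints on how a single
incoming test document may be scored for routing. The sketch below is only an
illustration of one way to respect them (Python; the tag names, tokenization,
and tf-idf weighting are our assumptions rather than anything prescribed by
TREC): manually indexed SGML fields are stripped before indexing, and the only
collection statistics consulted are those gathered from the training collection
(Disks 1 and 2).

import math
import re
from collections import Counter

# Hypothetical tag names; the actual manually indexed fields are listed in the
# documentation files distributed with the data (guideline 2).
MANUAL_INDEX_TAGS = ("DESCRIPTORS", "GEOGRAPHIC")

def strip_manual_fields(sgml_text):
    """Drop manually assigned index-term fields delimited by SGML tags."""
    for tag in MANUAL_INDEX_TAGS:
        sgml_text = re.sub(rf"<{tag}>.*?</{tag}>", " ", sgml_text, flags=re.S)
    return sgml_text

def route_score(doc_sgml, query_terms, train_df, train_doc_count):
    """Score one test document in isolation (guideline 3): document frequencies
    come from the training collection, never from the set of test documents."""
    text = strip_manual_fields(doc_sgml).lower()
    tokens = Counter(re.findall(r"[a-z]+", text))
    score = 0.0
    for term, weight in query_terms.items():
        df = train_df.get(term, 0)
        if df == 0:
            continue                           # unseen in training data; contributes nothing
        idf = math.log(train_doc_count / df)   # training-set statistic only
        score += weight * tokens[term] * idf
    return score

Because each document is scored on its own against pre-built routing queries,
the same function could be applied to documents one at a time as they arrive,
which is the situation the guideline is meant to simulate.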
Additionally, there were guidelines for constructing the
queries from the provided topics. These guidelines were
considered of great importance for fair system compari-
son and were therefore carefully constructed. Three
generic categories were defined, based on the amount and
kind of manual intervention used.
1. AUTOMATIC (completely automatic initial query
construction)
adhoc queries -- The system will automatically
extract information from the topic to construct the
query. The query will then be submitted to the sys-
tem (with no manual modifications) and the results
from the system will be the results submitted to
NIST. There should be no manual intervention
that would affect the results.
routing queries -- The queries should be