NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
CLARIT TREC Design, Experiments, and Results
D. Evans
R. Lefferts
G. Grefenstette
S. Handerson
W. Hersh
A. Archbold
National Institute of Standards and Technology
Donna K. Harman
extracted from a partition of the database. For the ad-hoc queries, the partition used for the
statistics was the same as the partition actually being queried. For the routing queries, however,
the final query vector was fixed before processing the new text (i.e., the second set of TREC
documents). In particular, in this case, the partition used to weight the routing-query vector
was extracted from the training corpus (the first set of TREC documents); this vector was then
queried against a partition extracted from the new, test corpus.
The NPs and their contained words among the documents in each partition were scored
for distribution and frequency; each NP/term- and word-type was given an IDF-TF score. As
noted above, for routing queries, the IDF-TF score was based on statistics from the original
partition of 2000 documents from the training corpus; it was a static query vector. For the
ad-hoc queries, on the other hand, the final partition of 2000 documents was used as the source
of statistics for the IDF-TF scoring. Therefore, the scores for terms in the query vector for
the ad-hoc queries could vary depending on the set of documents selected in the partitioning
process. Figure 21 gives a sample of a final query.
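The paper does not spell out the exact IDF-TF formulation; a standard TF x IDF weighting computed over a 2000-document partition might look like the following sketch (function and variable names are illustrative, not the CLARIT implementation):

```python
import math
from collections import Counter

def idf_tf_scores(partition):
    """Score each term in each document of a partition by TF * IDF.

    `partition` is a list of documents, each a list of term strings
    (NPs or their contained words). Returns one dict per document
    mapping term -> IDF-TF score. (Illustrative formulation only;
    the original CLARIT weighting may differ in detail.)
    """
    n_docs = len(partition)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in partition:
        df.update(set(doc))
    scored = []
    for doc in partition:
        tf = Counter(doc)
        scored.append({
            term: count * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scored
```

Terms concentrated in few documents of the partition thus receive higher weights than terms spread across many documents.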
The terms in each topic's routing/partitioning thesaurus were given IDF-TF scores based
on the sample; original-query terms were added and the factors of those terms ("1", "2", or
"3") were used to multiply their IDF-TF-based scores; the combined terms and their contained
words thus formed an extended-query vector (the final query vector).
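Merging the sampled thesaurus terms with the factor-weighted original-query terms might be sketched as follows (the merge rule and the default weight for an original term unseen in the sample are assumptions, not the paper's stated procedure):

```python
def build_extended_query(thesaurus_scores, original_terms):
    """Build the final (extended) query vector.

    `thesaurus_scores`: dict term -> IDF-TF score from the sample.
    `original_terms`: dict term -> factor (1, 2, or 3) assigned during
    manual review; the factor multiplies the term's IDF-TF-based score.
    Illustrative sketch only.
    """
    query = dict(thesaurus_scores)
    for term, factor in original_terms.items():
        # Assumed default weight of 1.0 when an original-query term
        # has no IDF-TF score from the sample.
        base = query.get(term, 1.0)
        query[term] = base * factor
    return query
```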
The 2000 documents for each topic were modeled in vector space (in which all terms and
their contained words formed the dimensions) and the final query vector was used to identify
and rank the 200 `best' documents, which constituted our results.
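This ranking step can be illustrated with a cosine-similarity sketch; the paper says only that a vector-space similarity measure was used, so cosine is an assumption here:

```python
import math

def rank_documents(query_vec, doc_vecs, top_k=200):
    """Rank documents by cosine similarity to the query vector.

    `query_vec` and each element of `doc_vecs` are dicts mapping
    term -> weight. Returns the indices of the `top_k` best
    documents, highest similarity first. (Cosine similarity is an
    assumption; any vector-space measure could be substituted.)
    """
    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    sims = [(cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)]
    sims.sort(key=lambda x: (-x[0], x[1]))  # best first; stable on ties
    return [i for _, i in sims[:top_k]]
```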
4.8 Summary of the Process
Figures 22 and 23 summarize the CLARIT-TREC processes described in detail in the preceding sections. As noted previously, there were only two steps in the CLARIT-TREC process
that required non-automatic processing: (1) initial review and weighting of the index terms automatically nominated and derived for the topic and (2) in the case of ad-hoc queries, review of first-pass retrieved documents to identify 5-10 relevant ones for use in creating a pseudo-thesaurus for further processing.
5 Results and Evaluation
This section presents the CLARIT-TREC results in several forms, including broad overviews
of the performance, the "official" results tables, and tables of data that focus on statistics
that are especially relevant to the CLARIT-TREC approach. Results are presented with only
abbreviated explanations.16
As noted previously, the CLARIT team submitted both intermediate results ("A") and final
results ("B"). The intermediate results were generated by taking the highest-scoring 200 (out of
2000) documents as determined by the routing/partitioning process. Since the strategy of rout-
ing/partitioning was to nominate a moderately large candidate subset of documents in which
all the true relevants would be found and since the procedure and scoring were designed to overgenerate candidates, we expected to have many `false positives' in each set of 2000. We had no
reason to expect the relative ranking of these documents by their evoking routing/partitioning
scores would be a good measure of fit to the source topic. By contrast, we expected the final
steps (which utilize subset-specific term scoring and vector-space similarity measures) to induce
a relative ranking of documents that would represent a good fit to the source topic.
16 More detailed analysis of the results is given in [Evans et al., in preparation].