NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
A Boolean Approximation Method for Query Construction and Topic Assignment in TREC
chapter
P. Jacobs
G. Krupka
L. Rau
National Institute of Standards and Technology
Donna K. Harman
The problems with undergeneration (and the related problem of not doing
a very good job of ranking the documents) were due to the fact that our
system was designed for routing, while TREC used traditional retrieval
evaluation methods, along with a 200-document cutoff, effectively counting
recall on the harder topics much more heavily than overall recall. Our
approach can correct for this by using a more flexible statistical method to
expand the queries and by performing a more sophisticated ranking (the
document ranking as reported was implemented post hoc in one line of Unix
code).
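The one-line Unix implementation itself is not reproduced in the paper, but the kind of post-hoc ranking described can be sketched as follows, assuming a simple term-overlap score (this is an illustrative reconstruction, not the authors' actual code):

```python
# Hypothetical sketch: rank documents by how many query terms they match.
# The actual one-line Unix implementation is not shown in the paper, so
# the scoring function here is an assumption for illustration only.

def rank_documents(docs, query_terms):
    """Return doc ids sorted by descending count of matched query terms."""
    def score(text):
        words = set(text.lower().split())
        return sum(1 for t in query_terms if t in words)
    return sorted(docs, key=lambda d: score(docs[d]), reverse=True)

docs = {
    "d1": "poachers killed elephants for ivory and money",
    "d2": "wildlife report on habitat loss",
    "d3": "elephant poaching technique used snares for meat",
}
query = ["poaching", "technique", "meat", "money"]
print(rank_documents(docs, query))
```

A more sophisticated ranking of the kind proposed would weight terms statistically rather than counting raw matches, but the pipeline shape stays the same.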
More important than these correctable problems, there is a result here to
build on. Our experience has been that, for this sort of task, pattern
matching can be a close approximation to natural language processing, so it
might seem that advanced methods are much more critical for determining what
to put in the queries than for the detailed analysis of the texts. The
general framework of this approach means that, with the continued
development of advanced methods for natural-language-based corpus analysis,
substantial performance improvements can come within the context of almost
any current text retrieval system.
6 Evaluation Methods
One unusual characteristic of our method is that it assumes that each
relevance judgement the system makes is made independently of all other
texts, as in a routing task where the system processes each incoming message
in turn and assigns topics or actions for filing or routing that message.
This style certainly has advantages (it is simple, clear, and makes parallel
processing easy), and it reflects some real assumptions about the nature of
the task. However, although it seems to have done very well relative to
other systems, it is not consistent with the instructions for submitting
results in TREC, and it certainly does not lead to the best possible showing
on some of the results.
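The routing-style assumption can be sketched as follows: each document in the stream receives an independent yes/no judgment, rather than a position in a collection-wide ranking. The match predicate below is a stand-in, not the system's actual Boolean query logic:

```python
# Routing-style processing: each incoming document is judged on its own,
# independently of every other document in the stream. The predicate here
# is a hypothetical stand-in for the Boolean queries used by the system.

def matches_topic(text, required_terms):
    """Independent yes/no relevance judgment for a single document."""
    words = set(text.lower().split())
    return all(t in words for t in required_terms)

stream = [
    "poaching technique described for meat",
    "general wildlife news",
]
required = ["poaching", "technique"]
decisions = [matches_topic(doc, required) for doc in stream]
print(decisions)
```

Note that no document's decision depends on any other document, which is what makes the style simple and parallelizable, and also what puts it at odds with ranked-list evaluation.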
Topic 77, about poaching techniques, is one example of the different (naive,
perhaps) perspective toward evaluation that our system adopts. The query
specifies:
A relevant document will identify the type of wildlife being poached,
the poaching technique or method of killing the wildlife which is used
by the poacher, and the reason for the poaching (e.g. for a trophy,
meat, or money).
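Read as a Boolean constraint, the topic requires evidence of all three components at once, so a narrowed query conjoins them. The term lists below are illustrative assumptions, not the actual TREC query:

```python
# Illustrative narrowed Boolean query for Topic 77: require all three
# components (wildlife type, killing technique, reason for poaching).
# The term lists are hypothetical examples, not the authors' query.

WILDLIFE = {"elephant", "rhino", "tiger", "wildlife"}
TECHNIQUE = {"snare", "trap", "shoot", "poison"}
REASON = {"trophy", "meat", "money", "ivory"}

def relevant(text):
    """Conjunction: a document must hit every required component."""
    words = set(text.lower().split())
    return (bool(words & WILDLIFE)
            and bool(words & TECHNIQUE)
            and bool(words & REASON))

print(relevant("poachers shoot elephant for ivory"))
print(relevant("elephant poaching for money"))
```

The second document fails because it names the wildlife and the reason but not the technique, which is exactly the missing piece the text describes as the most common gap in the bootstrapping sample.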
This is a very specific query. Our test (bootstrapping) sample produced a good
number of hits, but most of them failed to include one of the required pieces of
information, usually the technique or method of killing. So, we narrowed the
query. The result is that, for this query, the system produced 9 total documents,
6 of which were judged relevant. This is high precision (.67), but it doesn't help
the overall results, since for this topic the precision at 200 documents is treated