NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Automatic Routing and Ad-hoc Retrieval Using SMART: TREC 2
C. Buckley, J. Allan, G. Salton
National Institute of Standards and Technology
D. K. Harman
X    number of single terms to add (possible values 0 to 500)
Y    number of phrases to add (0 to 100)
A    relative importance of original query (fixed at 8)
B    relative importance of average weight in relevant documents (4 to 48)
C    relative importance of average weight in non-relevant documents (0 to 16)
P    relative importance of phrases in final retrieval as compared to single terms (0, 0.5, or 1.0)

Table 4: Parameters of routing
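As a rough illustration of how the parameters in Table 4 fit together, the following Python sketch builds an expanded routing query from the original query and the judged documents. The function name, the dict-based term vectors, the space-based test for phrases, and the default values for B and C (chosen from inside the ranges in Table 4) are assumptions made for illustration only; this is not the SMART implementation.

    from collections import defaultdict

    def build_routing_query(orig_query, rel_docs, nonrel_docs,
                            X=300, Y=50, A=8, B=16, C=4, P=0.5):
        # Average term weight in the relevant and in the non-relevant documents.
        avg_rel, avg_nonrel = defaultdict(float), defaultdict(float)
        for vec in rel_docs:
            for term, w in vec.items():
                avg_rel[term] += w / max(len(rel_docs), 1)
        for vec in nonrel_docs:
            for term, w in vec.items():
                avg_nonrel[term] += w / max(len(nonrel_docs), 1)

        # Rocchio re-weighting: A * original + B * relevant average
        # - C * non-relevant average; non-positive weights are dropped.
        rocchio = {}
        for term in set(orig_query) | set(avg_rel):
            w = (A * orig_query.get(term, 0.0)
                 + B * avg_rel[term] - C * avg_nonrel[term])
            if w > 0:
                rocchio[term] = w

        # Expansion: keep the re-weighted original terms, then add the X
        # highest-weighted new single terms and the Y highest-weighted new
        # phrases (here a "phrase" is simply any term containing a space).
        new_terms = [t for t in rocchio if t not in orig_query]
        singles = sorted((t for t in new_terms if " " not in t),
                         key=rocchio.get, reverse=True)[:X]
        phrases = sorted((t for t in new_terms if " " in t),
                         key=rocchio.get, reverse=True)[:Y]

        final = {t: rocchio[t] for t in orig_query if t in rocchio}
        final.update((t, rocchio[t]) for t in singles)
        # P scales phrase weights relative to single terms at retrieval time.
        final.update((t, P * rocchio[t]) for t in phrases)
        return final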
Just re-weighting the query terms according to Rocchio's algorithm gives a 7% improvement. Adding a few terms (20 single terms + 10 phrases) gives a 17% improvement over the base case, and expanding by 350 terms (300 single terms + 50 phrases) results in a 38% improvement.
The official run crnlC1 is actually a bit disappointing. It yields only a 3% improvement over the crnlR1 run, which is not very significant considering the effort required. Few people are going to keep track of 158 test runs on a per-query basis. It may be practical to keep track of 4 or so main query variants, but then the improvement would probably be less than 3%. We are currently conducting experiments in this area.
An open question is the effectiveness of varying the feedback approach itself between queries. Preliminary experiments using Fuhr's RPI ([3]) weighting schemes in addition to the Rocchio variants show larger improvements. In general, RPI (and the other probabilistic models) performs noticeably better than Rocchio if there is very little query expansion, though quite a bit worse under massive expansion. We expect that the combination of RPI for those queries with little expansion and Rocchio for other queries will work well.
One benefit of the crnlC1 run not entirely reflected in the evaluation figures is that retrieval performance is more even. Potential mismatches between feedback method and query are far less likely. crnlC1 does reasonably well on all the queries (it is above the median system for every query when compared against the other systems).
Routing Implementation and Timing
The original routing queries are automatically indexed from the query text, and weighted using the "ltc" weighting scheme (equation (1)). Collection frequency information used for the idf factors is gathered from D12 documents only. Relevance information about potential query terms is gathered and stored on a per-query basis. For each query, statistics (including relevant and non-relevant frequency and total "ltc" weights) are kept about the 1000 most frequently occurring terms in the D12 relevant documents. For TREC 2, this is done by a batch run taking about 90 CPU minutes. In practice, this would be done incrementally as each document was compared to the query and judged. The statistics amounted to about 40,000 bytes per query.
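The sketch below shows roughly what this bookkeeping involves. Equation (1) is not reproduced in this excerpt; the sketch assumes the standard SMART "ltc" form (logarithmic tf times idf, cosine-normalized). The statistics record and all names are illustrative assumptions rather than the actual SMART data structures.

    import math

    def ltc_vector(term_freqs, doc_freq, num_docs):
        """Assumed SMART 'ltc' weights: (1 + log tf) * log(N / df), cosine-
        normalized.  doc_freq holds collection frequencies gathered from the
        D12 documents only."""
        raw = {t: (1.0 + math.log(tf)) * math.log(num_docs / doc_freq[t])
               for t, tf in term_freqs.items() if tf > 0 and doc_freq.get(t)}
        norm = math.sqrt(sum(w * w for w in raw.values())) or 1.0
        return {t: w / norm for t, w in raw.items()}

    def update_query_stats(stats, doc_vector, relevant):
        """Incrementally accumulate, per query, the relevant / non-relevant
        frequency and total 'ltc' weight of each candidate term as each judged
        document arrives; the 1000 most frequent terms in the relevant
        documents would later be kept from this record."""
        for term, w in doc_vector.items():
            rec = stats.setdefault(term, {"rel_freq": 0, "nonrel_freq": 0,
                                          "rel_weight": 0.0, "nonrel_weight": 0.0})
            if relevant:
                rec["rel_freq"] += 1
                rec["rel_weight"] += w
            else:
                rec["nonrel_freq"] += 1
                rec["nonrel_weight"] += w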
Using these statistics and the decided-upon parameters for the feedback process (A, B, etc.), actual construction of the final query takes about 0.5 seconds per query.
Retrieval times vary tremendously with the length of the query. We ran in batch mode, constructing an inverted file for the entire D3 document set ("lnc" document weights) and then comparing a query against that inverted file. Not only is this not what would be done in practice, but it is much less efficient than what would be done in practice, given our massive expansion of queries: for each query in crnlR1, well over half the entire inverted file was read! CPU time per query ranged from about 5 seconds (no expansion) to 65 seconds (expansion by 500 terms).
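A simple accumulator-style scorer over an inverted file, sketched below under the same dict-based assumptions as the earlier examples, makes the cost pattern clear: every query term pulls in its entire postings list, so a query expanded to 500 terms touches a large fraction of the file while a short unexpanded query touches very little of it.

    from collections import defaultdict

    def score_against_inverted_file(query, inverted_file):
        """query: term -> query weight; inverted_file: term -> list of
        (doc_id, doc_weight) postings, with 'lnc' document weights as in the
        text.  Returns documents ranked by inner-product similarity."""
        scores = defaultdict(float)
        for term, q_w in query.items():
            for doc_id, d_w in inverted_file.get(term, ()):
                scores[doc_id] += q_w * d_w
        return sorted(scores.items(), key=lambda item: item[1], reverse=True)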
Conclusion
No firm conclusions can be reached regarding the usefulness of combining local and global similarities in the TREC environment. In some limited circumstances minor improvements can be obtained, but in general we have not (yet!) been able to take advantage of the local information we know should be useful.