NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Automatic Routing and Ad-hoc Retrieval Using SMART: TREC 2
C. Buckley, J. Allan, and G. Salton
National Institute of Standards and Technology
D. K. Harman (editor)
Run                    R-prec   Total rel.   Recall-prec
crnlV2                 3640     8018         3163
crnlV2-b               4053     8256         3512
crnlV2-b (no not's)    4061     8254         3560
crnlL2                 3641     8224         3258
crnlL2-b               3922     8379         3538
sentence restricted    3960     8252         3477

Table 2: Ad-hoc results
of the query against the paragraphs of the candidate document (III.a.1 from Table 3), with the query terms weighted 1 iff present, and the document terms weighted using formula 1 above (that used by the query in the global similarity).
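
As a concrete illustration, the following is a minimal sketch of that local query/paragraph similarity, assuming the paragraph term weights have already been computed with formula 1; the function name and data layout are our own and are not part of SMART.

    def local_similarity(query_terms, paragraph_weights):
        # Binary query weights (1 iff the term is present) dotted with the
        # paragraph's term weights (computed with formula 1 above).
        return sum(paragraph_weights.get(term, 0.0) for term in set(query_terms))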
We then used the global/local values in a series of retrieval runs with the same queries, but against the entire TREC 1 document set (D12). We tried a range of values for the two weighting coefficients and used the best values for the official run, crnlL2. The formula used for crnlL2 is

    sim = 100 · global + 16 · local,

where "global" is the query/document similarity described above ("ltc-lnc"), and "local" is the top query/paragraph similarity.
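
As a sketch, the crnlL2 score can be computed as follows, assuming a precomputed global ("ltc-lnc") similarity and the per-paragraph (local) similarities for one document; the function is ours, not part of SMART.

    def crnlL2_score(global_sim, paragraph_sims):
        # "local" is the top query/paragraph similarity; 100 and 16 are the
        # coefficients chosen for the official run.
        local = max(paragraph_sims) if paragraph_sims else 0.0
        return 100.0 * global_sim + 16.0 * local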
It takes roughly 5 hours of clock time to determine the suggested weighting coefficients, though multiple combinations of values could be weighted simultaneously; in one case, we calculated each of the 48 possible local variables simultaneously. Each of the retrospective runs takes from 60 to 90 minutes, depending on its complexity. These runs take an unusually large amount of time (compared to crnlV2) since they require re-indexing a large number of documents from scratch. The basic procedure is to find the top 1750 documents for each query using the global similarity. Each of those documents is then re-indexed, breaking it down into its component parts (e.g., paragraphs). Each component part is then compared against the query to obtain local similarities.
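
The procedure can be sketched as follows. This is a simplified illustration under assumptions that are not part of the original system: documents are raw strings, paragraphs are separated by blank lines, and plain term-overlap counts stand in for the ltc-lnc weights.

    import re

    def term_set(text):
        return set(re.findall(r"[a-z]+", text.lower()))

    def overlap(query, text):
        # Stand-in for the global/local similarity; SMART uses ltc-lnc weights.
        return len(term_set(query) & term_set(text))

    def local_global_run(query, docs, top_n=1750, alpha=100.0, beta=16.0):
        # 1. Rank the collection by the global similarity; keep the top documents.
        ranked = sorted(docs, key=lambda d: overlap(query, d), reverse=True)[:top_n]
        results = []
        for doc in ranked:
            # 2. Re-index each candidate into its component parts (paragraphs).
            paragraphs = [p for p in doc.split("\n\n") if p.strip()]
            # 3. Compare each part against the query; the best part is the local value.
            local = max((overlap(query, p) for p in paragraphs), default=0)
            results.append((doc, alpha * overlap(query, doc) + beta * local))
        results.sort(key=lambda pair: pair[1], reverse=True)
        return results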
Other Experiments
The SMART indexing procedures used in our experiments do not analyze the documents or queries for negative terms such as "not". A query which explicitly requests documents "not about the United Kingdom or Canada" will therefore match any document containing those terms. Removing the negative keywords results in an insignificant improvement: 16 queries are helped, 16 are hurt, all in only a minor fashion. These results suggest that other terms in the query were more important for locating the relevant documents.
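
For illustration only, one simple way to strip such negated keywords (our own assumption about the filtering, not a description of the SMART code) is to drop everything from an explicit "not" up to the next punctuation mark:

    import re

    def remove_negated_terms(query):
        # Drop phrases such as "not about the United Kingdom or Canada".
        return re.sub(r"\bnot\b[^,.;]*", "", query, flags=re.IGNORECASE).strip()

    # remove_negated_terms("oil exploration, not about the United Kingdom or Canada")
    # -> "oil exploration,"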
Earlier experiments with an on-line encyclopedia ([14, 16]) demonstrated that precision can be improved by discarding documents which fail a local context check (cf. [1], where such documents were merely given lower similarity measures). That approach on the TREC 2 queries and collection yields almost exactly the same performance as crnlV2-b (see "sentence restricted" in Table 2). [1] discusses probable reasons for the limited success of this method.
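
The two treatments can be sketched as follows, assuming a hypothetical passes_check predicate (e.g., some single sentence must contain enough query terms); neither the predicate nor the penalty value comes from the cited papers.

    def apply_local_check(ranked, passes_check, discard=True, penalty=0.5):
        # ranked: list of (document, similarity) pairs from the global match.
        rescored = []
        for doc, score in ranked:
            if passes_check(doc):
                rescored.append((doc, score))
            elif not discard:
                # cf. [1]: failing documents are merely given a lower similarity.
                rescored.append((doc, score * penalty))
        return sorted(rescored, key=lambda pair: pair[1], reverse=True)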
Analysis of Ad-hoc Results
The results of Table 2 suggest that there is little advantage to using local values in combination with global matches. From run crnlV2 to run crnlL2 there is negligible improvement, although crnlL2 does retrieve an additional 120-200 relevant documents.
The retrospective runs using the Wall Street Journal sub-collection suggested there would be greater improvement between crnlV2 and crnlL2 than actually occurred. The most obvious problem is that the definition of a paragraph is sub-collection dependent. Our results were tailored to the WSJ sub-collection and probably did not apply well to the other sub-collections, where "paragraphs" might be extremely large.