Run                    R-prec   Total Rel   recall-prec
crnlV2                 0.3640   8018        0.3163
crnlV2-b               0.4053   8256        0.3512
crnlV2-b (no not's)    0.4061   8254        0.3560
crnlL2                 0.3641   8224        0.3258
crnlL2-b               0.3922   8379        0.3538
sentence restricted    0.3960   8252        0.3477

Table 2: Ad-hoc results

The local similarity is computed by matching the query against the paragraphs of the candidate document (III.a.1 from Table 3), with the query terms weighted 1 iff present and the document terms weighted using formula 1 above (that used by the query in the global similarity). We then use the global/local values in a series of retrieval runs using the same queries but against the entire TREC 1 document set (D12). We tried a range of values for the two weighting coefficients and used the best values for the official run, crnlL2. The formula used for crnlL2 is:

    sim = 100 * global + 16 * local

where "global" is the query/document similarity described above ("ltc-lnc"), and "local" is the top query/paragraph similarity.

It takes roughly 5 hours clock time to determine the suggested weighting coefficients, though multiple combinations of values can be evaluated simultaneously; in one case, we calculated each of the 48 possible local variables simultaneously. Each of the retrospective runs takes from 60 to 90 minutes, depending on its complexity. These runs take an unusually large amount of time (compared to crnlV2) since they require re-indexing a large number of documents from scratch. The basic procedure is to find the top 1750 documents for each query using the global similarity. Each of those documents is then re-indexed, breaking it down into its component parts (e.g., paragraphs), and each component part is compared against the query to obtain local similarities.
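As a rough illustration of this two-level matching, the Python sketch below takes precomputed global similarities and per-paragraph term-weight vectors, scores the pooled documents locally against a binary-weighted query, and re-ranks by the combined formula. The helper names (cosine, local_similarity, rerank), the use of cosine for the paragraph match, and the toy data are assumptions made for illustration, not details of the SMART implementation.

    import math

    def cosine(u, v):
        """Cosine similarity between two sparse term-weight vectors (dicts)."""
        dot = sum(w * v.get(t, 0.0) for t, w in u.items())
        norm_u = math.sqrt(sum(w * w for w in u.values()))
        norm_v = math.sqrt(sum(w * w for w in v.values()))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    def local_similarity(query_terms, paragraphs):
        """Local score: best match of the query (terms weighted 1 iff present)
        against any single paragraph of the candidate document."""
        qvec = {t: 1.0 for t in query_terms}
        return max((cosine(qvec, p) for p in paragraphs), default=0.0)

    def rerank(query_terms, global_sims, doc_paragraphs,
               alpha=100.0, beta=16.0, pool_size=1750):
        """Two-level matching: keep the top documents by global similarity,
        compute a paragraph-level local score for each, and re-rank by
        sim = alpha * global + beta * local."""
        pool = sorted(global_sims, key=global_sims.get, reverse=True)[:pool_size]
        combined = {
            doc_id: alpha * global_sims[doc_id]
                    + beta * local_similarity(query_terms, doc_paragraphs[doc_id])
            for doc_id in pool
        }
        return sorted(combined.items(), key=lambda kv: kv[1], reverse=True)

    # Toy usage with made-up document ids, term weights, and global similarities.
    query = ["oil", "spill", "alaska"]
    global_sims = {"doc1": 0.42, "doc2": 0.35}
    doc_paragraphs = {
        "doc1": [{"oil": 0.6, "spill": 0.5}, {"alaska": 0.9}],
        "doc2": [{"oil": 0.2, "market": 0.8}],
    }
    print(rerank(query, global_sims, doc_paragraphs))

In this sketch only the pooled documents need paragraph vectors, which mirrors why the local runs required re-indexing the top-ranked documents from scratch and were therefore much slower than crnlV2.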
Other Experiments

The Smart indexing procedures used in our experiments do not analyze the documents or queries for negative terms such as "not". A query which explicitly requests documents "not about the United Kingdom or Canada" will actually match any document containing those terms. Removing the negative keywords results in an insignificant improvement: 16 queries are helped and 16 are hurt, all in only a minor fashion. These results suggest that other terms in the query were more important for locating the relevant documents.

Earlier experiments with an on-line encyclopedia ([14, 16]) demonstrated that precision can be improved by discarding documents which fail a local context check (cf. [1], where such documents were merely given lower similarity measures). That approach on the TREC 2 queries and collection yields almost exactly the same performance as crnlV2-b (see "sentence restricted" in Table 2). [1] discusses probable reasons for the limited success of this method.

Analysis of Ad-hoc Results

The results of Table 2 suggest that there is little advantage to using local values in combination with global matches. From run crnlV2 to run crnlL2 there is negligible improvement, although crnlL2 does retrieve an additional 120-200 relevant documents. The retrospective runs using the Wall Street Journal sub-collection suggested there would be greater improvement between crnlV2 and crnlL2 than actually occurred. The most obvious problem is that the definition of a paragraph is sub-collection dependent. Our results were tailored to the WSJ sub-collection and probably did not apply well to the other sub-collections, where "paragraphs" might be extremely large.