SP500215 NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
An Information Retrieval Test-bed on the CM-5
B. Masand, C. Stanfill
National Institute of Standards and Technology, D. K. Harman

Table 2: Adhoc Queries

Method                                               Precision     Average
                                                     at 100 docs   precision

Case:
tmc8-adhoc-dcwp-idf-caps-wt                          .2002         .1276
tmc8-adhoc-dcwp-idf-lower-wt                         .1734         .1157

Document length and inverse-para-scaling:
tmc8-adhoc-dcwp-idf-lower-doc-length-wt              .3308         .1904
tmc8-adhoc-dcwp-idf-caps-ip-wt                       .3432         .1939
tmc8-adhoc-dcwp-idf-lower-ip-wt                      .3422         .2027
tmc9-adhoc-etwp-idf-caps-ip-wt                       .3144         .1736

Stemming:
tmc8-adhoc-dcwp-idf-lower-stem-wt                    .1670         .1152
tmc8-adhoc-dcwp-idf-lower-stem-ip-wt                 .3240         .1980

Proximity:
tmc8-adhoc-dcwp-idf-caps-doc-length-sent-prox-wt     .3436         .2012
tmc8-adhoc-dcwp-idf-lower-doc-length-sent-prox-wt    .3518         .2146
tmc8-adhoc-dcwp-idf-caps-ip-para-prox-wt             .2892         .1681
tmc8-adhoc-dcwp-idf-lower-ip-para-prox-wt            .3006         .1772
tmc8-adhoc-dcwp-idf-caps-ip-sent-prox-wt             .3476         .1988
tmc8-adhoc-dcwp-idf-lower-ip-sent-prox-wt            .3602         .2164

The query tmc8 (dcwp) consisted of words and phrases from the description and concept sections of the topic templates. Query tmc9 (etwp) used words and adjacent phrases from the entire topic. Bold-face acronyms emphasize particular experiments with case (caps and lower), sentence and paragraph level proximity (sent-prox and para-prox), document length scaling (doc-length), inverse weights based on paragraph position (ip) and weight thresholds (wt). The queries were not changed for the different experiments.

Idf weighted terms from the description and concept sections taken together (query tmc8) seem to do better than those derived from the entire topic (query tmc9).

C. Case

For the adhoc queries, we compared indexing with and without preserving case (similar treatment for the queries).
Except for the simplest experiment with weight thresholds, converting everything to lower case seems to yield comparable or better results than preserving case (the caps runs). A similar experiment for routing queries wasn't attempted because that would have required reformulating and reoptimizing the routing queries.

D. Stemming

Using the Porter Algorithm software for stemming from [16], we experimented with stemming at index time (and stemming the queries). Because the software we had works on lower case, we compared against the corresponding lower-case experiments, and found that stemming reduces performance. We are not sure yet why there is such a decrease.

E. Document length and Term position

Document length scaling was used to explore the effect of emphasizing shorter documents. We used a scaling that decreases linearly for longer documents, with a tail. An inverse weight based on the paragraph in which the term appears was also explored. Both the document length scaling and the inverse paragraph scaling increase performance significantly (in Table 2, precision at 100 documents for the lower-case runs rises from .1734 to .3308 and .3422 respectively) and seem to be comparable to each other.

F. Proximity

The postings for the inverted file allow use of term position. Experiments are underway to define proximity scoring methods that enhance weights for terms appearing close together (clusters of terms), and that can also be implemented efficiently within the current architecture. We have achieved good results with sentence level proximity measures based on a bonus score for the query terms that appear within the same sentence and within a certain distance of each other. The bonus is also proportional to the term weight itself. Experiments that used a bonus independent of term weight dramatically reduced performance (numbers not reported here), possibly due to noise introduced by clusters of unimportant terms. Similar experiments with paragraph level proximity yielded significantly poorer results as compared to sentence level proximity.
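Finally, combining either document length or inverse paragraph scaling with sentence level proximity improved results (for example, in Table 2 the lower-ip-sent-prox-wt run reaches .3602 precision at 100 documents, against .3422 for lower-ip-wt).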
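
The scoring ideas of sections E and F can be illustrated with a minimal sketch. It is not the implementation used on the CM-5: the text above does not give the actual formulas, so the shape of the length scaling (1.0 up to `short` words, falling linearly to a flat tail of `floor` at `long` words), the 1/paragraph form of the inverse paragraph weight, the `window` distance, the `bonus` factor, and all names (`Posting`, `length_scale`, `inverse_paragraph_weight`, `score`) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Posting:
    """One occurrence of a query term in a document (hypothetical layout)."""
    term: str
    paragraph: int  # 1-based paragraph number
    sentence: int   # sentence number within the document
    position: int   # word offset within the document


def length_scale(doc_len, short=200, long=2000, floor=0.5):
    """Linearly decreasing scale for longer documents, with a flat tail.

    The break points and the floor are assumed values, not the paper's.
    """
    if doc_len <= short:
        return 1.0
    if doc_len >= long:
        return floor
    frac = (doc_len - short) / (long - short)
    return 1.0 - frac * (1.0 - floor)


def inverse_paragraph_weight(paragraph):
    """Assumed form 1/paragraph: earlier paragraphs count more."""
    return 1.0 / paragraph


def score(postings, idf, doc_len, use_ip=False, use_length=False,
          window=5, bonus=0.5):
    """Idf-weighted score with optional scaling and a sentence proximity bonus."""
    hits = [p for p in postings if p.term in idf]

    # Base score: idf weight per occurrence, optionally scaled by paragraph.
    base = 0.0
    for p in hits:
        w = idf[p.term]
        if use_ip:
            w *= inverse_paragraph_weight(p.paragraph)
        base += w

    # Proximity bonus: distinct query terms in the same sentence and within
    # `window` words of each other. The bonus is proportional to the term
    # weights, so clusters of low-weight terms contribute little.
    prox = 0.0
    for a, b in combinations(hits, 2):
        if a.term == b.term or a.sentence != b.sentence:
            continue
        if abs(a.position - b.position) <= window:
            prox += bonus * (idf[a.term] + idf[b.term])

    total = base + prox
    if use_length:
        total *= length_scale(doc_len)
    return total


# Usage: a toy document with two query terms adjacent in the first sentence.
idf = {"oil": 2.0, "spill": 3.0}
postings = [
    Posting("oil", paragraph=1, sentence=0, position=3),
    Posting("spill", paragraph=1, sentence=0, position=4),
    Posting("oil", paragraph=2, sentence=5, position=60),
]
ip_score = score(postings, idf, doc_len=1100, use_ip=True)
length_score = score(postings, idf, doc_len=1100, use_length=True)
```

Making the bonus a multiple of the term weights mirrors the observation in section F that a weight-independent bonus hurt performance; a paragraph level variant would compare `paragraph` instead of `sentence` in the proximity test.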