NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Okapi at TREC-2
S. Robertson, S. Walker, S. Jones, M. Hancock-Beaulieu, M. Gatford
National Institute of Standards and Technology
D. K. Harman

Table 1: Effect of varying query term sources (no query term frequency component)

Query    Ave    Ave    Prec   Prec   Prec                    % of topics where
source   lgth   Prec   at 5   at 30  at 100  R-Prec  Recall  AveP > median
TC       30.3   0.300  0.624  0.536  0.440   0.349   0.683   66
C        26.7   0.296  0.636  0.524  0.436   0.346   0.686   58
TCD      39.7   0.297  0.592  0.519  0.429   0.340   0.667   62
TCND     81.0   0.263  0.612  0.485  0.394   0.306   0.605   48
TCN      71.6   0.262  0.624  0.481  0.397   0.309   0.604   50
TCNDDef  86.3   0.257  0.580  0.468  0.387   0.303   0.604   46
TN       44.9   0.181  0.500  0.418  0.320   0.245   0.491   26
TND      54.4   0.179  0.492  0.403  0.317   0.243   0.491   24
TD       13.1   0.170  0.428  0.381  0.297   0.244   0.492   28
T        3.6    0.165  0.380  0.343  0.271   0.233   0.471   32

Terms: single. Document term weights: BM11. Database: disks 1 and 2. Topics: 101-150.
Query average length is the average number of terms, taking account of repeats.

Table 2: Effect of varying query term sources (with query term frequency component)

Query    Weight                                              % of topics where
source   function  AveP   P5     P30    P100   RP     Rcl    AveP > median
TCND     BM11      0.360  0.652  0.569  0.479  0.401  0.754  92
TCN      BM11      0.356  0.644  0.565  0.482  0.399  0.749  92
TCNDDef  BM11      0.354  0.648  0.559  0.474  0.395  0.751  92
TCD      BM11      0.353  0.644  0.565  0.481  0.394  0.750  90
TC       BM11      0.335  0.636  0.560  0.468  0.375  0.723  86
TC       BM15      0.284  0.560  0.485  0.416  0.336  0.685  56
TND      BM11      0.283  0.556  0.503  0.414  0.338  0.652  60
TN       BM11      0.274  0.556  0.497  0.399  0.331  0.643  56
TC       BM1       0.232  0.504  0.435  0.361  0.289  0.601  28

Document term weights were multiplied by qtf, equivalent to large k3 in eqn 6.
Terms: single. Database: disks 1 and 2. Topics: 101-150.

Table 3: Effect of different document term weighting functions: single terms and adjacent pairs

Weight                                                                         % of topics where
function  Terms                      AveP   P5     P30    P100   RP     Rcl    AveP > median
BM11      singles + "natural" pairs  0.307  0.628  0.541  0.448  0.358  0.696  62
BM11      singles + all adj pairs    0.304  0.612  0.544  0.447  0.357  0.694  62
BM11      singles                    0.300  0.624  0.536  0.440  0.349  0.683  66
BM15      singles                    0.227  0.500  0.434  0.351  0.285  0.595  38
BM1       singles                    0.199  0.468  0.416  0.326  0.261  0.542  22
BM0       singles                    0.142  0.412  0.336  0.270  0.209  0.411  12

"Natural" means adjacent in the same sentence of the topic with no intervening punctuation.
Query term source: TC. qtf component: none. Database: disks 1 and 2. Topics: 101-150.
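
The distinction in Table 3 between "natural" pairs and all adjacent pairs can be illustrated with a minimal sketch. It is not the Okapi indexing code. The function name adjacent_pairs, the regular-expression tokenizer, the lower-casing, and the reading of "all adj pairs" as pairs formed regardless of punctuation are all assumptions made for illustration. Stemming and stopword handling are left out.

```python
import re

# Assumption: a word is a run of ASCII letters/digits; any other
# non-whitespace character is treated as punctuation.
_PIECE = re.compile(r"(?P<word>[A-Za-z0-9]+)|(?P<punct>[^\sA-Za-z0-9])")


def adjacent_pairs(text, natural=True):
    """Return adjacent word pairs from topic text, in order of occurrence.

    natural=True  : a pair is kept only if no punctuation (including a
                    sentence-ending mark) lies between its two words.
    natural=False : punctuation is ignored, so every pair of neighbouring
                    words is kept (one reading of "all adj pairs").
    """
    pairs = []
    prev = None  # previous word, or None when a break has intervened
    for match in _PIECE.finditer(text):
        if match.lastgroup == "word":
            word = match.group().lower()
            if prev is not None:
                pairs.append((prev, word))
            prev = word
        elif natural:
            # Punctuation breaks the chain of "natural" adjacency.
            prev = None
    return pairs


if __name__ == "__main__":
    topic = "oil spill, cleanup costs. New rules"
    print(adjacent_pairs(topic, natural=True))
    print(adjacent_pairs(topic, natural=False))
```

On the hypothetical topic text above, the natural setting drops the pairs that straddle the comma and the full stop, while the other setting keeps them. This is the only difference between the first two rows of Table 3 under the stated assumptions.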