SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Okapi at TREC-2
chapter
S. Robertson
S. Walker
S. Jones
M. Hancock-Beaulieu
M. Gatford
National Institute of Standards and Technology
D. K. Harman
Table 1: Effect of varying query term sources (no query term frequency component)
Query     Ave    Ave    Prec   Prec   Prec    R-Prec  Recall  % of topics where
source    lgth   Prec   at 5   at 30  at 100                  AveP > median
TC        30.3   0.300  0.624  0.536  0.440   0.349   0.683   66
C         26.7   0.296  0.636  0.524  0.436   0.346   0.686   58
TCD       39.7   0.297  0.592  0.519  0.429   0.340   0.667   62
TCND      81.0   0.263  0.612  0.485  0.394   0.306   0.605   48
TCN       71.6   0.262  0.624  0.481  0.397   0.309   0.604   50
TCNDDef   86.3   0.257  0.580  0.468  0.387   0.303   0.604   46
TN        44.9   0.181  0.500  0.418  0.320   0.245   0.491   26
TND       54.4   0.179  0.492  0.403  0.317   0.243   0.491   24
TD        13.1   0.170  0.428  0.381  0.297   0.244   0.492   28
T          3.6   0.165  0.380  0.343  0.271   0.233   0.471   32
Terms: single. Document term weights: BM11. Database: disks 1 and 2. Topics: 101-150
Average query length is the mean number of terms per query, counting repeated terms
Table 2: Effect of varying query term sources (with query term frequency component)
Query     Weight                                              % of topics where
source    function  AveP   P5     P30    P100   RP     Rcl    AveP > median
TCND      BM11      0.360  0.652  0.569  0.479  0.401  0.754  92
TCN       BM11      0.356  0.644  0.565  0.482  0.399  0.749  92
TCNDDef   BM11      0.354  0.648  0.559  0.474  0.395  0.751  92
TCD       BM11      0.353  0.644  0.565  0.481  0.394  0.750  90
TC        BM11      0.335  0.636  0.560  0.468  0.375  0.723  86
TC        BM15      0.284  0.560  0.485  0.416  0.336  0.685  56
TND       BM11      0.283  0.556  0.503  0.414  0.338  0.652  60
TN        BM11      0.274  0.556  0.497  0.399  0.331  0.643  56
TC        BM1       0.232  0.504  0.435  0.361  0.289  0.601  28
Document term weights were multiplied by qtf, equivalent to large k3 in eqn 6
Terms: single. Database: disks 1 and 2. Topics: 101-150
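
The remark about large k3 can be checked numerically. The sketch below assumes that the query term frequency factor in eqn 6 has the saturating form (k3 + 1) * qtf / (k3 + qtf). That form is not reproduced on this page, so it and the helper names `qtf_factor` and `weighted_term` are assumptions made for illustration only. Under that assumption the factor tends to qtf as k3 grows, which matches multiplying the document term weight by qtf directly.

```python
def qtf_factor(qtf: float, k3: float) -> float:
    """Assumed query-term-frequency factor: (k3 + 1) * qtf / (k3 + qtf).

    This form is an assumption about eqn 6, not a quotation of it.
    """
    if qtf < 0 or k3 < 0:
        raise ValueError("qtf and k3 must be non-negative")
    if qtf == 0:
        # A term absent from the query contributes nothing
        # (this also avoids 0/0 when k3 == 0).
        return 0.0
    return (k3 + 1.0) * qtf / (k3 + qtf)


def weighted_term(doc_weight: float, qtf: float, k3: float) -> float:
    """Scale a document term weight by the assumed qtf factor."""
    return doc_weight * qtf_factor(qtf, k3)


if __name__ == "__main__":
    # As k3 increases, the factor for qtf = 3 moves from 1 towards 3.
    for k3 in (0.0, 1.0, 10.0, 1000.0, 1e9):
        print(k3, qtf_factor(3, k3))
```

With k3 = 0 the factor is 1 for any term that occurs in the query, so query term frequency is ignored; very large k3 makes the factor effectively linear in qtf.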
Table 3: Effect of different document term weighting functions: single terms and adjacent pairs
Weight                                                                    % of topics where
function  Terms                         AveP   P5     P30    P100   RP     Rcl    AveP > median
BM11      singles + "natural" pairs     0.307  0.628  0.541  0.448  0.358  0.696  62
BM11      singles + all adj pairs       0.304  0.612  0.544  0.447  0.357  0.694  62
BM11      singles                       0.300  0.624  0.536  0.440  0.349  0.683  66
BM15      singles                       0.227  0.500  0.434  0.351  0.285  0.595  38
BM1       singles                       0.199  0.468  0.416  0.326  0.261  0.542  22
BM0       singles                       0.142  0.412  0.336  0.270  0.209  0.411  12
"Natural" means adjacent in the same sentence of the topic with no intervening punctuation
Query term source: TC. qtf component: none. Database: disks 1 and 2. Topics: 101-150
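
The definition of "natural" pairs given in the note to Table 3 can be sketched in a few lines. This is a minimal illustration, not the Okapi implementation: the function name `natural_pairs`, the tokenisation rules, and the choice to treat every non-word character other than hyphen and apostrophe as punctuation are all assumptions, and no stemming or stopword handling is attempted.

```python
import re

# Assumption: any character that is not a word character, whitespace,
# apostrophe or hyphen counts as punctuation and breaks adjacency.
# Sentence-ending marks fall into this class, so sentences are split too.
_BREAK = re.compile(r"[^\w\s'-]+")

# Assumption: a word is a run of letters/digits, optionally joined by
# internal hyphens or apostrophes.
_WORD = re.compile(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*")


def natural_pairs(topic_text):
    """Return adjacent word pairs that have no punctuation between them."""
    pairs = []
    for segment in _BREAK.split(topic_text):
        words = [w.lower() for w in _WORD.findall(segment)]
        # zip pairs each word with its right-hand neighbour in the segment.
        pairs.extend(zip(words, words[1:]))
    return pairs


if __name__ == "__main__":
    print(natural_pairs("oil spills, cleanup costs"))
    # [('oil', 'spills'), ('cleanup', 'costs')]
```

The "all adj pairs" variant in Table 3 would presumably drop the punctuation test and pair every neighbouring term; that reading is inferred from the row label and is not spelled out on this page.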