New Routing Results

                    No Train.  Trained Exp 0   Trained Exp 20  Trained Exp 40  Trained Exp 80  Trained Exp 100
                                   (%)             (%)             (%)             (%)             (%)

Total number of documents over all 50 queries
  Retrieved:        50000      50000           50000           50000           50000           50000
  Relevant:         10489      10489           10489           10489           10489           10489
  Rel_ret:           6551       6961  + 6.0     7496  +14.0     7646  +17.0     7695  +17.0     7712  +18.0

Interpolated Recall - Precision Averages:
  0.00              0.7200     0.7480  + 4.0   0.8471  +18.0   0.8475  +18.0   0.8646  +20.0   0.8660  +20.0
  0.10              0.5124     0.5815  +13.0   0.6645  +30.0   0.6751  +32.0   0.6810  +33.0   0.6801  +33.0
  0.20              0.4431     0.5254  +19.0   0.5981  +35.0   0.6116  +38.0   0.6135  +38.0   0.6115  +38.0
  0.30              0.4016     0.4728  +18.0   0.5371  +34.0   0.5413  +35.0   0.5465  +36.0   0.5452  +36.0
  0.40              0.3486     0.4402  +26.0   0.4751  +36.0   0.4774  +37.0   0.4829  +39.0   0.4878  +40.0
  0.50              0.2970     0.3862  +30.0   0.4167  +40.0   0.4288  +44.0   0.4229  +42.0   0.4214  +42.0
  0.60              0.2382     0.3048  +28.0   0.3496  +47.0   0.3681  +55.0   0.3699  +55.0   0.3690  +55.0
  0.70              0.1945     0.2430  +25.0   0.2772  +43.0   0.2880  +48.0   0.2815  +45.0   0.2843  +46.0
  0.80              0.1284     0.1865  +45.0   0.1911  +49.0   0.1937  +51.0   0.1999  +56.0   0.2005  +56.0
  0.90              0.0740     0.0860  +16.0   0.1130  +53.0   0.1144  +55.0   0.1219  +65.0   0.1238  +67.0
  1.00              0.0119     0.0187  +57.0   0.0140  +18.0   0.0107  -10.0   0.0114  - 4.0   0.0171  +44.0

Average precision (non-interpolated) over all rel docs:
                    0.2905     0.3517  +21.0   0.3962  +36.0   0.4050  +39.0   0.4084  +41.0   0.4095  +41.0

Precision at:
  5 docs:           0.5600     0.5760  + 3.0   0.6960  +24.0   0.7160  +28.0   0.7320  +31.0   0.7280  +30.0
  10 docs:          0.5440     0.5820  + 7.0   0.6880  +26.0   0.6860  +26.0   0.6980  +28.0   0.7000  +29.0
  15 docs:          0.5173     0.5627  + 9.0   0.6573  +27.0   0.6707  +30.0   0.6800  +31.0   0.6813  +32.0
  20 docs:          0.4910     0.5510  +12.0   0.6470  +32.0   0.6540  +33.0   0.6630  +35.0   0.6610  +35.0
  30 docs:          0.4653     0.5313  +14.0   0.6147  +32.0   0.6173  +33.0   0.6240  +34.0   0.6267  +35.0
  100 docs:         0.3698     0.4396  +19.0   0.4824  +30.0   0.4930  +33.0   0.4974  +35.0   0.5002  +35.0
  200 docs:         0.3049     0.3562  +17.0   0.3887  +27.0   0.3945  +29.0   0.4002  +31.0   0.4004  +31.0
  500 docs:         0.2038     0.2241  +10.0   0.2452  +20.0   0.2490  +22.0   0.2500  +23.0   0.2498  +23.0
  1000 docs:        0.1310     0.1392  + 6.0   0.1499  +14.0   0.1529  +17.0   0.1539  +17.0   0.1542  +18.0

R-Precision (precision after R (= num_rel for a query) docs retrieved):
  Exact:            0.3346     0.3942  +18.0   0.4251  +27.0   0.4283  +28.0   0.4281  +28.0   0.4291  +28.0

Table 3: New Routing Results at Several Query Expansion Levels

ranking and selecting the n best. Moreover, these 'nonbreak' documents total only 5225, less than 1/3 of the 16114 relevants used, so training is very efficient. (There are actually 16400 relevants from Disks 1 & 2, but during processing a small percentage was lost.)

2) All edges and their weights on the query side of the network are defined by the activations deposited by the relevant documents; this means the original query plays no part in their definition.

3) Negative edge weights are set to small positive weights of 0.1.

For retrieval:

4) After ranking, several subdocuments of the same document ID may rank high, and we combine their largest three RSVs in the ratio 1:0.2:0.05 as the single reported RSV for the whole document. Previously we ignored the third, and the ratio for combining the largest two was different. We choose to stop at two or three subdocuments because noise from long documents may creep back.
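As an illustration only, the sketch below (in Python; this is not the PIRCS implementation, and all data structures and names are hypothetical) shows one way adjustments 3) and 4) could be realized: negative query-side edge weights clipped to a small positive constant, and the largest three subdocument RSVs of each document ID merged in the ratio 1:0.2:0.05.

    from collections import defaultdict

    CLIP_WEIGHT = 0.1                  # adjustment 3): value replacing negative edge weights
    COMBINE_RATIO = (1.0, 0.2, 0.05)   # adjustment 4): weights for the top three subdocument RSVs

    def clip_edge_weights(edge_weights):
        """Set negative query-side edge weights to a small positive value."""
        return {edge: (w if w > 0.0 else CLIP_WEIGHT) for edge, w in edge_weights.items()}

    def combine_subdocument_rsvs(ranked_subdocs):
        """Merge subdocument RSVs into a single RSV per document ID.

        ranked_subdocs: iterable of (doc_id, rsv) pairs, one per subdocument.
        Returns (doc_id, combined_rsv) pairs sorted by decreasing combined RSV.
        """
        per_doc = defaultdict(list)
        for doc_id, rsv in ranked_subdocs:
            per_doc[doc_id].append(rsv)

        combined = {}
        for doc_id, rsvs in per_doc.items():
            top = sorted(rsvs, reverse=True)[:len(COMBINE_RATIO)]   # keep at most three subdocuments
            combined[doc_id] = sum(r * s for r, s in zip(COMBINE_RATIO, top))

        return sorted(combined.items(), key=lambda item: item[1], reverse=True)

    # Example: DOC1 has three high-ranking subdocuments, DOC2 has one.
    # DOC1 combines as 1*4.0 + 0.2*2.0 + 0.05*1.0 = 4.45, so DOC1 outranks DOC2.
    ranking = [("DOC1", 4.0), ("DOC2", 3.5), ("DOC1", 2.0), ("DOC1", 1.0)]
    print(combine_subdocument_rsvs(ranking))

Keeping only the two or three largest subdocument RSVs, rather than summing over all of them, reflects the concern stated above that further subdocuments of long documents mainly reintroduce noise.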
Such tuning of parameters led to our latest routing results, shown in Table 3. We use the convention 'Trained Exp K' to denote query expansion level K, with K=0 meaning weight adaptation without adding new