NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Appendix A: TREC-2 Results Appendix
National Institute of Standards and Technology
D. K. Harman

APPENDIX A

This appendix contains tables of results for all the TREC-2 participants. The tables in Appendix A show various measures of the performance on the adhoc and routing tasks. The adhoc results come first, followed by the routing results, with the tables in the same order as the presentation order of the papers. The definitions of the evaluation measures are given in the Overview, section 4, and readers unfamiliar with these measures should read that section first. Care should be taken in comparing the tables across systems. These measures show performance only, with no measure of user or system effort.

Each table contains four major boxes of statistics and three graphs.

Box 1-- Summary Statistics

line 1-- unique run identifier, data subset, and query construction method used

    Data subset
        full (disks 1 and 2 for adhoc, disk 3 for routing)
        category B (the official subset of data, 1/4 of the data, using the Wall Street Journal articles for adhoc and the San Jose Mercury News articles for routing)

    Query construction method
        automatic
        manual
        feedback (frozen evaluation used)

line 2-- Number of topics included in averages.

line 3-- Total number of documents retrieved over all topics. Here, "retrieved" means having a rank less than 1001.

line 4-- Total number of relevant documents for all topics in the collection (whether retrieved or not).

line 5-- Total number of relevant retrieved documents for this run.

Box 2-- Recall Level Averages

lines 1-11-- The average over all topics of the precision at each of the 11 recall points given. Note that this is interpolated precision: e.g., for a particular topic, if the precision at 0.50 recall is greater than the precision at 0.40 recall, then the precision at 0.50 recall is used for both the 0.50 and 0.40 recall levels.

line 12-- The average precision as calculated in a non-interpolated manner (see section 4 of the Overview for details on this calculation).

Box 3-- Document Level Averages

lines 1-9-- The average recall and precision after the given number of documents have been retrieved.

line 10-- the R-precision. This is a new evaluation measure being tried that averages the precisions found for each topic at the document level of R, where R is the number of relevant documents for that topic.
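
To make the interpolation rule and the R-precision measure above concrete, the following is a minimal Python sketch of how these per-topic values could be computed for a single ranked result list. It is not the official TREC evaluation program; the function names and the list/set input representation are illustrative assumptions.

    def eleven_point_interpolated_precision(ranked_doc_ids, relevant_ids):
        """Interpolated precision at recall 0.0, 0.1, ..., 1.0 for one topic.

        ranked_doc_ids: document ids in rank order (rank 1 first).
        relevant_ids:   set of all relevant documents for the topic.
        """
        num_relevant = len(relevant_ids)
        # Exact (recall, precision) observed at each rank where a relevant
        # document is retrieved.
        points = []
        relevant_seen = 0
        for rank, doc_id in enumerate(ranked_doc_ids, start=1):
            if doc_id in relevant_ids:
                relevant_seen += 1
                points.append((relevant_seen / num_relevant, relevant_seen / rank))
        # Interpolation rule: precision at recall level r is the maximum
        # precision observed at any recall >= r, so a higher precision at a
        # higher recall level (e.g. 0.50) is carried back to lower levels
        # (e.g. 0.40), exactly as described for Box 2 above.
        levels = [i / 10 for i in range(11)]
        interpolated = []
        for level in levels:
            candidates = [p for r, p in points if r >= level]
            interpolated.append(max(candidates) if candidates else 0.0)
        return interpolated

    def r_precision(ranked_doc_ids, relevant_ids):
        """Precision after R documents retrieved, where R is the number of
        relevant documents for the topic (Box 3, line 10)."""
        r = len(relevant_ids)
        top_r = ranked_doc_ids[:r]
        return sum(1 for d in top_r if d in relevant_ids) / r

The values reported in the tables are these per-topic numbers averaged over all topics included in the run.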