NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)

Overview of the First Text REtrieval Conference (TREC-1)

Donna K. Harman, National Institute of Standards and Technology

Figure 9 shows the comparison of automatic and manual query construction on a per-topic basis. It is interesting to note that, for the two systems shown, most topics show equal performance in terms of the percentage of relevant documents retrieved by 100 documents. Some topics, like topic 51, show much better manual performance, whereas other topics, like topic 69, show better automatic performance. This is a somewhat different result from earlier comparisons of Boolean systems (usually manual indexing and manual query construction) versus automatic systems such as the SMART system. In the Medlars study (Salton 1969) the manual (Boolean) systems seemed to do either very well or very poorly, whereas the automatic systems produced consistently "medium" results. The difference in the TREC task is likely that the topics are very long and complex, and sometimes are easy to express manually, but sometimes very difficult, whereas the automatic construction is hampered by the existence of difficult narratives. This is only a hypothesis and needs further investigation.

[Figure 9: bar chart "Adhoc Manual vs Automatic by Topic" — per-topic bars for topics 51-100, y-axis 0 to 1; runs fuhrp1 and cnqst2.]

Figure 9. Adhoc Results using Automatic and Manual Query Construction on a Per Topic Basis.

There were also some category B results for adhoc, and the best of these are shown in Figure 10, with results from the Cornell system run as a category B run to show some comparison. There is a wide spread in the curves here, with widely differing systems being shown. The "pircs2" results represent a very successful relevance feedback method, with "pircs1" being an automatic query construction using the same system (see Kwok, Papadopoulos & Kwan paper). The "nyuir1" results come from a system using natural language techniques (see Strzalkowski paper).
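The per-topic measure plotted in Figure 9 — the percentage of a topic's relevant documents retrieved within the top 100 ranks — can be sketched as follows. This is an illustrative reconstruction, not TREC evaluation code; the run and qrels structures (`manual_run`, `automatic_run`, `qrels`) and all document identifiers are hypothetical.

```python
def relevant_retrieved_at(ranked_docs, relevant_docs, cutoff=100):
    """Fraction of a topic's relevant documents found in the top `cutoff` ranks."""
    if not relevant_docs:
        return 0.0
    retrieved = set(ranked_docs[:cutoff])
    return len(retrieved & set(relevant_docs)) / len(relevant_docs)

# Hypothetical relevance judgments and ranked lists for a single topic,
# one list per query-construction method (manual vs. automatic).
qrels = {"51": {"d3", "d7", "d9", "d12"}}
manual_run = {"51": ["d3", "d7", "d1", "d9", "d12"]}
automatic_run = {"51": ["d1", "d3", "d2", "d7"]}

for topic in qrels:
    m = relevant_retrieved_at(manual_run[topic], qrels[topic])
    a = relevant_retrieved_at(automatic_run[topic], qrels[topic])
    print(topic, round(m, 2), round(a, 2))  # per-topic bars as in Figure 9
```

Plotting these two values side by side for each of topics 51-100 yields a chart of the kind shown in Figure 9.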