7. Overall Results

Analysis of the results as returned by the automatic TREC scoring program shows that the CRL entry performed much better than might be expected, given the severe problems encountered in actually running the tests. Even though the system was not able to complete a clean index of the AP textbase in time, even though the merging process was severely flawed, and even though no relevance feedback was employed, the system was able to perform at a creditable level.

Many of the last-minute problems can be backed out of the results by examining the results for only the WSJ and Ziff textbases. The other primary databases are less interesting because:

1) The FR texts were significantly longer and thus caused severe problems in merging results.

2) The index of the AP textbase was not completed by the revised program in time.

3) Excluding the DOE textbase had little effect on the results (only three queries among the first 50 had relevant documents in the DOE textbase).

When these exclusions are made and a composite recall-precision plot is constructed, it can be seen that, on average, our system provides a maximum precision of about 60%, a minimum precision of about 40%, and a maximum recall of about 40%. Unfortunately, there is wild variability across queries, so that in a strong sense no single average gives a valid picture of the system's performance.
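A composite recall-precision curve of this kind is conventionally built by interpolating each query's precision at a set of standard recall levels and then averaging across queries. The following Python sketch illustrates only that averaging step, under assumed inputs; it is not the TREC scoring program, and the function names and toy data are hypothetical.

    # Minimal sketch of building a composite (averaged) recall-precision curve.
    # Assumed input: for each query, a list of (recall, precision) points observed
    # while scanning the ranked list. Names and data are hypothetical and are not
    # taken from the TREC scoring program.

    STANDARD_RECALL_LEVELS = [i / 10.0 for i in range(11)]  # 0.0, 0.1, ..., 1.0

    def interpolated_precision(points, recall_level):
        """Interpolated precision: the best precision achieved at or beyond
        the given recall level (0.0 if that recall level was never reached)."""
        return max((p for r, p in points if r >= recall_level), default=0.0)

    def composite_curve(per_query_points):
        """Average interpolated precision over all queries at each standard
        recall level, yielding one composite recall-precision curve."""
        curve = []
        for level in STANDARD_RECALL_LEVELS:
            precisions = [interpolated_precision(pts, level) for pts in per_query_points]
            curve.append((level, sum(precisions) / len(precisions)))
        return curve

    if __name__ == "__main__":
        # Two toy queries with very different behaviour, echoing the wide
        # per-query variability noted above.
        queries = [
            [(0.1, 0.60), (0.2, 0.55), (0.4, 0.45)],   # reaches 40% recall
            [(0.1, 0.30), (0.2, 0.20)],                 # weaker query
        ]
        for recall, precision in composite_curve(queries):
            print("recall %.1f  avg precision %.2f" % (recall, precision))

Averaging interpolated precision at fixed recall levels is what allows queries with very different numbers of relevant documents to be combined into a single plot, but, as noted above, the resulting average can hide large query-to-query variability.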