NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
D. K. Harman (ed.), National Institute of Standards and Technology

Description of the PRC CEO Algorithm for TREC-2

Paul Thompson
PRC Inc., Mail Stop 5S3
1500 PRC Drive
McLean, VA 22102
Phone: 612/687-4650
E-mail: thompson@research.westlaw.com
(current affiliation: West Publishing Company)

Abstract

This paper describes an application of the Combination of Expert Opinion technique to combine the results of multiple retrieval methods used on the TREC-2 collection. The methods being combined were weighted by their TREC-1 performance.

1. Introduction

This paper describes work done on the TREC-2 project at PRC Inc. in collaboration with Professor Edward Fox and his colleagues at Virginia Polytechnic Institute and State University (VPI&SU). The reader should refer to the description of their system included in these working notes for further details on the common processing of the TREC-2 data shared by PRC and VPI&SU (Fox et al. 1993). PRC used its algorithm, the Combination of Expert Opinion (CEO), to combine the results of VPI&SU's runs. VPI&SU used a different combination technique for their final results.

Originally, the intent was to integrate the CEO algorithm with the SMART system used by VPI&SU, so that results would be combined at two levels: a lower level, combining individual document features within a particular retrieval method, and an upper level, combining the outputs of the individual methods themselves, i.e., the various cosine and p-norm methods used by VPI&SU. For TREC-1 we were not able to train the CEO algorithm so that the weighting of the various methods would be optimized based on relevance judgments. For TREC-2 we weighted the methods by the 11-point average scores they obtained in TREC-1. Again we used only the upper level of CEO.
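The upper-level weighting described above can be illustrated with a minimal sketch. It assumes a simple linear combination in which each method's document scores are scaled by that method's earlier 11-point average; this is an illustrative stand-in, not the CEO algorithm itself, whose probabilistic formulation is not reproduced here. The function name `weighted_fusion`, the document ids, the scores, and the weight values are all hypothetical, and the scores from different methods are assumed to be on a comparable scale.

```python
from collections import defaultdict


def weighted_fusion(runs, weights):
    """Combine per-method document scores using performance-based weights.

    runs:    {method_name: {doc_id: score}} (scores assumed comparable)
    weights: {method_name: earlier 11-point average} (hypothetical values)
    Returns a list of (doc_id, fused_score), best first.
    """
    total = sum(weights[m] for m in runs)
    fused = defaultdict(float)
    for method, scores in runs.items():
        w = weights[method] / total  # normalise so the weights sum to 1
        for doc_id, score in scores.items():
            fused[doc_id] += w * score
    # Highest fused score first; ties broken by document id
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores from two methods and hypothetical prior averages
runs = {
    "cosine": {"d1": 0.8, "d2": 0.4},
    "pnorm": {"d2": 0.9, "d3": 0.5},
}
weights = {"cosine": 0.3, "pnorm": 0.2}

for doc_id, score in weighted_fusion(runs, weights):
    print(doc_id, round(score, 3))
```

In this sketch a document missing from a method's output simply receives no contribution from that method, which is one of several possible design choices.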
For TREC-1 we found that combining all methods resulted in lower performance than using the single best method. This year our first version combined the top two methods, as ranked by their TREC-1 performance, while the second version used the top five methods.

2. Combination of Expert Opinion

The statistical technique of CEO provides a solution to the problem of combining different probabilistic models of document retrieval. This technique is expected to result in improved precision and recall over that provided by any one model, or method, since research has shown that various retrieval models retrieve different sets of more or less equally relevant documents (Katzer et al. 1982, Fox et al. 1988). In the