NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Combining Evidence from Multiple Searches
E. Fox, M. Koushik, J. Shaw, R. Modlin, D. Rao
National Institute of Standards and Technology, Donna K. Harman

7.3 Possible enhancements

Because of the disk space problem we were not able to do many of the tests we planned to do, even by the end of Phase 2. Work will continue in 1993 now that new disks have been received. Among the planned tasks are:

* Phrase Identification and Matching
In the current system, phrases are handled by using an AND query term. For example, Information AND Retrieval was used for Information-Retrieval. Because there is no proximity operator, this retrieves non-relevant documents in which these words occur far apart. The retrieval results can be improved by providing a mechanism for dealing with phrases explicitly, and/or by the use of proximity operators.

* Better Base Methods
In addition to considering the use of phrases, further study of base runs, considering query construction, indexing, weighting schemes, and lexical information, will be undertaken. Of particular interest is the use of p-norm queries, which, if tailor-made, might well outperform the vector queries in all collections. Contrasts between stemming and morphological analysis can also be made.

* Merging Methods
While the Ad Hoc and the R-P Merge methods are not based on elaborate theory, they do provide insight into the effects of combining results. Further refinement of the approaches, and additional testing to obtain upper-bound performance values, will be undertaken.

* The CEO Model
The Combination of Expert Opinion (CEO) model [3, 4] of Thompson can be used to treat the different retrieval runs as experts and to combine their weighting probability distributions to improve performance. This could be used in a variety of ways to combine results from a variety of runs and indexing schemes.

References

[1] C. Buckley.
Implementation of the SMART information retrieval system. Technical Report 85-686, Cornell University, Department of Computer Science, May 1985.

[2] FirstMark Technologies Limited. KnowledgeSEEKER User's Guide. FirstMark Technologies, 14 Concourse Gate Site 680, Ottawa, Ontario, Canada, 1990.

[3] P. Thompson. A Combination of Expert Opinion approach to probabilistic information retrieval, Part 1: The conceptual model. Information Processing & Management, 26(3):371-382, 1990.

[4] P. Thompson. A Combination of Expert Opinion approach to probabilistic information retrieval, Part 2: Mathematical treatment of CEO Model 3. Information Processing & Management, 26(3):383-394, 1990.