NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2), D. K. Harman, ed., National Institute of Standards and Technology
UCLA-Okapi at TREC-2: Query Expansion Experiments
E. Efthimiadis and P. Biron

and corroborate the results obtained from the sign tests. The two tests indicate that r_lohi and r_hilo have performed consistently better than the other algorithms.

5 Conclusions

* The results obtained with the standard and enhanced versions of the GSL indicate that further research is needed to determine the effectiveness of the GSL synonym list in retrieval.

* Adding 10 terms from the 5 or 10 top-ranked documents improved retrieval performance. The other term/document combinations, i.e. adding 20, 30, or 40 terms from 15 or 20 documents, etc., had a negative effect on retrieval performance.

* The results of the routing searches indicate that query expansion (i.e., feedback searches without relevance information, in which X terms are extracted from the Y top-ranked documents, which are treated as relevant to the query) improved retrieval performance, depending on the algorithm used. (A sketch of this procedure is given after the numbered list below.)

* The r_lohi algorithm (Efthimiadis, 1993a) improved retrieval performance in the routing runs when compared with the initial (baseline) search, which involved neither a feedback search nor query expansion.

* In the ad hoc searches, the evaluation of the five ranking algorithms indicates that r_lohi performed better than the other algorithms. These results were further validated by the sign test and the t-test.

* Although query expansion seems to work, the retrieval performance achieved was less than expected. Several factors account for these results; they are briefly addressed below.

1. Completeness of the TREC queries:
The major factor to which these results are attributed is that the queries, i.e. the TREC topic descriptions, are almost complete; that is, they already contain all the important words required for the search. Query expansion is the process of supplementing the original query terms and is particularly effective when the queries are incomplete. On these rather complete queries, query expansion appears to have had only a small, or even a detrimental, effect on overall retrieval performance.

2. Size of the TREC collection:
The large size of the TREC collection raises the issue of the scalability and effectiveness of retrieval algorithms. The TREC collection is very different from the standard IR test collections, such as ADI, Cranfield, CACM, and NPL. TREC is 1-4 gigabytes of text, whereas the other collections are small, i.e., only a few (1-50) megabytes. The behavior and effectiveness of information retrieval algorithms have been studied on small collections; TREC provides the challenge of scalability.

3. Nature of the documents:
The documents in the WSJ database are mostly long; they are full text rather than short bibliographic records; they have less structure than bibliographic records; and their language and presentation are less structured (journalistic rather than scientific style).

4. Length of the documents:
The records are long and often contain short, usually unrelated, multi-story items. When such a document contains relevant information for a topic, i.e., when one of its stories is relevant but the others are not, the document increases noise and interferes with the selection of terms for query expansion. All the terms of that document enter the pool of candidate terms for query expansion, and a number of terms from the other stories may be ranked higher than the terms from the relevant story. This reinforces the need to be able to retrieve at the paragraph level rather than the document level.
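The query-expansion procedure referred to above can be stated procedurally. The sketch below is illustrative only: it assumes tokenised documents and a precomputed collection document-frequency table, and its term-scoring function is a generic frequency/idf-style weight standing in for the r_lohi and r_hilo rankings actually evaluated in this paper; the function and variable names are ours, not part of the Okapi system.

```python
# Illustrative pseudo-relevance-feedback query expansion: take the Y top-ranked
# documents of a baseline search, treat them as relevant, pool their terms, and
# add the X best-scoring candidates to the query. The scoring function is a
# simple stand-in, not the r_lohi / r_hilo rankings evaluated in the paper.
from collections import Counter
from math import log


def expand_query(query_terms, ranked_docs, collection_df, n_docs,
                 top_y=10, add_x=10):
    """Expand query_terms with add_x terms drawn from the top_y ranked docs.

    ranked_docs   -- baseline ranking; each document is a list of tokens
    collection_df -- dict mapping term -> number of collection documents containing it
    n_docs        -- total number of documents in the collection
    """
    pseudo_relevant = ranked_docs[:top_y]    # treat the Y top-ranked docs as relevant

    # Pool the terms occurring in the pseudo-relevant documents, counting in
    # how many of those documents each term appears.
    pool = Counter()
    for doc in pseudo_relevant:
        pool.update(set(doc))

    def score(term):
        # Favour terms frequent in the pseudo-relevant set but rare in the
        # collection (an idf-like component); purely a placeholder weight.
        df = collection_df.get(term, 1)
        return pool[term] * log(n_docs / df)

    candidates = [t for t in pool if t not in query_terms]
    candidates.sort(key=score, reverse=True)
    return list(query_terms) + candidates[:add_x]   # add the X best candidates
```

As the second bullet above notes, the combination top_y = 5 or 10 with add_x = 10 was the one that helped; larger values of either parameter hurt performance.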
6 Future Research

* evaluate in detail the effect of the GSL synonym list on retrieval performance
* evaluate the differing effects of a local versus a global thesaurus for query expansion
* evaluate the effect of variable bias in query expansion term weighting
* investigate the retrieval overlap between the different approaches, and
* explore data fusion techniques for output integration (a small illustrative sketch follows this list)
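The last item above, data fusion for output integration, can be illustrated with a small sketch. The fusion rule below (summing reciprocal ranks across runs) is one common heuristic chosen purely for illustration; the paper does not commit to a particular fusion method, and the document identifiers in the usage example are hypothetical.

```python
# Illustrative data fusion: merge the ranked output of several retrieval runs
# by summing reciprocal ranks (one simple fusion heuristic among many).
from collections import defaultdict


def reciprocal_rank_fusion(runs, k=60, depth=1000):
    """Fuse ranked lists of document ids into a single ranking (best first).

    runs  -- iterable of ranked lists, best document first
    k     -- damping constant; a larger k flattens the rank contributions
    depth -- how many documents of each run to consider
    """
    scores = defaultdict(float)
    for run in runs:
        for rank, doc_id in enumerate(run[:depth], start=1):
            scores[doc_id] += 1.0 / (k + rank)   # higher-ranked docs contribute more
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical usage: fuse a baseline run with a query-expansion run.
baseline_run  = ["WSJ870108-0012", "WSJ880711-0043", "WSJ900321-0005"]
expansion_run = ["WSJ880711-0043", "WSJ900321-0005", "WSJ870108-0012"]
fused = reciprocal_rank_fusion([baseline_run, expansion_run])
```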