NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
A Boolean Approximation Method for Query Construction and Topic Assignment in TREC
P. Jacobs, G. Krupka, L. Rau
National Institute of Standards and Technology
Donna K. Harman

8 Summary

GE's participation in TREC involved a small implementation of a simple strategy for compiling knowledge-based pattern matcher rules into the language of Boolean expressions. A statistical corpus analyzer helped to formulate and refine queries for both the ad hoc and routing tasks, and the resulting matching engine ran on the entire 2.3 gigabytes of text. The simple Boolean retrieval engine performed very well on both tasks. These results are promising, both from the perspective of accuracy and for the simplicity with which they seem to bring knowledge-based techniques to bear within the rudimentary framework of word-based retrieval.

References

[1] Paul S. Jacobs. Joining statistics with NLP for text categorization. In Proceedings of the 3rd Conference on Applied Natural Language Processing, April 1992.

[2] Paul S. Jacobs, George R. Krupka, and Lisa F. Rau. Lexico-semantic pattern matching as a companion to parsing in text understanding. In Fourth DARPA Speech and Natural Language Workshop, pages 337-342, San Mateo, CA, February 1991. Morgan Kaufmann.