Scientific Report No. ISR-10
Information Storage and Retrieval
The Query-Document Matching Function
Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

CHAPTER 4

QUERY-DOCUMENT MATCHING FUNCTION

1. The Comparison of Structural Operands

In selecting references from a library collection, the user matches his information needs against the discernable information content of source documents (or tokens representing them). In a mechanized document retrieval system an analogous process is implemented using the formal representations (index transforms) of the user's information requirements and the content of reference documents. It is difficult to characterize precisely the nature of the comparisons which the user has at his disposal because of the richness in information carrying elements present in the natural language and because of the complexity of human decision making. In automatic systems, however, comparison operations are closely related to the structure of the data representations of the compared items. In an automatic document retrieval system, then, the criteria for selecting reference documents in response to user queries are directly related to the data structures produced by the index transformation.

A variety of data structures has been considered for information representations in document retrieval (see Chapter 2). Perhaps the simplest of these is the unordered collection of elements such as results with a keyword representation of document content. When both the query and document representations are sets from a finite