ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
The Query-Document Matching Function
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
CHAPTER 4
QUERY-DOCUMENT MATCHING FUNCTION
1. The Comparison of Structural Operands
In selecting references from a library collection, the user
matches his information needs against the discernable information
content of source documents (or tokens representing them). In a
mechanized document retrieval system an analogous process is
implemented using the formal representations (index transforms) of
the user's information requirements and the content of reference
documents. It is difficult to characterize precisely the nature of
the comparisons which the user has at his disposal because of the
richness in information-carrying elements present in the natural
language and because of the complexity of human decision making. In
automatic systems, however, comparison operations are closely related
to the structure of the data representations of the compared items.
In an automatic document retrieval system, then, the criteria for
selecting reference documents in response to user queries are directly
related to the data structures produced by the index transformation.
A variety of data structures has been considered for
information representations in document retrieval (see Chapter 2).
Perhaps the simplest of these is the unordered collection of elements
such as results with a keyword representation of document content.
When both the query and document representations are sets from a finite