ISR10
Scientific Report No. ISR-10 Information Storage and Retrieval
The Query-Document Matching Function
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Comparison Operation                  Definition

Equality                              $a = b$:  $a_i = b_i$  for $i = 1, \ldots, N$
Vector Difference                     $d = a - b$:  $d_i = a_i - b_i$  for $i = 1, \ldots, N$
Magnitude of the Vector Difference    $|d| = \left[ \sum_{i=1}^{N} (a_i - b_i)^2 \right]^{1/2}$
Angular Distance                      $\cos^{-1} \left( \frac{a \cdot b}{|a| \, |b|} \right)$
Comparison Operations on Vector Represented Operands
Table 4.2
carried by these more complex operand structures is the increased cost
in the required comparison operations necessary to specify a retrieved
subset or to assign a value indicative of document relevance. The
discussion here will be primarily concerned with vector operands;
however, certain of the results derived will be a function not of the
operand structures but of the matching function itself, and will,
therefore, be applicable to matching functions of the type considered
regardless of the operands to which they are applied.
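As an illustration only, the four comparison operations of Table 4.2 can be sketched in Python for operands held as equal-length numeric sequences. The function names, the use of plain lists, and the treatment of zero-length vectors are assumptions made for this sketch and do not come from the report.

```python
import math
from typing import List, Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    # Assumption of this sketch: both operands have the same dimension N.
    if len(a) != len(b):
        raise ValueError("operands must have the same number of components")


def equal(a: Sequence[float], b: Sequence[float]) -> bool:
    # Equality: a_i = b_i for i = 1, ..., N.
    _check_same_length(a, b)
    return all(x == y for x, y in zip(a, b))


def difference(a: Sequence[float], b: Sequence[float]) -> List[float]:
    # Vector difference: d_i = a_i - b_i for i = 1, ..., N.
    _check_same_length(a, b)
    return [x - y for x, y in zip(a, b)]


def difference_magnitude(a: Sequence[float], b: Sequence[float]) -> float:
    # Magnitude of the vector difference: square root of the sum of
    # squared component differences.
    return math.sqrt(sum(d * d for d in difference(a, b)))


def angular_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Angular distance in radians: arccos of a.b / (|a| |b|).
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # The angle is undefined for a zero vector; raising here is a
        # choice made for this sketch.
        raise ValueError("angular distance is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

Each function makes a single pass over the N components, so the cost of a comparison grows with the dimension of the operands. A call such as `angular_distance(query, document)` yields a value that could be used to rank documents against a query, with smaller angles indicating closer agreement.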
2. Storage Organization
In principle, an automatic document retrieval system can be
characterized independently of any parameters of storage organization.
Given a description of the document and query representations and of