Scientific Report No. ISR-10 Information Storage and Retrieval
The Query-Document Matching Function
Joseph John Rocchio
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Keywords: a b c d e f
Documents: d1, d2, d3, d4; query: q
[Binary occurrence matrix: one row per document and one for the query, one column per keyword; individual entries not recoverable.]
a) Query and document keyword images represented by a binary
occurrence matrix.
d1
d2 ⊃ q        R = {d_i : d_i ⊃ q}
d3 ⊃ q        R = {d2, d3}
d4
b) Retrieval by set inclusion matching.
        n(q ∩ d)        n(q ∩ d)/n(q ∪ d)
d1      2               2/5
d2      3               3/5
d4      1               1/5
c) Relevance values assigned
by overlap correlation.
Set Image Matching Operations
Figure 4.1
d) Relevance values
assigned by metric
corr. (1 - metric dist.)
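
The three matching operations of Figure 4.1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's own procedure. The keyword sets are hypothetical: they are chosen so that the values legible in panels (c) and (d) come out the same, and they are not the occurrence matrix of panel (a). The metric distance is assumed here to be the size of the symmetric difference divided by the size of the union, so that one minus the distance equals n(q ∩ d)/n(q ∪ d). All function names are illustrative.

```python
from fractions import Fraction

# Hypothetical keyword sets (assumption: NOT the matrix of Figure 4.1).
query = {"a", "b", "c"}
documents = {
    "d1": {"a", "b", "d", "e"},
    "d2": {"a", "b", "c", "d", "e"},
    "d3": {"a", "b", "c", "f"},
    "d4": {"a", "d", "e"},
}


def retrieve_by_inclusion(q, docs):
    """Set inclusion matching: R = {d_i : d_i contains q}."""
    return [name for name, d in docs.items() if d >= q]


def overlap(q, d):
    """Overlap correlation: n(q ∩ d), the number of shared keywords."""
    return len(q & d)


def metric_distance(q, d):
    """Assumed metric: n(symmetric difference) / n(q ∪ d)."""
    union = q | d
    if not union:
        return Fraction(0)  # two empty sets are treated as identical
    return Fraction(len(q ^ d), len(union))


def metric_correlation(q, d):
    """One minus the metric distance; equals n(q ∩ d) / n(q ∪ d)."""
    return 1 - metric_distance(q, d)


if __name__ == "__main__":
    print("R =", retrieve_by_inclusion(query, documents))
    for name, d in documents.items():
        print(name, overlap(query, d), metric_correlation(query, d))
```

Set inclusion gives only a yes-or-no decision for each document, whereas the two correlations assign every document a value and so allow the output to be ranked. The metric correlation also penalizes keywords that a document carries but the query does not, which the raw overlap count ignores.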