IRS13
Scientific Report No. IRS-13 Information Storage and Retrieval
Correlation Measures
K. Reitsma
J. Sagalyn
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
various coefficients
A) The Inner Product
Perhaps the simplest matching function is the inner product. It
is defined for two vectors v and w as
$$v \cdot w = \sum_i v_i w_i$$
This is the same expression mentioned previously and denoted as (1) and
(24).
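The definition can be sketched in a few lines of Python. This is only an illustration: the function name `inner_product` and the length check are assumptions made here, not part of the report.

```python
def inner_product(v, w):
    """Return the sum over i of v[i] * w[i] for two equal-length vectors."""
    # Assumption: both vectors are indexed over the same set of terms,
    # so a length mismatch is treated as an error.
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of components")
    return sum(vi * wi for vi, wi in zip(v, w))
```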
If v and w are binary vectors, then the inner product is equal
to the number of terms both vectors have in common. When weighted vectors
are used, much of the significance of the inner product is lost. It no
longer is a measure of the number of concepts found in both v and w.
It does, however, give a relative measure of the total weight of the matching
concepts, although it poses some problems since it is not normalized. For
example, if
$$v = (0, 2, 12), \quad w_1 = (0, 13, 1), \quad w_2 = (0, 1, 3),$$
the inner products are
$$v \cdot w_1 = 2(13) + 12(1) = 38$$
$$v \cdot w_2 = 2(1) + 12(3) = 38.$$
The inner product of v with both w1 and w2 gives 38, even though the
two w-vectors are very different. Because of these problems, the inner
product is not used in the evaluation; it is mentioned here since it forms
the basis of several other coefficients.
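The two cases discussed above can be checked with a short Python sketch. The weighted vectors are the ones from the example. The binary vectors `doc` and `query` are made up here purely for illustration, as is the helper name `inner_product`.

```python
def inner_product(v, w):
    # Sum of componentwise products; assumes equal-length vectors.
    return sum(vi * wi for vi, wi in zip(v, w))

# Binary case (hypothetical vectors): the inner product counts the
# terms present in both vectors. Here terms 0 and 2 are shared.
doc = (1, 0, 1, 1, 0)
query = (1, 1, 1, 0, 0)
shared_terms = inner_product(doc, query)

# Weighted case (vectors from the text): two quite different
# w-vectors produce the same unnormalized score against v.
v = (0, 2, 12)
w1 = (0, 13, 1)
w2 = (0, 1, 3)
score_w1 = inner_product(v, w1)
score_w2 = inner_product(v, w2)

print(shared_terms, score_w1, score_w2)  # prints: 2 38 38
```

The tie between `score_w1` and `score_w2` is the behavior the text attributes to the lack of normalization.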