Scientific Report No. IRS-13 Information Storage and Retrieval
Correlation Measures
K. Reitsma
J. Sagalyn
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
as before.
The above discussion is necessary when describing the modifications
made to use the P-R-N formula for document-document correlations. With
weighted document description vectors the formula becomes
$$
\text{P-R-N} = \frac{\sum v_i w_i}{\sum (v_i)^2 + \sum (w_i)^2 - \sum v_i w_i}
$$
where the summations are taken from i = 1 to d, and where d equals the
number of concepts in the description vector.
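
The weighted formula can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `prn_correlation` and the treatment of two all-zero vectors (returned as 0) are assumptions made here, not part of the report.

```python
from typing import Sequence


def prn_correlation(v: Sequence[float], w: Sequence[float]) -> float:
    """Weighted P-R-N correlation of two document description vectors.

    v and w hold the concept weights of the two documents; both must
    have the same length d (the number of concepts).
    """
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of concepts")
    # Numerator: inner product of the two vectors.
    inner = sum(vi * wi for vi, wi in zip(v, w))
    # Denominator: sum of squared weights of each vector, minus the inner product.
    denom = sum(vi * vi for vi in v) + sum(wi * wi for wi in w) - inner
    # Assumption (not in the report): two all-zero vectors are uncorrelated.
    return inner / denom if denom else 0.0


if __name__ == "__main__":
    print(prn_correlation([2, 1, 0], [1, 1, 1]))  # inner = 3, denominator = 5
    print(prn_correlation([1, 2, 0], [1, 2, 0]))  # identical vectors
    print(prn_correlation([1, 0], [0, 1]))        # no concept in common
```

Identical vectors give the maximum value, and vectors with no concept in common give zero.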
The interpretation of this function is not simple. The closest
meaning which can be attached to the denominator is that it represents
twice the maximum weight of the inner product of the two vectors, assuming
perfect correlation, minus the actual inner product. The difference,
therefore, will always be greater than or equal to the actual inner product.
By the argument previously used to determine the range of the
binary P-R-N function, the range of this function can be shown to be from
0 to 1, inclusive.
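
The bound can also be checked directly. The following is a short sketch, assuming that all weights are nonnegative (an assumption not stated explicitly above). Subtracting the numerator from the denominator gives

$$
\Bigl(\sum v_i^2 + \sum w_i^2 - \sum v_i w_i\Bigr) - \sum v_i w_i
  = \sum (v_i - w_i)^2 \;\ge\; 0 ,
$$

so the denominator is never smaller than the numerator and the function cannot exceed 1, reaching 1 only when the two vectors are identical. With nonnegative weights the numerator is itself nonnegative, so the function cannot fall below 0, and it equals 0 when the two documents share no concept of nonzero weight.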
G) The Stiles Coefficient
The Stiles function incorporates the parameter b as it was
defined for the Maron-Kuhns function. The formula is
$$
St = \frac{n \left( |nb| - \frac{n}{2} \right)^2}{\left(\sum v_i\right)\left(\sum w_i\right)\left(n - \sum v_i\right)\left(n - \sum w_i\right)}
$$
Since the formula was originally proposed to calculate index term-index