Scientific Report No. IRS-13
Information Storage and Retrieval
Chapter: Correlation Measures
K. Reitsma, J. Sagalyn
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

... concept weight in the collection. If the number of concepts in the thesaurus is large and the number of concepts in any document description vector is much smaller, a large number of zero elements will occur in a vector. When these elements are complemented, all the elements will equal the maximum concept number. In this case, the summation over the complemented elements will be very large, and its product with $\sum v_i w_i$ will be much larger than the other term of the formula, giving a coefficient which will always be near 1. To avoid this problem, only non-zero concepts are complemented.

In the ADI collection the maximum document weight is 96 and the maximum query weight is 1[OCRerr]. The complement of an element in a document vector or a query vector is, respectively, $\bar{v}_i = 96 - v_i$ or the maximum query weight minus $w_i$, if $v_i$ or $w_i$ is greater than zero; otherwise the complement is zero.

One further alteration, made in order to avoid negative correlation coefficients, changes the range of the formula. The range has been adjusted to run from 0 to +1 by adding 1 to the unadjusted coefficient and dividing by 2.

F) The Parker-Rhodes-Needham Coefficient

This formula was originally proposed as an index term - index term association measure for use with binary term vectors. The function is