Scientific Report No. IRS-13
Information Storage and Retrieval
Correlation Measures
K. Reitsma, J. Sagalyn
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

4) These conclusions are further supported by the almost equivalent values of the standard deviations of the respective functions.

5) The other functions perform strictly below the above-mentioned coefficients. Only the Overlap coefficient approaches the three best, and only above 0.75 recall, a region that is fairly insignificant in practice.

However, the four best functions, when tested with the Cranfield collection, exhibit a different behavior:

1) The differences between the functions have increased.

2) The Cosine function shows a better performance than the other three (i.e. the Parker-Rhodes-Needham, Stiles, and Overlap coefficients).

3) The Parker-Rhodes-Needham coefficient is no longer close to the Cosine; it is the worst of the four.

4) The performance of the Overlap coefficient is no longer the worst; in fact, it remains very close to the Cosine and Stiles coefficients.

5) The standard deviation of the Cosine function is much smaller than that of the other functions. This supports the conclusion that this function is better than the rest in this collection.

6) The overall precision at the same recall is lower in the Cranfield collection than in the ADI collection.

6. Discussion

In this section, an attempt is made to explain the behavior of the various coefficients and to suggest possible modifications for future investigations.

Cosine:

The Cosine function shows a consistently high performance in both the ADI and Cranfield collections. Since it is length dependent and since the Hypersine tries unsuccessfully to reduce this dependence, a compromise some-