Scientific Report No. IRS-13
Information Storage and Retrieval
Chapter: Correlation Measures
K. Reitsma, J. Sagalyn
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

where between the Cosine and H[OCRerr]ersine may prove effective is length dependency inhibits the efficiency of the Cosine function.

H[OCRerr]ersine: The performance of the H[OCRerr]ersine function is worse than that of the Cosine. It therefore seems that the non-matching concepts of the document, which were deleted in calculating the document vector length, are indeed important. Evidently, some degree of length dependency is beneficial in a matching function, and the H[OCRerr]ersine tries to eliminate this dependency incorrectly and to too great a degree.

Maron-Kuhns: The performance of this function is far below that of the three good functions. There are two possible explanations. One is the problem of complementation, i.e., the complement of a weighted vector may be defined in a better way. The second is the importance of the non-zero non-matching weights. In this study, only the matching weights were complemented. It might be advisable to complement the zero weights in one vector for those concepts with non-zero weights in the other vector. It still does not seem advisable to complement all the zero weights, for the same reasons as stated previously.

Overlap: The performance of the Overlap coefficient varies drastically between the ADI and Cranfield collections. The explanation may lie in the differences between the subject content of the two collections. Since the weights of the request are usually less than the weights of the document, the numerator is not strongly influenced by a matching concept with a very large weight in a