Scientific Report No. IRS-13, Information Storage and Retrieval
Chapter: Correlation Measures, K. Reitsma and J. Sagalyn
Harvard University, Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

IV-25

document. This insensitivity may explain the poor performance of this function compared to some of the other coefficients.

Parker-Rhodes-Needham: The striking difference in performance of this function between the ADI collection, where it proved very powerful, and the Cranfield collection, where it performed rather poorly, is puzzling. Further evaluation with other document collections is needed before any conclusions as to its value can be drawn.

Stiles: This coefficient shows consistently high performance for both the ADI and Cranfield collections. It is far less sensitive to variations in collection characteristics than the Overlap and the Parker-Rhodes-Needham coefficients. The explanation of this phenomenon is difficult due to the complexity of the formula; however, its quasi-binary character seems to give reasonable results. One possible refinement may be a better definition of N.

Reitsma-Sagalyn: Three different modifications of this formula were used in this study. In one of them, N equals the number of concepts in either the query vector or the document vector (the maximum of the two). Another form uses the number of matching concepts for N. When this is done, it is observed that many relevant documents occur at the end of the ranked list. This leads to the third modification, in which the second form was used but the documents were ranked in the reverse order. In general, this formula proved ineffective.
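
The three choices described for the Reitsma-Sagalyn formula can be illustrated with a minimal sketch. The coefficient itself is not reproduced on this page, so it is left as a caller-supplied function. The names `Vector`, `concepts`, `n_max_concepts`, `n_matching_concepts`, and `rank` are hypothetical, as is the representation of a vector as a mapping from concept identifier to weight. The sketch shows only how N might be obtained under the first two modifications and how the ranked list is reversed under the third.

```python
from typing import Callable, Dict, List, Set

# Assumed representation: concept identifier -> weight.
Vector = Dict[str, float]


def concepts(v: Vector) -> Set[str]:
    """Concepts that are present (nonzero weight) in a vector."""
    return {c for c, w in v.items() if w != 0}


def n_max_concepts(query: Vector, doc: Vector) -> int:
    """First modification: N is the larger of the two concept counts."""
    return max(len(concepts(query)), len(concepts(doc)))


def n_matching_concepts(query: Vector, doc: Vector) -> int:
    """Second modification: N is the number of concepts shared by both."""
    return len(concepts(query) & concepts(doc))


def rank(query: Vector,
         docs: Dict[str, Vector],
         score: Callable[[Vector, Vector], float],
         reverse_order: bool = False) -> List[str]:
    """Rank document ids by score, highest first.

    reverse_order=True corresponds to the third modification, in which
    the ranked list is read in the opposite order (lowest score first).
    """
    scored = [(doc_id, score(query, d)) for doc_id, d in docs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=not reverse_order)
    return [doc_id for doc_id, _ in scored]


# Small illustration with made-up vectors.
query = {"a": 1.0, "b": 1.0, "c": 1.0}
docs = {
    "d1": {"a": 1.0, "b": 1.0},
    "d2": {"c": 1.0, "d": 1.0, "e": 1.0, "f": 1.0},
}

# Stand-in score (NOT the Reitsma-Sagalyn formula): the matching count.
normal = rank(query, docs, n_matching_concepts)
reversed_list = rank(query, docs, n_matching_concepts, reverse_order=True)
```

Here the matching-concept count serves only as a stand-in score so that the ranking step can run; any real evaluation would substitute the actual coefficient.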