Scientific Report No. IRS-13
Information Storage and Retrieval
Correlation Measures
K. Reitsma, J. Sagalyn
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

5. Experimental Results

The following functions have been tested with the use of the ADI collection, making four comparisons in each test as follows:

Table 1 - Overlap, Cosine, Parker-Rhodes-Needham, Reitsma-Sagalyn.
Table 2 - Average, Stiles, Reitsma-Sagalyn (sorted up), Cosine.
Table 3 - Cosine, I[OCRerr]rpersine, Maron-Kuhns, Reitsma-Sagalyn (modified).

and with the Cranfield collection:

Table 4 - Overlap, Cosine, Parker-Rhodes-Needham, Stiles.

The tables contain the averages from which the average recall-precision graphs were made, together with the standard deviation (S.D.D.) of the averages. The data in the tables are summarized in Figure 1, which shows the performance of all coefficients tested on the ADI collection.

Recalling the discussion of the recall and precision measures as a means for evaluating the performance of different correlation coefficients, Figure 1 shows the following:

1) Three correlation functions exhibit a decidedly better performance than the others. They have been replotted on a larger scale in Figure 3 to show the difference in behavior in more detail. The functions are Stiles, Cosine and Parker-Rhodes-Needham.

2) In the recall interval 0-0.50 the Parker-Rhodes-Needham coefficient performs better than the other two; in the recall interval above 0.50 it performs worse than the others. This indicates that the Parker-Rhodes-Needham function gives the best results in a system with a cutoff value smaller than 0.50.

3) Comparing the Cosine and Stiles coefficients, the former performs better below 0.55 recall, while at higher recall values the performance of the two functions is almost identical.
Therefore, over the entire interval, the Cosine coefficient performs at least as well as the Stiles function.
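
The kind of comparison made in points 1) to 3) can be illustrated with a minimal sketch. It averages per-query precision at fixed recall levels, reports a standard deviation alongside each average, and then states which of two coefficients is ahead at each recall level. The function names, the tolerance used to call two values "almost identical", the use of the sample standard deviation, and the input numbers are all assumptions made for illustration; none of them are taken from the report or its tables.

```python
from statistics import mean, stdev


def average_curve(per_query):
    """Average precision over queries at each recall level.

    per_query: list of dicts mapping recall level -> precision for one query.
    Returns a dict mapping recall level -> (mean precision, sample std. dev.).
    """
    levels = sorted(per_query[0])
    curve = {}
    for level in levels:
        values = [query[level] for query in per_query]
        spread = stdev(values) if len(values) > 1 else 0.0
        curve[level] = (mean(values), spread)
    return curve


def compare(curve_a, curve_b, tolerance=0.01):
    """Say which curve has the higher mean precision at each recall level.

    Differences no larger than `tolerance` (an assumed threshold) count as a tie.
    """
    verdict = {}
    for level in curve_a:
        diff = curve_a[level][0] - curve_b[level][0]
        if abs(diff) <= tolerance:
            verdict[level] = "tie"
        else:
            verdict[level] = "a" if diff > 0 else "b"
    return verdict


# Hypothetical per-query precision values for two unnamed coefficients.
queries_a = [{0.25: 0.8, 0.50: 0.6, 0.75: 0.4}, {0.25: 0.6, 0.50: 0.4, 0.75: 0.2}]
queries_b = [{0.25: 0.6, 0.50: 0.5, 0.75: 0.4}, {0.25: 0.4, 0.50: 0.3, 0.75: 0.2}]

curve_a = average_curve(queries_a)
curve_b = average_curve(queries_b)
verdict = compare(curve_a, curve_b)
```

With these made-up inputs, coefficient "a" is ahead at the two lower recall levels and the two are tied at the highest one. That is the pattern under which one function can be called at least as good as the other over the whole interval.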