Scientific Report No. IRS-13
Information Storage and Retrieval
Word-Word Associations in Document Retrieval Systems
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

vector by the association process equally important as a word in the original document). Weights somewhat below 1 are seen to be preferable, more so for precision than for recall purposes. Fig. 10 indicates, in fact, that for high recall, weights above 0.5 do not cause as much loss in performance. To sum up, then: for high precision, one should have low weights and high cutoffs; for high recall, higher weights and lower cutoffs are desirable. Fig. 11 shows recall-precision curves for the high-recall and high-precision specifications; as expected, they cross.

It was also seen in part 3 that additional iteration of the association process is not useful in finding synonyms, and it is also not of great value in retrieval. Fig. 12 shows curves for 0, 1, and 2 iterations of the association procedure, with frequencies of 6-50 and a cutoff of 0.60. The first iteration curve is seen to be superior.

The performance differences shown by the various options in the association process are rather small. It is difficult, in particular, to choose a set of options to maximize either the precision effect or the recall effect over an entire set of requests. Nor does a fine adjustment of cutoff, frequency, or weight have a major effect on retrieval performance. This is just what is expected from the analysis of the associated pairs, since no set of parameters produces an unusual number of significant pairs.

In general, the use of associated pairs produces an improvement in performance over most of the range, compared with word stem matching, if words with very low and very high frequencies are omitted.
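
The roles of the three parameters discussed above (frequency range, cutoff, and weight) can be illustrated with a minimal sketch. It is not the procedure used in the experiments. It assumes that association between two words is measured by the cosine of their document-occurrence vectors, and that each associated word is added to a vector with the original word's weight multiplied by the association weight, so that a weight of 1 makes an added word as important as an original one. The function names `term_associations` and `expand` are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def term_associations(docs, min_freq, max_freq, cutoff):
    """Return {term: {associated_term: similarity}} for pairs at or above cutoff.

    docs is a list of {term: count} dictionaries, one per document.
    Assumption: similarity is the cosine between term occurrence vectors.
    """
    # Total collection frequency of each term.
    totals = defaultdict(int)
    for doc in docs:
        for term, count in doc.items():
            totals[term] += count

    # Only terms inside the frequency range take part in the association.
    eligible = sorted(t for t, f in totals.items() if min_freq <= f <= max_freq)

    # One occurrence vector per eligible term, indexed by document.
    vec = {t: [doc.get(t, 0) for doc in docs] for t in eligible}
    norm = {t: sqrt(sum(x * x for x in v)) for t, v in vec.items()}

    assoc = defaultdict(dict)
    for i, a in enumerate(eligible):
        for b in eligible[i + 1:]:
            dot = sum(x * y for x, y in zip(vec[a], vec[b]))
            sim = dot / (norm[a] * norm[b])
            if sim >= cutoff:
                assoc[a][b] = sim
                assoc[b][a] = sim
    return dict(assoc)


def expand(vector, assoc, weight):
    """Add associated terms to a {term: weight} vector (one iteration).

    Assumption: an added term receives weight * (weight of the original term).
    """
    expanded = dict(vector)
    for term, w in vector.items():
        for other in assoc.get(term, {}):
            expanded[other] = expanded.get(other, 0.0) + weight * w
    return expanded


# Hypothetical four-document collection.
docs = [{"a": 1, "b": 1}, {"a": 1, "b": 1}, {"c": 1}, {"a": 1, "c": 1}]
pairs = term_associations(docs, min_freq=2, max_freq=3, cutoff=0.6)
query = expand({"b": 2.0}, pairs, weight=0.5)
```

In this sketch, raising `cutoff` or narrowing the frequency range reduces the number of associated pairs, and lowering `weight` reduces the influence of each added word, which corresponds to the high-precision setting described above.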
Procedures which decrease the number of associated pairs (restricting the frequency range used, raising the cutoff) or lower the weight of the