Scientific Report No. IRS-13
Information Storage and Retrieval

Word-Word Associations in Document Retrieval Systems

M. E. Lesk

Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

Frequency of   Number of   Number   Average     Approx.     Percent
One Word of    Words of    of       Number      Weight in   Weight in
the Pair       That        Pairs    of Pairs    Text        Expansion
               Frequency            Per Word    Expansion

1              1179        18894    16.0        18900       17
2              382         6924     18.1        13800       12
3              199         1855     9.3         5600        5
4              126         1003     8.0         4000        4
5              103         957      9.3         4800        4
6              83          588      7.1         3500        3
7              61          554      9.1         3800        3
8              55          416      7.6         3300        3
9              41          254      6.2         2300        2
10             34          229      6.7         2300        2
11-14          87          819      9.4         9800        9
15-19          54          293      5.4         5300        5
20-29          86          481      5.6         12000       11
30-39          43          87       2.0         3000        3
40-49          25          101      4.0         4500        4
50-59          18          105      5.8         5800        5
60-69          13          32       2.5         2100        2
70-79          8           3        0.4         200         0
80-89          7           34       4.9         2900        3
90-99          6           2        0.3         200         0
100-124        6           5        0.8         600         1
125-149        5           39       7.8         5400        5
150-174        1           0        0.0         0           0
175-199        2           1        0.5         200         0
200+           4           4        1.0         1000        1

all            2628        28680    10.91       111500      100

Word-Word Associations Tabulated by Word Frequency

Table 1
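The two derived columns of Table 1 appear to follow from the others by simple arithmetic: the average number of pairs per word looks like the number of pairs divided by the number of words of that frequency, and the percent weight looks like each class's approximate weight divided by the total weight. The sketch below checks this reading on a few rows copied from the table. The function names are hypothetical, and the formulas are inferred from the tabulated values rather than stated in the report.

```python
# Rows copied from Table 1: (frequency class, words, pairs, approx. weight).
# Only a few rows are included; the last one is the "all" totals row.
ROWS = [
    ("1", 1179, 18894, 18900),
    ("2", 382, 6924, 13800),
    ("3", 199, 1855, 5600),
    ("all", 2628, 28680, 111500),
]


def average_pairs_per_word(words, pairs):
    """Assumed definition: pairs divided by words in the frequency class."""
    return pairs / words if words else 0.0


def percent_of_weight(weight, total_weight):
    """Assumed definition: a class's share of the total expansion weight."""
    return 100.0 * weight / total_weight


if __name__ == "__main__":
    total_weight = ROWS[-1][3]  # weight from the "all" row
    for label, words, pairs, weight in ROWS:
        avg = average_pairs_per_word(words, pairs)
        pct = percent_of_weight(weight, total_weight)
        print(f"{label:>4}  avg pairs/word = {avg:6.2f}  percent weight = {pct:5.1f}")
```

For the rows shown, the computed values round to the figures printed in the table, which supports this reading of the column headings but does not confirm it.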