Scientific Report No. IRS-13 Information Storage and Retrieval
Word-Word Associations in Document Retrieval Systems
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
Frequency of    Number of    Number      Average      Approx.      Percent
One Word of     Words of     of Pairs    Number of    Weight in    Weight in
the Pair        That                     Pairs Per    Text         Expansion
                Frequency                Word         Expansion

1               1179         18894       16.0         18900        17
2               382          6924        18.1         13800        12
3               199          1855        9.3          5600         5
4               126          1003        8.0          4000         4
5               103          957         9.3          4800         4
6               83           588         7.1          3500         3
7               61           554         9.1          3800         3
8               55           416         7.6          3300         3
9               41           254         6.2          2300         2
10              34           229         6.7          2300         2
11-14           87           819         9.4          9800         9
15-19           54           293         5.4          5300         5
20-29           86           481         5.6          12000        11
30-39           43           87          2.0          3000         3
40-49           25           101         4.0          4500         4
50-59           18           105         5.8          5800         5
60-69           13           32          2.5          2100         2
70-79           8            3           0.4          200          0
80-89           7            34          4.9          2900         3
90-99           6            2           0.3          200          0
100-124         6            5           0.8          600          1
125-149         5            39          7.8          5400         5
150-174         1            0           0.0          0            0
175-199         2            1           0.5          200          0
200+            4            4           1.0          1000         1

all             2628         28680       10.91        111500       100
Word-Word Associations Tabulated by Word Frequency
Table 1
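
The derived columns of Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on assumptions inferred from the tabulated values, not stated in the report: that the average is the number of pairs divided by the number of words, that the approximate weight is the word frequency times the number of pairs rounded to the nearest hundred, and that the percent weight is taken against the total of 111500 in the "all" row. The names ROWS, TOTAL_WEIGHT, and derive_columns are hypothetical, and only rows with a single exact frequency are used.

```python
# Rows copied from Table 1: (frequency, words of that frequency, pairs).
ROWS = [
    (1, 1179, 18894),
    (2, 382, 6924),
    (3, 199, 1855),
    (4, 126, 1003),
]

# Assumption: the weight in the "all" row is the denominator of the percent column.
TOTAL_WEIGHT = 111500


def derive_columns(frequency, words, pairs, total_weight=TOTAL_WEIGHT):
    """Return (average pairs per word, approximate weight, percent weight)."""
    average = round(pairs / words, 1)
    # Assumed reading: weight = frequency * pairs, rounded to the nearest hundred.
    weight = round(frequency * pairs, -2)
    percent = round(100 * weight / total_weight)
    return average, weight, percent


for frequency, words, pairs in ROWS:
    print(frequency, derive_columns(frequency, words, pairs))
```

Under these assumptions the four rows reproduce the printed values, for example (16.0, 18900, 17) for frequency 1. Rows covering a frequency range would need a representative frequency for the range, which the table does not give, so they are left out of the sketch.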