Scientific Report No. IRS-13
Information Storage and Retrieval

An Experiment in Automatic Thesaurus Construction
R. T. Dattola
D. M. Murray
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

                                                                  THS 1   THS 2
Total number of concept classes                                     156     289
Avg. number of concepts per class                                   3.9     1.4
Number of concepts appearing in more than one concept class         167      42
Number of concepts appearing in more than six concept classes         3       0
Avg. number of classes per concept                                  2.1     1.2
Avg. standard deviation (S.D.) of concept frequencies
  per concept class                                                 3.9     1.4

$$\text{avg. S.D.} = \frac{1}{n} \sum_{j=1}^{n} \left( \frac{1}{m_j} \sum_{i=1}^{m_j} \left| A_j - f_{ij} \right| \right)$$

where
  n      = total number of concept classes
  m_j    = number of concepts in concept class j
  A_j    = avg. frequency of concepts in concept class j
  f_{ij} = frequency of concept i in concept class j

Statistics on Automatic Thesauruses
Fig. 5
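The average deviation reported in Fig. 5 can be illustrated with a short sketch. It assumes the formula is a mean absolute deviation taken within each concept class and then averaged over all classes, which is one reading of the printed expression. The function names and the toy frequencies below are hypothetical and are not taken from the report.

```python
from statistics import mean


def class_deviation(freqs):
    """Mean absolute deviation of concept frequencies within one class.

    freqs: frequencies of the concepts in a single concept class.
    """
    class_avg = mean(freqs)  # A: avg. frequency of concepts in the class
    return mean(abs(class_avg - f) for f in freqs)


def avg_deviation(classes):
    """Average the per-class deviations over all n concept classes."""
    return mean(class_deviation(freqs) for freqs in classes)


# Made-up frequencies for three concept classes (illustration only).
toy_classes = [[2, 4, 6], [5, 5], [1, 3]]
result = avg_deviation(toy_classes)
print(round(result, 4))
```

A class whose concepts all have the same frequency contributes zero, so a lower value indicates classes that group concepts of similar frequency.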