Scientific Report No. IRS-13
Information Storage and Retrieval
Summary
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

a "super-thesaurus" is generated for the whole collection by merging the individual term groupings obtained from the subcollections. Retrieval experiments show that such a fully-automatic super-thesaurus produces better retrieval results than manually constructed thesauruses.

Statistical word-word association procedures are examined and evaluated in section IX by M. E. Lesk. Associative procedures produce groups of terms (or documents) based on the co-occurrence of the terms within the documents of a collection, or within the sentences of a document. The effect is then similar to that of a thesaurus, except that the construction method is automatic. The data included in section IX show that the associative method furnishes results which are essentially independent from those obtained by a normal thesaurus procedure. The associative term groups are unrelated to the thesaurus groups, and there appears to be no basis for the conjecture that second-order term associations are equivalent to synonym groupings. Like the synonym groupings of a thesaurus, word-word associations do occasionally improve the recall of a retrieval system; they also improve the precision by promoting certain relevant documents to higher rank positions.

A detailed analysis of the search requests used with the ADI documentation collection is contained in section X by E. M. Keen. Various characteristics of the search requests are examined, including criteria for identifying unclear request statements, requests expressing multiple needs, requests with identifiable important words, requests with restrictive negative statements, and so on. Using these characterizations, certain