Scientific Report No. IRS-13 Information Storage and Retrieval
Word-Word Associations in Document Retrieval Systems
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
associations appears to offer no advantages over first-order associations,
and loses a great amount of material.
We can therefore conclude that the use of associative procedures
for the determination of word meaning in a general sense is not advisable
with moderate-sized collections of text, since the vast majority of the
associations produced reflect specific local meanings of words. No
choice of word frequencies or associative procedures appears to offer a
way around this difficulty.
4. Retrieval Experiments
Although the association process is not suited for investigations
of absolute word meanings, it is nevertheless useful in retrieval systems.
Fig. 4 shows a comparison between word-word association retrieval runs
and straight word stem matching for three collections. It is seen that
for two of the collections (the ADI and IRE collections) the improvement
offered by associative strategies is confined to small ranges and is of
doubtful significance. For the Cranfield collection, the associative
strategy shows a definite superiority.
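The difference between the two kinds of run can be illustrated with a
minimal sketch. It is not the procedure used in these experiments: the
cosine measure over document-occurrence vectors, the cutoff value, the
toy collection, and the names term_vectors, cosine, expand and overlap
are all assumptions made for illustration only.

```python
from collections import defaultdict
from math import sqrt

def term_vectors(docs):
    """Map each stem to {document index: occurrence count}."""
    vecs = defaultdict(dict)
    for i, doc in enumerate(docs):
        for stem in doc:
            vecs[stem][i] = vecs[stem].get(i, 0) + 1
    return vecs

def cosine(a, b):
    """Cosine of two sparse vectors stored as dicts (assumed measure)."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def expand(query, vecs, cutoff):
    """First-order expansion: add each stem whose association with an
    original query stem reaches the cutoff (hypothetical threshold)."""
    expanded = set(query)
    for q in query:
        if q not in vecs:
            continue
        for stem, vec in vecs.items():
            if stem not in expanded and cosine(vecs[q], vec) >= cutoff:
                expanded.add(stem)
    return expanded

def overlap(stems, doc):
    """Number of distinct stems shared by a request and a document."""
    return len(set(stems) & set(doc))

# Hypothetical four-document collection, already reduced to word stems.
docs = [
    ["wing", "airfoil", "lift"],
    ["wing", "airfoil", "drag"],
    ["airfoil", "pressur"],
    ["boundari", "layer", "flow"],
]
query = ["wing"]
vecs = term_vectors(docs)
expanded = expand(query, vecs, cutoff=0.75)

stem_match = overlap(query, docs[2])      # plain stem matching
associative = overlap(expanded, docs[2])  # after associative expansion
```

In this toy collection "wing" and "airfoil" occur together often enough
to pass the assumed cutoff, so the expanded request gains an overlap
with the third document, which plain stem matching does not reach.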
The original purpose of the associative method is to produce
word relations missed by the stem matching procedure, and thus to take the
place of a synonym list or thesaurus. It would be expected that such a
procedure would be a recall-oriented device. However, this is not quite
what happens. Associative procedures improve performance in two distinctly
different ways. First, they occasionally retrieve a document that is
missed in stem matching, by introducing new word relations which provide
some request-document overlap. More often, however, precision is improved