Scientific Report No. ISR-10
Information Storage and Retrieval
Chapter: The Query-Document Matching Function
Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

form of storage organization or document classification will be necessary to achieve economic retrieval from large collections with response times fast enough for a real-time environment. Classification may be regarded as a part of the general problem of content analysis. When a document is classified under some given subject heading, its information content has been found to be related to that area of discourse. A classification system, however, is rarely used for retrieval in the sense that a user can be satisfied by all the references assigned to some given category. The classification schedule, in general, provides a means of storage organization that allows a user to limit the scope of his search. In this sense the process of document classification is analogous to the document indexing process. The index image of a document characterizes the information content of that document, while a classification category normally characterizes the information content of some area of discourse in the general field of knowledge. The assignment of some set of documents to a category then, in effect, creates an index image for the information content of the entire set. The user matches his information needs against the categories of the classification system to select subsets of documents in the same way in which his search request is matched with individual document representations to select particular references. Thus in automatic document retrieval systems, as in conventional library systems, document classification provides the key for a storage organization which can effectively limit the number of references which must be examined in detail in a given retrieval operation.
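
The two-level matching described in this passage can be illustrated with a minimal sketch. It is not taken from the report and rests on several assumptions: documents and requests are represented as sparse term-weight vectors, the index image of a category is taken to be the mean of its member document vectors, and the matching function is the cosine correlation. The names `cosine`, `category_image`, and `two_stage_search`, and the sample data, are hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine correlation between two sparse term-weight vectors (dicts).

    Assumption: the text does not name a matching function in this passage.
    """
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def category_image(doc_vectors):
    """Index image for a whole category, assumed here to be the mean vector."""
    total = defaultdict(float)
    for vec in doc_vectors:
        for term, weight in vec.items():
            total[term] += weight
    n = len(doc_vectors)
    return {term: weight / n for term, weight in total.items()}


def two_stage_search(query, categories, documents, top_categories=1):
    """Match the request against categories first, then against documents.

    categories: category name -> list of document ids
    documents:  document id -> term-weight vector
    Returns the selected categories and their documents ranked by correlation.
    """
    images = {
        name: category_image([documents[d] for d in ids])
        for name, ids in categories.items()
        if ids
    }
    # Stage 1: limit the scope of the search to the best-matching categories.
    ranked = sorted(images, key=lambda c: cosine(query, images[c]), reverse=True)
    selected = ranked[:top_categories]
    # Stage 2: examine in detail only the documents in the selected categories.
    candidates = [d for c in selected for d in categories[c]]
    ranked_docs = sorted(
        candidates, key=lambda d: cosine(query, documents[d]), reverse=True
    )
    return selected, ranked_docs


# Hypothetical sample collection, for illustration only.
documents = {
    "d1": {"retrieval": 1.0, "index": 1.0},
    "d2": {"retrieval": 1.0, "query": 1.0},
    "d3": {"compiler": 1.0, "parser": 1.0},
    "d4": {"compiler": 1.0, "grammar": 1.0},
}
categories = {
    "information retrieval": ["d1", "d2"],
    "programming languages": ["d3", "d4"],
}
query = {"query": 1.0, "retrieval": 1.0}

selected, ranked_docs = two_stage_search(query, categories, documents)
```

In this sketch only the two documents in the selected category are correlated with the request in the second stage; the other half of the sample collection is never examined in detail, which is the saving the classification is meant to provide.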