Scientific Report No. IRS-13
Information Storage and Retrieval
Chapter: Search Matching Functions
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
in one place initially, and then seeking other likely places through which to extend the search if necessary. The same procedure is followed in using a KWIC (Keyword-In-Context) index, which although mechanically produced is usually manually searched. An example of a search request and some of the strategies used to perform a search through various types of indexes is given in Figure 1.
The main characteristic of these manual systems is that the indexes are designed to be entered by the searcher in one place at a time only. Thus the subject headings and classification numbers used must represent quite complex ideas in a single entry to cope with modern knowledge.
Another type of manually searched index that has gained widespread acceptance is the type that allows entry into several parts of the file simultaneously, and is designed to identify documents that are found in all of the places entered. These systems are known as co-ordinate systems, or better post-co-ordinate, since the documents retrieved are those which match the search terms of the request only if the terms are present in the documents in the required combinations. The processing of search requests in such systems requires not only a decision as to which vocabulary terms shall be used in the search, but also a statement of logical combinations of the terms, in terms of logical products (AND), logical sums (OR), and logical differences (NOT). An example of such a search formulation is given in Figure 2; although this example illustrates a mechanized system to be described, a similar search formulation could be used in a manually searched system.
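The logical products, sums, and differences named above correspond to set intersection, union, and difference over the lists of documents filed under each term. The following is a minimal sketch of that correspondence, not the formulation of Figure 2: the index terms, the document numbers, and the names INDEX, ALL_DOCS, postings, and search are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy index: each vocabulary term maps to the set of
# document numbers filed under it. Nothing here is taken from the report.
INDEX = {
    "information": {1, 2, 3, 5},
    "retrieval": {2, 3, 5, 6},
    "storage": {1, 3},
    "manual": {3, 6},
}
ALL_DOCS = {1, 2, 3, 4, 5, 6}


def postings(term):
    """Return the documents filed under a term (empty set if the term is unknown)."""
    return INDEX.get(term, set())


def search():
    """Evaluate ((information AND retrieval) OR storage) NOT manual."""
    both = postings("information") & postings("retrieval")  # logical product (AND)
    either = both | postings("storage")                     # logical sum (OR)
    return either - postings("manual")                      # logical difference (NOT)


retrieved = search()
not_retrieved = ALL_DOCS - retrieved
print(sorted(retrieved))      # [1, 2, 5]
print(sorted(not_retrieved))  # [3, 4, 6]
```

In this sketch every document either satisfies the formulation or does not, so the collection is split into exactly two groups with no ordering inside either one.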
In the manual systems described, each entry into the index produces a set of documents that match the search formulation, usually called the retrieval set; the remainder of the collection is considered to be not retrieved. User satisfaction is related both to the finding of relevant