Scientific Report No. IRS-13
Information Storage and Retrieval
Search Matching Functions
E. M. Keen
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

range of weights in the vectors; that is, combinations of words may cause some concepts to be very highly weighted, and rare examples of weights in excess of 100 have been noted.

The superiority of numeric on Cran-1 Stem, which was left in some doubt by the average measures (although not by the individual request figures), is only marginally established by looking at the 198 individual relevant documents separately. Some 41 of the documents have identical ranks on numeric and logical, while 85 have better ranks on numeric and 72 have better ranks on logical. The 85 that are better on numeric show larger increases in the rank positions involved, as shown in Figure 25.

These large-scale changes, which work in both directions, some favoring numeric and some logical, are illustrated for one individual request by the data in Figure 26. Six of the ten highest ranked documents on logical receive rank positions below 10 in numeric; this large change in document ranks favors numeric in this example. In order to determine how the weighting scheme is used to achieve a more effective discrimination between relevant and non-relevant documents, further data on these 17 documents are given in Figures 27, 28, 29 and 30. Figure 27 shows the ordering resulting from logical (cosine), giving correlations, matching concepts and total document concepts. Figure 28 gives the numeric ordering, together with data about the matching concepts and document concepts from which the final correlation is derived. For example, the correlation of 0.41421 given to document [illegible]20 (relevant) is derived from:

\[
\text{Cosine Numeric Correlation}
= \frac{\text{sum over matching concepts of (doct. weight} \times \text{request weight)}}
       {\sqrt{(\text{sum of squares of req. wts.}) \times (\text{sum of squares of doct. wts.})}}
= 0.4421
\]

(numerical terms as printed, not legible: 1158433x1235,424)
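The derivation above can be sketched in a few lines of Python. This is a minimal illustration and not the SMART system's own code: the function names `cosine_correlation` and `to_logical`, the dictionary representation of the vectors, and the concept names and weights are all assumptions invented for the example. The logical case is assumed here to be the same cosine formula applied after every non-zero weight is replaced by 1.

```python
import math


def cosine_correlation(request, document):
    """Cosine correlation between two concept-weight vectors.

    Each vector is a dict mapping a concept identifier to its weight
    (a hypothetical representation chosen for this sketch).
    """
    # Numerator: sum over matching concepts of doct. weight x request weight.
    matching = set(request) & set(document)
    numerator = sum(request[c] * document[c] for c in matching)

    # Denominator: square root of (sum of squares of request weights)
    # times (sum of squares of document weights).
    req_sq = sum(w * w for w in request.values())
    doc_sq = sum(w * w for w in document.values())
    if req_sq == 0 or doc_sq == 0:
        return 0.0  # an empty vector correlates with nothing
    return numerator / math.sqrt(req_sq * doc_sq)


def to_logical(vector):
    """Binary ('logical') form: every non-zero weight becomes 1."""
    return {c: 1 for c, w in vector.items() if w != 0}


# Hypothetical weights, invented for illustration only.
request = {"flow": 12, "boundary": 12, "layer": 24}
document = {"flow": 12, "layer": 36, "shock": 12, "wave": 12}

numeric = cosine_correlation(request, document)
logical = cosine_correlation(to_logical(request), to_logical(document))
print(f"numeric = {numeric:.4f}, logical = {logical:.4f}")
```

In this invented case the two matching concepts carry the heaviest weights in both vectors, so the numeric correlation comes out higher than the logical one. This is the kind of rank movement in favor of numeric that the individual request data illustrate, although with other weights the change can equally go the other way.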