Scientific Report No. ISR-10
Information Storage and Retrieval
Search Request Formulation (chapter)
Joseph John Rocchio
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

[Figure: correlation matrix]

It is assumed that the set $\{r_1, r_2, r_3, r_4, r_5\}$ has been identified as a relevant subset in response to some initial search request, $q_0$. These document images are correlated against each other, producing the correlation matrix shown. If this matrix is used as a basis for partitioning the set by some clustering technique, the subsets $R^1 = \{r_1, r_2, r_5\}$ and $R^2 = \{r_3, r_4\}$ will result. In this case, then, the system can generate two new queries by using each of these subsets together with the nonrelevant set $S$. Thus the following pair of new search requests can be formed:

\[ q_1^{1} = n_1 n_2 q_0 + n_2 \sum_{r_i \in R^1} r_i - n_1 \sum_{s_i \in S} s_i \tag{5.11} \]

and

\[ q_1^{2} = n_1' n_2 q_0 + n_2 \sum_{r_i \in R^2} r_i - n_1' \sum_{s_i \in S} s_i \tag{5.12} \]

where $n_1 = n(R^1)$, $n_1' = n(R^2)$, and $n_2 = n(S)$.

On the basis of a partition of the relevant subset identified by the user, two new search requests have been formed from a single original request. Roughly, this procedure amounts to allowing the user to identify particular documents in the collection and request additional references "like" those he has singled out. By examining the degree of association among the identified documents, it is possible to determine whether this can be done efficiently with a single search request or whether multiple searching is required.
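The procedure described above can be illustrated with a minimal Python sketch. Everything in it other than the query formula is an assumption made for illustration. The report does not specify the clustering technique, so a simple single-link grouping on cosine correlation stands in for it. The function names (`cosine`, `partition`, `new_query`), the threshold of 0.5, and the four-term toy vectors are all hypothetical. The vectors were chosen so that the relevant set splits into two groups of three and two documents.

```python
from math import sqrt


def cosine(u, v):
    """Cosine correlation between two term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def partition(vectors, threshold):
    """Group vectors by single-link clustering on cosine correlation.

    Illustrative stand-in only: the report says "some clustering
    technique" without naming one.
    """
    clusters = []
    for i, v in enumerate(vectors):
        # Every existing cluster containing a vector correlated with v.
        linked = [c for c in clusters
                  if any(cosine(v, vectors[j]) > threshold for j in c)]
        merged = [i]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


def new_query(q0, relevant, nonrelevant):
    """Form n1*n2*q0 + n2*sum(relevant) - n1*sum(nonrelevant).

    Same shape as equations (5.11) and (5.12), with n1 the size of the
    relevant subset used and n2 the size of the nonrelevant set.
    """
    n1, n2 = len(relevant), len(nonrelevant)
    query = []
    for t in range(len(q0)):
        r_sum = sum(r[t] for r in relevant)
        s_sum = sum(s[t] for s in nonrelevant)
        query.append(n1 * n2 * q0[t] + n2 * r_sum - n1 * s_sum)
    return query


# Hypothetical toy data over four index terms.
q0 = [1, 1, 0, 0]
r = [[2, 1, 0, 0],   # r1
     [1, 2, 0, 0],   # r2
     [0, 0, 2, 1],   # r3
     [0, 0, 1, 2],   # r4
     [2, 2, 0, 0]]   # r5
S = [[0, 1, 0, 1], [0, 0, 1, 1]]

groups = partition(r, threshold=0.5)   # indices are zero-based
queries = [new_query(q0, [r[i] for i in g], S) for g in groups]
```

With this toy data the grouping yields two subsets, so two queries are formed. Had every pair of relevant documents correlated above the threshold, `partition` would return a single group and one query would suffice, which is the single-versus-multiple search decision the text describes.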