NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1). Retrieval Experiments with a Large Collection using PIRCS. K. Kwok, L. Papadopoulos, K. Kwan. National Institute of Standards and Technology. Donna K. Harman.

B, C, D are index terms with weights b, c, d respectively; ∧, ∨ are the boolean AND and OR, each given a fixed weight p = 2; and all clause weights are set to 1. If document d_i also has these three terms with weights 0 <= b', c', d' <= 1, then the similarity between d_i and CL could be evaluated recursively as:

$x' = Sim(d_i, X) = \sqrt{\frac{(cc')^2 + (dd')^2}{c^2 + d^2}}$,

$Sim(d_i, CL) = 1 - \sqrt{\frac{b^2 (1-b')^2 + 1 \cdot (1-x')^2}{b^2 + 1}}$    (5)

Here X denotes the sub-clause of CL that combines C and D, and x' is its evaluated weight for d_i. All document and query term weights are taken from the edges of the net, so that the system is fully automatic once the boolean expression has been defined manually.

Our retrieval results for automatic query construction are then based on combining methods (a) and (b), thus: $W_i^{auto} = (W_i + V_i)/2$. Those for manual query construction are based on $W_i^{man} = r \cdot W_i^{auto} + s \cdot Sim(d_i, CL)$. Both make use of a combination of retrieval methods. The constants r, s are chosen as 0.65 and 0.35 respectively. The expectation is that adding soft-boolean structure may enhance the retrieval results of the automatic method for the same queries. Our soft-boolean evaluation algorithm currently accounts only for terms that also appear in the network for this query; additional terms that may have been inserted manually are ignored.

2.4. Network Implementation with Learning

2.4.1 Network for Routing, Ad Hoc and Feedback without/with Query Expansion

The use of a network can provide a unified view of many retrieval algorithms and is a flexible tool for implementation. In PIRCS, retrieval methods (a) and (b) of the previous section are implemented as feedforward and feedbackward processing in a Query-Term-Document (Q-T-D) network as presented in [8,9].
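As an illustration, the soft-boolean evaluation of Eqn. 5 and the two score combinations of the previous section can be sketched in Python. This is a minimal sketch under stated assumptions, not the PIRCS implementation: it assumes, as the form of Eqn. 5 suggests, that CL is B AND (C OR D); the function names (soft_or, soft_and, sim_clause, combine_auto, combine_manual) are hypothetical; and the zero-denominator guard is an addition not discussed in the text.

```python
P = 2  # operator exponent; the text fixes p = 2 for both AND and OR


def soft_or(pairs, p=P):
    """Soft OR over (query_weight, document_weight) pairs, as in the x' formula."""
    den = sum(w ** p for w, _ in pairs)
    if den == 0:  # assumption: a clause with all-zero query weights scores 0
        return 0.0
    num = sum((w * v) ** p for w, v in pairs)
    return (num / den) ** (1.0 / p)


def soft_and(pairs, p=P):
    """Soft AND over (query_weight, document_weight) pairs, as in Eqn. 5."""
    den = sum(w ** p for w, _ in pairs)
    if den == 0:  # assumption: same guard as in soft_or
        return 0.0
    num = sum((w ** p) * (1.0 - v) ** p for w, v in pairs)
    return 1.0 - (num / den) ** (1.0 / p)


def sim_clause(b, c, d, b_doc, c_doc, d_doc):
    """Sim(d_i, CL) for the assumed clause CL = B AND (C OR D)."""
    x = soft_or([(c, c_doc), (d, d_doc)])      # x' = Sim(d_i, X)
    return soft_and([(b, b_doc), (1.0, x)])    # sub-clause X carries weight 1


def combine_auto(w_i, v_i):
    """Automatic-query score: average of the scores from methods (a) and (b)."""
    return (w_i + v_i) / 2.0


def combine_manual(w_auto, sim, r=0.65, s=0.35):
    """Manual-query score: r * W_auto + s * Sim(d_i, CL)."""
    return r * w_auto + s * sim
```

For example, with unit query weights and a document that contains B and C fully but not D, sim_clause(1, 1, 1, 1.0, 1.0, 0.0) evaluates to about 0.793: the missing OR-term lowers the score without zeroing it, which is the intended softening of strict boolean matching.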
A binary tree representing a boolean expression can also be hung onto the net for method (c). These are shown in Figs. 1, 2.

[Fig. 1: 3-Layer PIR Network, with layers Q, T, D and processing directions QTD and DTQ]

[Fig. 2: Soft-Boolean Query Network]

The edges of the net are initialized as follows: w[illegible] = [illegible] as in Eqn. 2, and similarly for w[illegible] and w[illegible] (t_k acting on q[illegible]) as in Eqn. 1 and w_ki (d_i acting on t_k). Activation on d_i = 1 gated through w_ki deposits on t_k