NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1), edited by Donna K. Harman, National Institute of Standards and Technology

Optimizing Document Indexing and Search Term Weighting Based on Probabilistic Models
N. Fuhr, C. Buckley

For each phrase occurring in a document, indexing weights for the phrase as well as for its two components (as single words) were computed. In the following, we will refer to the indexing with single words only as "word indexing" and to the indexing using both single words and phrases as "phrase indexing".

3 Retrieval

For the probabilistic document indexing weights as described above, there are specific retrieval functions which yield a probabilistic or utility-theoretic ranking of documents w.r.t. a query. These functions differ in whether or not they consider relevance feedback information for the specific query. For this reason, different retrieval functions were applied for the ad-hoc queries and the routing queries.

3.1 Ad-hoc queries

For the ad-hoc queries, we used the linear utility-theoretic retrieval function described in [Wong & Yao 89]. Let $q_k^T$ denote the set of terms occurring in the query, $d_m^T$ the set of document terms, and $u_{im}$ the indexing weight $u(t_i, d_m)$. If $c_{ik}$ gives the utility of term $t_i$ for the actual query $q_k$, then the utility of document $d_m$ w.r.t. query $q_k$ can be computed by the retrieval function

    $\sum_{t_i \in q_k^T \cap d_m^T} c_{ik} \, u_{im}$.   (1)

The only problem here is the estimation of the utility weights $c_{ik}$, for which we do not have a theoretically justified method. As a heuristic approach, we used the SMART tf.idf weights, where tf denotes the number of occurrences of the term $t_i$ in the query $q_k$ [Salton & Buckley 88]. We assume that there are other choices for this parameter which could significantly improve retrieval quality.
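The retrieval function (1) with tf.idf query weights can be sketched as follows. This is a minimal illustration, not the original implementation: the function names, the dictionary representation of the indexing weights $u_{im}$, and the particular idf formula (log of collection size over document frequency) are assumptions for the example.

```python
import math

def tf_idf_query_weights(query_terms, doc_freq, num_docs):
    """Heuristic utility weights c_ik: occurrences of the term in the
    query (tf) times inverse document frequency (idf).
    The exact idf variant here is an assumption."""
    weights = {}
    for t in set(query_terms):
        tf = query_terms.count(t)
        idf = math.log(num_docs / doc_freq.get(t, 1))
        weights[t] = tf * idf
    return weights

def retrieval_score(query_weights, doc_weights):
    """Retrieval function (1): sum c_ik * u_im over the terms that
    occur in both the query and the document."""
    shared = set(query_weights) & set(doc_weights)
    return sum(query_weights[t] * doc_weights[t] for t in shared)

# Example: a query with a repeated term, scored against one document
# whose probabilistic indexing weights u_im are given directly.
qw = tf_idf_query_weights(["text", "retrieval", "text"],
                          {"text": 50, "retrieval": 5}, 1000)
doc = {"text": 0.4, "retrieval": 0.7, "index": 0.2}
score = retrieval_score(qw, doc)
```

Note that the sum runs only over $q_k^T \cap d_m^T$, so documents sharing no terms with the query receive a score of zero and can be skipped entirely in an inverted-file implementation.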
    run                                 tf.idf   fuhra1 (words)   fuhrp1 (phrases)
    average precision for recall points:
      11-pt Avg.                        0.1750   0.1943           0.2054
      3-pt Avg.                         0.1159   0.1399           0.1664
    query-wise comparison with median:
      11-pt Avg.                                 32:14            32:17
      Prec. @ 100 docs                           25:21            35:13
    best/worst results:
      11-pt Avg.                                 6(2)/1(1)        4(1)/1(2)
      Prec. @ 100 docs                           2(2)/1(2)        8(3)/1(2)
    single words vs. phrases:
      11-pt Avg.                                          25:25
      Prec. @ 100 docs                                    21:28

Table 1: Results for ad-hoc queries

The retrieval function (1) was applied for both kinds of document indexing: words only were considered in run fuhra1, and both words and phrases in run fuhrp1. Figure 2 shows the recall-precision curves for both runs in comparison to a standard SMART tf.idf run. It can be seen that