NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Combining Evidence from Multiple Searches
E. Fox, M. Koushik, J. Shaw, R. Modlin, D. Rao
National Institute of Standards and Technology
Donna K. Harman

Table 1: Summary of Retrieval Runs

Name         Model     Sim. Function    Weighting Scheme
bool         Boolean   Boolean          binary term weights
pnorm1.0     p-norm    p-norm           binary term weights, p=1.0
pnorm1.5     p-norm    p-norm           binary term weights, p=1.5
pnorm2.0     p-norm    p-norm           binary term weights, p=2.0
cosine.atn   vector    cosine           aug_norm * idf
cosine.nnn   vector    cosine           tf
inner.atn    vector    inner product    aug_norm * idf
inner.nnn    vector    inner product    tf

• Retrieval based on the p-norm model

The Boolean queries described above were also used for the p-norm runs. Retrieval runs were made with three different p-values: 1.0, 1.5, and 2.0. No query term or clause weights were used during the p-norm runs.

The different runs are summarized in Table 1. Note that in Phase 1 of our efforts, we used all eight runs listed. In Phase 2, however, we focused on the pnorm1.0 case, with document weighting, and the four vector runs.

3.2 Weighting Schemes

The weighting schemes mentioned above are detailed in Table 2.

Table 2: Weighting Scheme Options

• Term frequency normalization. This has the following choices:
    (n)one       new_tf = tf
    (b)inary     new_tf = 1
    (m)ax_norm   new_tf = tf / max_tf
    (a)ug_norm   new_tf = 0.5 + 0.5 * tf / max_tf

• Document weights. This has the following choices:
    (n)one       new_wt = new_tf
    (t)fidf      new_wt = new_tf * idf
    (p)rob       new_wt = new_tf * log((num_docs - coll_freq) / coll_freq)

• Document vector normalization. This can be either of:
    (n)one       norm_wt = new_wt
    [illegible]

This allows for a very flexible approach to changing the document vector weights, as can be seen from Table 3.
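
The three-letter codes in the run names (for example atn and nnn) select one option from each group in Table 2. The following is a minimal Python sketch of how such a code might be applied to one document, not the authors' implementation. The function name term_weights and its arguments are hypothetical. It assumes that idf is log(num_docs / coll_freq), that coll_freq is the number of documents containing the term, and that the (p)rob factor is log((num_docs - coll_freq) / coll_freq). Only the (n)one vector normalization is handled, since the other option is not legible in the table.

```python
import math


def term_weights(tf, coll_freq, num_docs, scheme):
    """Weight the terms of one document using a three-letter scheme code.

    tf        -- dict: term -> raw frequency in this document
    coll_freq -- dict: term -> number of documents containing the term
                 (assumed meaning of coll_freq)
    num_docs  -- number of documents in the collection
    scheme    -- e.g. "atn": tf normalization, document weight, vector norm
    """
    if len(scheme) != 3:
        raise ValueError("scheme must have exactly three letters")
    tf_code, wt_code, norm_code = scheme
    if not tf:
        return {}

    max_tf = max(tf.values())
    weights = {}
    for term, freq in tf.items():
        # Step 1: term frequency normalization.
        if tf_code == "n":
            new_tf = float(freq)
        elif tf_code == "b":
            new_tf = 1.0
        elif tf_code == "m":
            new_tf = freq / max_tf
        elif tf_code == "a":
            new_tf = 0.5 + 0.5 * freq / max_tf
        else:
            raise ValueError(f"unknown tf normalization: {tf_code!r}")

        # Step 2: document weight.
        cf = coll_freq[term]
        if wt_code == "n":
            new_wt = new_tf
        elif wt_code == "t":
            # Assumed idf definition: log(num_docs / coll_freq).
            new_wt = new_tf * math.log(num_docs / cf)
        elif wt_code == "p":
            # Assumed reading of the (p)rob entry in Table 2.
            new_wt = new_tf * math.log((num_docs - cf) / cf)
        else:
            raise ValueError(f"unknown document weight: {wt_code!r}")

        weights[term] = new_wt

    # Step 3: document vector normalization. Only (n)one is sketched here.
    if norm_code != "n":
        raise ValueError(f"unsupported vector normalization: {norm_code!r}")
    return weights


if __name__ == "__main__":
    doc_tf = {"retrieval": 4, "boolean": 1}
    doc_cf = {"retrieval": 10, "boolean": 100}
    for code in ("nnn", "atn"):
        print(code, term_weights(doc_tf, doc_cf, 1000, code))
```

With the nnn code the raw term frequencies pass through unchanged, while atn damps the effect of high-frequency terms and scales each term by its idf.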