NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Retrieval Experiments with a Large Collection using PIRCS
K. Kwok
L. Papadopoulos
K. Kwan
National Institute of Standards and Technology
Donna K. Harman
activation equal to the proportion that the term is used in d_i. These activated terms spread to a target query q_a,
gated through w_ak, which embodies the odds that, given term t_k, q_a is relevant. The sum of activations received
at q_a implements the query-focused retrieval method (a) of Section 2.3, while processing in the reverse QTD direction
implements the document-focused retrieval method (b). Moreover, edge weights from the net can
also be used to initialize the tree leaf nodes for soft-boolean evaluation in the DTQ direction, Fig.2.
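The query-focused DTQ pass can be sketched as follows. This is only an illustration of the spreading-activation idea, not the PIRCS implementation: the function and variable names (rsv_dtq, doc_term_freq, w_ak) are ours, and the example edge weights are arbitrary rather than derived from Eqn.1.

```python
# Hypothetical sketch of DTQ spreading activation: document d activates each
# of its terms in proportion to the term's use in d; each activated term t_k
# forwards its activation to query q_a through the edge weight w_ak; the sum
# of activations arriving at q_a is the query-focused retrieval status value.

def rsv_dtq(doc_term_freq, w_ak):
    """doc_term_freq: {term: count in d}; w_ak: {term: edge weight to q_a}."""
    total = sum(doc_term_freq.values())
    rsv = 0.0
    for term, count in doc_term_freq.items():
        activation = count / total               # proportion term is used in d
        rsv += activation * w_ak.get(term, 0.0)  # gated through w_ak
    return rsv

# A 4-token document, two of whose terms appear in the query with
# (arbitrary) weights 2.0 and 1.0:
print(rsv_dtq({"trec": 2, "retrieval": 1, "the": 1},
              {"trec": 2.0, "retrieval": 1.0}))   # -> 1.25
```

Running the same weights in the QTD direction would instead sum, at each document node, the activation spread from the query's terms, giving the document-focused score.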
Relevance feedback, which has been demonstrated in numerous experiments as the most effective tool for
improving retrieval, is modeled as training in our net. For example, the edge weight w_ak (w_ik) adapts
according to a learning algorithm based on the average activation x_k induced in term k by the relevant set,
the current p factor (see Eqn.1) on the edge, and a learning rate η_Q (η_D), thus:
DTQ: Δw_ak = Δp_ak / [p_ak^old (1 − p_ak^old)],   Δp_ak = η_Q (x_k − p_ak^old)
QTD: Δw_ik = Δp_ik / [p_ik^old (1 − p_ik^old)],   Δp_ik = η_D (x_k − p_ik^old)
see [8,9] for details and Fig.3 illustrating the DTQ process. Learning in both directions is allowed: DTQ
query-focused training is done when we know the set of documents relevant to a query and corresponds
to probabilistic retrieval [10,11], and QTD document-focused training is done when we know the set of
queries relevant to a document and corresponds to probabilistic indexing [12]. Query-focused training lets
query representations improve with experience and prepares them to match new similar documents better,
and is normally associated with relevance feedback [13]. Document-focused training lets document
representations improve with experience and prepares them to match new similar queries better, and is
normally associated with dynamic document space modification [13]. They provide a precision-enhancing
tool because term weights are rendered `sharper' towards relevant items. Our network implementation
differs from the traditional approaches in that two sets of weights are associated with an item, e.g. w_ik and
w_ki. The w_ki are the observed properties and are not modified, while w_ik embeds inference and adapts as
more evidence is gathered. Moreover, effects of training from both query-focused and document-focused
processes are combined into one RSV, which has been shown to lead to cooperatively better ranking
results. These ideas were first introduced in a series of papers [14,15,1,2]. With learning capability, the
net becomes a two-layer direct-connect artificial neural network in each direction.
Fig.3: Query-Focused DTQ / QTD Learning

Fig.4: DTQ Learning with Expansion
An additional feature allowed in our feedback learning is query expansion. The idea is that terms that are