NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Workshop on: Machine Learning and Relevance Feedback
Report of Discussion Group
Norbert Fuhr
Stephen Robertson
Donna K. Harman
National Institute of Standards and Technology
first. The level of abstraction used in the definition of the features plays an important role for
the applicability or non-applicability of certain learning methods, since most methods require
a certain amount of data. In general, a higher level of abstraction yields more learning data;
on the other hand, the decision resulting from the learning algorithm may become too
unspecific. For this reason, there is a need for different levels of abstraction, from which one
may choose the most appropriate one for the actual circumstances. However, for text there
are effectively only two levels of abstraction, namely either the term itself or its statistical
parameters (e.g. within-document-frequency and inverse document frequency). A possible
intermediate level would be sets of synonym terms.
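As an illustration of the statistical level of abstraction mentioned above, the following sketch maps each (document, term) pair to its within-document frequency and inverse document frequency. It is a minimal example, not a TREC system component; real systems normalise tf and smooth idf.

```python
import math
from collections import Counter

def term_features(docs):
    """Map each (doc index, term) pair to its statistical abstraction:
    (within-document frequency, inverse document frequency)."""
    n = len(docs)
    df = Counter()                       # number of documents containing each term
    for doc in docs:
        df.update(set(doc))
    features = {}
    for i, doc in enumerate(docs):
        tf = Counter(doc)                # within-document frequencies
        for term, f in tf.items():
            idf = math.log(n / df[term])
            features[(i, term)] = (f, idf)
    return features

docs = [["retrieval", "text", "text"], ["learning", "text"]]
feats = term_features(docs)
# "text" occurs in both documents, so its idf is log(2/2) = 0
```

At this level of abstraction, two different terms with the same (tf, idf) pair become indistinguishable to the learner, which is exactly the trade-off between more learning data and less specific decisions described above.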
In order to improve the effectiveness of ML methods for a given learning sample, prior
knowledge plays an important role. For example, we may assume that the weight of a term
with respect to a document is a monotonic function of its within-document-frequency. For
this reason, a regression method which is implicitly based on this type of prior knowledge is
more appropriate than e.g. a classification tree which makes no such assumption.
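The monotonicity assumption above can be built directly into the estimator. As a sketch (not the method used by any TREC participant), the pool-adjacent-violators algorithm fits the best non-decreasing weight sequence to noisy relevance estimates observed at increasing within-document frequencies:

```python
def pav(ys):
    """Pool-adjacent-violators: least-squares fit of a non-decreasing
    sequence to ys, encoding the prior knowledge that term weight is a
    monotonic function of within-document frequency."""
    blocks = [[y, 1] for y in ys]        # each block: [level, number of points]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][0] > blocks[i + 1][0]:           # monotonicity violated
            total = blocks[i][0] * blocks[i][1] + blocks[i + 1][0] * blocks[i + 1][1]
            count = blocks[i][1] + blocks[i + 1][1]
            blocks[i] = [total / count, count]        # merge into one block
            del blocks[i + 1]
            if i > 0:
                i -= 1                   # merging may create a violation to the left
        else:
            i += 1
    fit = []
    for level, count in blocks:
        fit.extend([level] * count)
    return fit

# noisy relevance weights observed at tf = 1, 2, 3, 4
fit = pav([0.2, 0.5, 0.4, 0.9])
# fit is non-decreasing: [0.2, 0.45, 0.45, 0.9]
```

An unconstrained classification tree would happily reproduce the dip at tf = 3; the monotone fit averages it away, which is the sense in which prior knowledge compensates for a small learning sample.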
We may distinguish different sources of learning data. Most important, we have relevance
feedback data. Here one may think of different levels of response, either by using mul-
tivalued relevance scales or by indicating important paragraphs within a long document. For
some applications, it may also be necessary to get more specific feedback with respect to
internal decisions of the system. For example, for tuning a phrase detection algorithm, it
would be useful to get decisions about each specific phrase. As a third possible source of
learning data, the combination of different sources of knowledge (e.g. thesauri and corpora)
also might yield new information.
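A feedback record supporting the richer responses described above might look as follows; the field names are purely illustrative and do not correspond to any TREC data format:

```python
from dataclasses import dataclass, field

@dataclass
class Judgement:
    """One relevance feedback record on a multivalued scale, optionally
    refined by paragraph-level judgements within a long document."""
    query_id: str
    doc_id: str
    relevance: int                       # e.g. 0 = non-relevant .. 3 = highly relevant
    relevant_paragraphs: list = field(default_factory=list)

feedback = [
    Judgement("Q1", "D7", 3, relevant_paragraphs=[2]),
    Judgement("Q1", "D9", 0),
]
# document-level training pairs, preserving the graded scale
pairs = [(j.doc_id, j.relevance) for j in feedback]
```

The same structure extends naturally to feedback on internal decisions, e.g. a per-phrase accept/reject judgement for tuning a phrase detection algorithm.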
For the TREC initiative, there are two possible improvements which would ease the
further application of ML methods. First, relevance feedback data should be enriched by indi-
cating the most important paragraph of the document. As a precondition, there should be
some method for identifying single paragraphs, e.g. by additional SGML-like tags. Since the
assessors will not give the requested type of judgements, the TREC participants would have
to do this job, and NIST should act as collector and distributor for these judgements.
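The SGML-like paragraph tagging suggested above could be sketched as follows; the `<P ID="...">` tag is an assumption for illustration, not an actual TREC markup convention:

```python
import re

def tag_paragraphs(text):
    """Wrap each blank-line-separated paragraph in an SGML-like tag so
    that judgements can refer to single paragraphs by identifier."""
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    return "\n".join(f'<P ID="{i}">{p}</P>' for i, p in enumerate(paras, 1))

def paragraph(tagged, pid):
    """Look up a judged paragraph by its identifier."""
    m = re.search(rf'<P ID="{pid}">(.*?)</P>', tagged, re.S)
    return m.group(1) if m else None

doc = "First paragraph.\n\nSecond paragraph."
tagged = tag_paragraphs(doc)
# paragraph(tagged, 2) -> "Second paragraph."
```

With such identifiers in place, a judgement like "paragraph 2 of document D7 is the most important" becomes machine-readable and can be collected and redistributed uniformly.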