SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Bayesian Inference with Node Aggregation for Information Retrieval
chapter
B. Del Favero
R. Fung
National Institute of Standards and Technology
D. K. Harman
Bayesian Inference with Node Aggregation for
Information Retrieval
Brendan Del Favero
Robert Fung
Institute for Decision Systems Research
350 Cambridge Avenue, Suite 380
Palo Alto, CA 94306
idsr@netcom.com
1 Introduction
Information retrieval can be viewed as an evidential
reasoning problem. Given a representation of a document
(e.g., the presence or absence of selected words and
phrases), and a representation of an information need (e.g.,
topics of interest), the problem of information retrieval is to
infer the degree to which the document matches the
information need. Since probability theory is the classical
choice for automating evidential reasoning, probabilistic
approaches to information retrieval are natural and have
had a long history, starting in the 1960s (Maron & Kuhns,
1960).
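As a rough illustration of this evidential-reasoning view, the sketch
below computes the posterior probability that a document matches an
information need from the presence or absence of a few features. It
assumes the features are conditionally independent given relevance (a
naive Bayes simplification, not the network model developed in this
paper), and the function name, feature names, and probabilities are
all hypothetical.

```python
def posterior_relevance(prior, likelihoods, observed):
    """Return P(relevant | observed features) via Bayes' rule.

    Assumption (not from the paper): features are conditionally
    independent given relevance.

    prior       -- P(relevant) before looking at the document
    likelihoods -- feature -> (P(present | relevant),
                               P(present | not relevant))
    observed    -- feature -> True if present in the document
    """
    p_rel = prior          # running joint: P(relevant, evidence)
    p_not = 1.0 - prior    # running joint: P(not relevant, evidence)
    for feature, present in observed.items():
        given_rel, given_not = likelihoods[feature]
        p_rel *= given_rel if present else 1.0 - given_rel
        p_not *= given_not if present else 1.0 - given_not
    # Normalize the two joint probabilities into a posterior.
    return p_rel / (p_rel + p_not)


# Hypothetical numbers for a single topic and two features.
example_likelihoods = {"merger": (0.8, 0.2), "acquisition": (0.6, 0.1)}
example_score = posterior_relevance(
    0.5, example_likelihoods, {"merger": True, "acquisition": False}
)
```

Documents could then be ranked by this posterior for each topic; a
Bayesian network generalizes the idea by dropping the independence
assumption where the user specifies dependencies.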
In this paper we describe research that adapts and applies
Bayesian networks, a new technology for probabilistic
representation and inference, to information retrieval. The
technology has substantial advantages over older
technologies, including an intuitive representation and a set
of efficient inference algorithms. We discuss the Bayesian
network technology and probabilistic information retrieval
in Section 2 of this paper.
Our research is directed at developing a probabilistic
information retrieval architecture that:
* is oriented towards assisting users who have stable
information needs in routing (i.e., sorting through)
large amounts of time-sensitive material,
* gives users an intuitive language with which to specify
their information needs,
* requires modest computational resources (i.e., memory
and CPU speed), and
* can integrate relevance feedback and training data with
users' judgements to incrementally improve retrieval
performance.
Towards these goals, we have developed a system that
allows a user to specify: multiple topics of interest (i.e.,
information needs), qualitative and quantitative
relationships between the topics, document features that
relate to the topics, and quantitative relationships between
these features and the topics. The system runs on a
Macintosh II computer and can use training data to estimate
any of the quantitative values in the system. We discuss the
particular methods we developed and used in our system in
Section 3.
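To make the kinds of user-specified quantities concrete, the sketch
below shows one way a topic, its associated features, and a
training-data estimate of a feature-topic probability might be
represented. It is an illustration only: the class and function
names, the field layout, and the add-one smoothing are assumptions,
not the methods described in Section 3.

```python
from dataclasses import dataclass, field


@dataclass
class TopicSpec:
    """Hypothetical container for one topic of interest."""
    name: str
    prior: float  # P(topic is relevant to a document)
    # feature -> P(feature present | topic relevant)
    feature_given_topic: dict = field(default_factory=dict)


def estimate_feature_probability(docs, feature):
    """Estimate P(feature present | relevant) from labeled documents.

    docs -- list of (set_of_features, is_relevant) pairs
    Add-one smoothing (an assumption made here) keeps the estimate
    away from exactly 0 or 1 when training data are sparse.
    """
    relevant = [features for features, is_relevant in docs if is_relevant]
    hits = sum(1 for features in relevant if feature in features)
    return (hits + 1) / (len(relevant) + 2)


# Hypothetical training data: three relevant documents, one not.
training_docs = [
    ({"merger"}, True),
    ({"merger", "stock"}, True),
    ({"stock"}, True),
    ({"merger"}, False),
]
topic = TopicSpec(name="corporate mergers", prior=0.1)
topic.feature_given_topic["merger"] = estimate_feature_probability(
    training_docs, "merger"
)
```

A user-supplied value could be stored in the same field and later
replaced or blended with the trained estimate.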
We participated in the exploratory group (Category B) of
the 1993 Text Retrieval Conference (TREC-2), sponsored
by the National Institute of Standards and Technology
(NIST). As a participant in the exploratory group, we were
tasked with working with a subset of the TREC-2 training
and test data. Our training data consisted of Wall Street
Journal (WSJ) articles and our test data consisted of San
Jose Mercury News (SJMN) articles. We chose a subset of
10 topics out of the 50 TREC-2 routing topics to best
illustrate the methods and concepts we developed. The
choice of the 10 topics was reported to the TREC
coordinators prior to our training runs and, of course, prior
to our receipt of the test data. We generated routing queries
for each of the 10 chosen topics, trained against the WSJ
training set to improve our queries, and tested these queries
against the SJMN articles in the test data set.
Our system, including the document handling, feature
extraction, inference, and reporting capabilities, was
developed entirely within the duration of the TREC-2
project (January 1993 to June 1993). The TREC-2 effort
was carried out by the two authors. We describe the
experimental set-up in Section 4 and the results of our test
run in Section 5.
We are very encouraged by the test results and have many
ideas for future research, which we discuss in Section 6.
2 Background
In this section we describe the Bayesian network
technology and outline the previous efforts in probabilistic
information retrieval.
2.1 Bayesian Networks
While probability theory provides a suitable theoretical
foundation for evidential reasoning, a technology based on
probability theory that is computationally tractable and that
includes an effective methodology for acquiring the needed
probabilistic information has been lacking. Recent
developments in Bayesian networks have provided these
features. As the name suggests, the technology is based on
a network representation of probabilistic information
(Howard & Matheson, 1981; Pearl, 1988).
A Bayesian network represents beliefs and knowledge
about a particular class of situations. The use of Bayesian