IRE
Information Retrieval Experiment
Chapter: Laboratory tests: automatic systems, by Robert N. Oddy
Edited by Karen Sparck Jones
Butterworth & Company
Information retrieval is the instrumentalist wing of information science.
Most laboratory work has been explicitly directed toward the establishment
of system design principles. Where information retrieval tests have been
effected automatically, the theories under test have almost always been
prescriptive. This should not be taken for a truism: if a theory is tested by
means of a computer program, or other machine, it does not follow that it is
simply prescriptive. It is not in practice always true, and certainly not
necessarily true, that a program, constructed as laboratory apparatus, is in
some sense a prototype for a real-life system. It is quite possible for a program
to act, primarily, as a formalism, or detailed interpretation of, say, a
descriptive theory. In fact, in the artificial intelligence field, this is frequently
the intention of the programmer [1]. However, within the mainstream of
research on automated information retrieval, it happens to be the case that
theories have been predominantly prescriptive, and laboratory systems have
been put up as potential prototypes. Perhaps it would be realistic to view
these computer test environments rather as engineering workshops than as
laboratories.
Topics that have been investigated in computer laboratories include
classification of index terms [2], document clustering [3-5], automatic indexing
and term weighting [6, 7], relevance feedback [8-10], vector space models [8, 11] and
probabilistic theories [12, 13]. The usual way of testing the ideas has been to
evaluate the ability of retrieval programs based upon them to separate
relevant from non-relevant documents. Many aspects of the experimental
methodologies used in this type of work derive from those developed by
Cleverdon, particularly in the second Cranfield project [14].
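Two of the ideas listed above, term weighting and the vector space model, can be sketched together. The following is a minimal illustration only, not a reconstruction of any system cited in this chapter, and all function names are my own: documents and the query are turned into tf-idf weighted term vectors, and documents are ranked by cosine similarity to the query.

```python
import math
from collections import Counter

def doc_frequencies(tokenized):
    """Number of documents in which each term occurs."""
    df = Counter()
    for toks in tokenized:
        for term in set(toks):
            df[term] += 1
    return df

def tfidf(tf, df, n):
    """tf-idf weighted sparse vector for one bag of terms.

    Terms unseen in the collection (df == 0) are simply dropped.
    """
    return {t: tf[t] * math.log(n / df[t]) for t in tf if t in df}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, docs):
    """Return document indices ordered by decreasing similarity to the query."""
    tokenized = [d.lower().split() for d in docs]
    df, n = doc_frequencies(tokenized), len(docs)
    vecs = [tfidf(Counter(toks), df, n) for toks in tokenized]
    qvec = tfidf(Counter(query.lower().split()), df, n)
    return sorted(range(n), key=lambda i: cosine(qvec, vecs[i]), reverse=True)
```

A laboratory test system of the kind described here would wrap such a ranking function with collection-building and evaluation-monitoring code.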
The research methodology which has dominated laboratory work on
automated information retrieval can be summarized by Figure 9.1. (There
are several obvious feedback loops which I have omitted from the picture.)
Empirical knowledge (about indexing languages, for instance), combined
with the researcher's own intuition, leads him to state some assumptions
about the inputs to, and objectives of, an information retrieval system. From
these he will attempt to derive a system specification, perhaps by means of a
structure of mathematical deductions. Thus, he can build computer programs
which create and organize collections of document descriptions, retrieve
references in response to compatibly formulated queries, and monitor their
own activities for evaluation purposes. The evaluation uses test data, that is,
documents and queries chosen by the experimenter and with known
characteristics; and it is normally the performance of the system with respect
to the objectives that is evaluated, and not the plausibility of the assumptions
or theory directly. Over the years the amount of rigour and effort allotted to
the various components of the methodology has fluctuated. For example, as
experience has been gained with certain classes of retrieval test system,
programs have been assembled into flexible packages, so that new
mechanisms can more easily be built from the components of old. Thus, the
effort required in implementing programs has declined. At the same time,
there is a new wave of mathematics in information retrieval research [15, 16].
The assumptions are stated more rigorously than before, and the theoretical
development prior to system construction has become a focus of
attention [5, 13, 17, 18].
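The evaluation step described above, scoring a system's output against test data with known characteristics, is conventionally expressed as precision and recall computed from the experimenter's relevance judgments. A minimal sketch, assuming binary relevance and set-based retrieval (the function name is my own):

```python
def precision_recall(retrieved, relevant):
    """Score a retrieved set against known relevance judgments.

    precision: fraction of retrieved documents that are relevant;
    recall: fraction of relevant documents that were retrieved.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Note that such scores measure performance against the system's objectives only; as the text observes, they do not directly test the plausibility of the underlying assumptions or theory.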
Another discernible variation in methodology, with time, is that the