which may confuse the results; hence the idea of conducting experiments
under laboratory conditions, with all variables controlled as far as possible.
On the other hand, in order to answer questions which relate directly to real
problems in the design of retrieval systems, and to provide answers which
will apply in real situations, a test must be conducted in (as nearly as possible)
an operational environment.
The conflict between these two aims is a real and continuing one. As a
result, a whole spectrum of testing methods has been developed, ranging
from pure laboratory experiments to the study of real systems and users in
their operating environment. This spread of methods is reflected in the
various comments that follow on components of the 'normal' retrieval test.
2.2 Components of the archetype
The system
That we need an information retrieval system in order to do a retrieval test
is not quite as obvious as might at first appear. If we go back to the question(s)
that gave rise to the test in the first place, they will very often revolve around
some particular component of the system: the index language, say, or the
indexing process, or the process of search formulation. Suppose, then, that
we are concerned with the indexing process. Do we even need a searching
process in order to do an experiment? Is there no way we could test the
alternative indexing processes directly, without doing any searching?
The short answer is: no, there is no satisfactory way. By and large it is not
possible (at present) to set up criteria for the indexing process which we can
be confident will relate to the overall performance of the information retrieval
system in the right way. It follows that, if we want to decide between
alternative indexing strategies, for example, we must use these strategies
as part of a complete information retrieval system, and examine its
overall performance (with each of the alternatives) directly.
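To make the point concrete, here is a minimal illustrative sketch in Python (it is not part of the original chapter, and every name, the toy collection and the relevance judgements in it are invented): two toy indexing strategies are compared, not by any criterion applied to the indexing itself, but by embedding each in the same complete retrieve-and-evaluate loop and measuring overall precision and recall.

    def index_full_words(doc):
        """Hypothetical indexing strategy A: index each word as it stands."""
        return {w.lower().strip(".,") for w in doc.split()}

    def index_stems(doc):
        """Hypothetical indexing strategy B: crude stemming (strip trailing 's')."""
        return {w.rstrip("s") for w in index_full_words(doc)}

    def search(query, index, strategy):
        """The searching component, held constant across strategies:
        retrieve every document sharing at least one term with the query."""
        terms = strategy(query)
        return [doc_id for doc_id, doc_terms in index.items() if terms & doc_terms]

    def precision_recall(retrieved, relevant):
        """Overall system performance, measured on the final output only."""
        hits = len(set(retrieved) & relevant)
        precision = hits / len(retrieved) if retrieved else 0.0
        recall = hits / len(relevant) if relevant else 0.0
        return precision, recall

    # Invented toy collection, query, and relevance judgements.
    docs = {1: "retrieval systems", 2: "indexing system design", 3: "bridge design"}
    query, relevant = "systems of retrieval", {1, 2}

    for strategy in (index_full_words, index_stems):
        index = {d: strategy(text) for d, text in docs.items()}
        p, r = precision_recall(search(query, index, strategy), relevant)
        print(f"{strategy.__name__}: precision={p:.2f} recall={r:.2f}")

Note that nothing about either indexing strategy is judged in isolation; the only measurements taken are of the complete system's output, which is precisely the constraint described above.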
This is a severe constraint on any test: it is as if, lacking the necessary
theories of mechanics and the strength of materials, bridge designers had to
build a number of test bridges to test each material and each structural
component which they might use in their designs. (The recent problems with
box-girder bridges suggest that the idea is not so far-fetched!)
The second, related problem area under this heading concerns the
definition of the boundaries of the system, particularly in connection with the
user. The usual view of the information retrieval system manager is that
the (narrow) system is that which is under his control; the user normally falls
outside this narrow definition. But there are two strong reasons for including
at least some of the cognitive processes that the user goes through in the
definition of the system for the purpose of the test:
(1) Some of the processes that users go through can be influenced by narrow-
system components such as the index language;
(2) In terms of the arguments immediately preceding, we do not know how
to relate narrow-system behaviour to wide-system performance.
Both points relate to the idea discussed above, that the function of an
information retrieval system is to help the user to satisfy his/her information