IRE
Information Retrieval Experiment
The methodology of information retrieval experiment
chapter
Stephen E. Robertson
Butterworth & Company
Karen Sparck Jones
Measurement: costs and times
If the object of a test is to determine something about cost-effectiveness or
cost-benefit, then clearly we must measure costs or some related factor.
Generally speaking, one is not concerned with the overall costs of the entire
system, but with the costs of certain specific parts. Thus an operational system
manager might want to know what happens (to both costs and performance)
if a certain part of the system is changed. I argued above that for
effectiveness, one must treat the entire system as a whole. For costs, generally,
the opposite is true: that is, since costs are in a strict sense additive, it is easiest
and most sensible to cost only those parts of the system that may change.
This is not the place for an extensive discussion of how to go about costing
a system or parts of it. It may be helpful, however, to note the almost
universal use of the equation cost = time. If the difference in cost between
two systems depends only on a difference in the time spent on one particular
operation (say human time on indexing or machine time on searching), then
one can do the appropriate cost-effectiveness comparison without ever
bringing in explicit costs, simply regarding the time spent (by human or
machine) as equivalent to cost. This avoids many accounting problems, and
is normally the only method of including costs that is open to the laboratory
researcher.
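As an illustration of treating time as a stand-in for cost, the following minimal Python sketch compares two system variants that differ only in the time spent on one operation. The class name `Variant`, the function `compare`, and the sample figures are hypothetical and are not taken from any test described here; the effectiveness value is assumed to be whatever single measure the experimenter has chosen.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """One system variant (hypothetical structure, for illustration only)."""
    name: str
    minutes_per_document: float  # time spent on the single operation that differs
    effectiveness: float         # any chosen effectiveness measure, e.g. recall


def compare(baseline: Variant, alternative: Variant) -> dict:
    """Report the extra time and the change in effectiveness of the alternative.

    Time is used directly as the cost figure, so no monetary rates are needed.
    """
    return {
        "extra_minutes_per_document":
            alternative.minutes_per_document - baseline.minutes_per_document,
        "effectiveness_change":
            alternative.effectiveness - baseline.effectiveness,
    }


# Invented figures: deeper indexing is assumed to take longer per document.
shallow = Variant("shallow indexing", minutes_per_document=4.0, effectiveness=0.55)
deep = Variant("deep indexing", minutes_per_document=10.0, effectiveness=0.70)
result = compare(shallow, deep)
```

Whether the extra time is worth the gain remains a judgement for the experimenter; the sketch only sets the two differences side by side.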
Measurement: coverage and currency
One group of variables that may be measured in connection with a particular
information service consists of those which relate to the collection of
documents, or to the systems for selection and acquisition, rather than to the
system which retrieves from the collection. This group includes such variables
as coverage and obsolescence. A considerable amount of attention has been
devoted to these variables in the information science literature, under the
general heading of bibliometrics. This work is, by and large, outside the
scope of this book. However, one specific connection should be made.
One of the properties of a retrieval system which one might want to find
out from an experiment is recall, or the proportion of the relevant documents
in the collection that are retrieved. If coverage (for a particular user) is
defined as the proportion of the relevant documents in the universe that are
included in the collection, then it is clear that coverage (of the collection) and
recall (of the system) together determine how many relevant documents the
user sees, given how many there are in the universe. In other words, collection
properties and retrieval system properties interact.
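The interaction between coverage and recall can be made concrete with a short Python sketch that follows the two definitions just given. The function names and the sample counts are assumptions introduced for illustration; the only relationship relied on is that the number of relevant documents seen equals the number in the universe multiplied by coverage and by recall.

```python
def coverage(relevant_in_collection: int, relevant_in_universe: int) -> float:
    """Proportion of the relevant documents in the universe that are in the collection."""
    # Assumes relevant_in_universe is greater than zero.
    return relevant_in_collection / relevant_in_universe


def recall(relevant_retrieved: int, relevant_in_collection: int) -> float:
    """Proportion of the relevant documents in the collection that are retrieved."""
    # Assumes relevant_in_collection is greater than zero.
    return relevant_retrieved / relevant_in_collection


def relevant_seen(relevant_in_universe: int, cov: float, rec: float) -> float:
    """Number of relevant documents the user sees, given the size of the universe."""
    return relevant_in_universe * cov * rec


# Invented counts: 200 relevant documents exist, 100 are held, 60 are retrieved.
cov = coverage(relevant_in_collection=100, relevant_in_universe=200)
rec = recall(relevant_retrieved=60, relevant_in_collection=100)
seen = relevant_seen(200, cov, rec)
```

With these invented counts the collection holds half of the relevant documents in the universe, so even a system with perfect recall could show the user no more than 100 of the 200.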
A second area of interaction concerns currency. In an SDI service, for
example, the delay between a document being published and a user becoming
aware of it is determined both by the selection and acquisition system and by
the indexing and retrieval system. As these examples indicate, in the final
analysis the properties of a retrieval system should not be considered in
isolation from other aspects of the information service of which it is part.
Measurement: explanatory variables
One may be concerned, especially in laboratory tests, with variables which
might explain or predict the final performance of the system. These variables