IRE
Information Retrieval Experiment
Retrieval system tests 1958-1978
chapter
Karen Sparck Jones
Butterworth & Company
inevitable. In particular, it is virtually impossible to apply any rigorous
definition of a `unit' test or `unit' experiment as, say, an explicit comparison
between two values of a primary variable, with all other variables held
constant, or perhaps a comparison between two values of a primary variable
for two values of some secondary variable. This is to some extent because definitions
would lead to intolerable detail, but also because much reported work is
rather difficult to characterize consistently at this level: this in turn is partly
because, as noted above, retrieval system behaviour is not well characterized
in terms of its components. Some large or continuing projects can indeed be
described as conducting series of tests. But in general, an individual test will
be taken, informally, as whatever the authors of a paper regard as a test,
which is chiefly a matter of objectives. This has the advantage of matching
the authors' own views of tests in terms of their primary variables, but the
disadvantage of failing to take full account of the information embodied in
multi-variable tests. That is, where authors are interested in the behaviour of
a primary variable subject to the variation of one or more secondary
variables, we may turn the test upside down and view the secondary variables
as primary. However, attempting to examine the mass of tests done from all
points of view would be impossible, so, though some alternative views will be
noted, these will be rather limited, and will be mainly those recognized by the
research workers responsible for large, multi-variable tests.
12.3 The decade 1958-1968
The year 1958 is a natural starting point for the historical account. The 1958
Washington International Conference on Scientific Information was widely
felt to mark new developments in documentation and information retrieval,
specifically the appearance of a new intellectual tool, post-coordination, and
a new physical tool, the computer. Luhn's auto-abstracts of conference papers
may be taken as a symbol of the possibilities then perceived for automatic
information processing. Research work in the following decade, and
especially in the earlier part of the 1960s, was dominated by studies
comparing newer post-coordinate indexing, perhaps involving a thesaurus,
with older classificatory approaches. The expansion of computing was
associated on the one hand with research on fully automatic indexing and
searching systems, and on the other with work on automated searching. As
had already been demonstrated by the use of punched card machines, post-
coordination was especially suited to automation, and formed the basis of
studies of automatic indexing and searching. Research on statistically-based
indexing, stimulated by Luhn, was especially prominent in the early 1960s.
It was soon recognized that identifying indexing keys by direct automatic
content analysis was not a realistic shorter-term aim, and statistical
techniques for extracting information about words and word relations were
proposed as substitutes. There was considerable enthusiasm for automation,
and optimism about its potentialities, reflected in the effort devoted to
machine translation. The hardware and software limitations of the machines
available nevertheless made research into all kinds of automatic information
processing methods very difficult.
Post-coordination and automation were essentially responses to the