IRE
Information Retrieval Experiment
Retrieval system tests 1958-1978
chapter
Karen Sparck Jones
Butterworth & Company
The current state of retrieval system understanding
operational systems of the mid-1970s. Assuming four groups of system
components, relating to input, store, search, and overall organization, he
notes that, with respect to the store, the important breakthrough was the
National Library of Medicine's use of computers for the preparation of
printed indexes; with respect to input, the key factor was the boost given to
post-coordination by Taube's company, Documentation Inc.; with respect to
searching, the vital development was the growth of online computing; and
with respect to overall system organization, the significant contribution was
made by computer network technology. So, Cleverdon concludes,
'We now have mechanised systems which not only allow the user to do
everything which was possible with a card catalogue or printed index, but
also give him many additional facilities. We can search in natural
language or a controlled vocabulary if we still cling to the old beliefs.
There is the power and flexibility of postcoordinate searching, output can
be automatically printed in a number of different forms, and the citations
can be in a ranked order of probable interest. There are, of course, some
corresponding disadvantages. Many people would consider online searches
to be expensive while others find the systems awkward and complex to use.
Both these are aspects that can only change for the better.'
According to Cleverdon, the contributions to retrieval system development
made by testing have been very limited. He singles out as critical the 1953
Documentation Inc. comparison between uniterm and alphabetical indexes
(Ref. 124), and Swanson's 1962 comparison between simple automatic text
searching and conventional manual indexing (Ref. 125). Though the results obtained
were not properly understood at the time, Cleverdon argues that the common
factor explaining the comparative success of the Uniterms in Taube's test
and of the text indexing in Swanson's was the use of natural language.
Subsequent tests, like Cranfield 2 (Refs. 2, 3), have confirmed the value of
natural language indexing. For Cleverdon, other important tests, in terms
both of their individual results and the natural way in which these results
could be combined for whole system characterization, were Cranfield 1
(Ref. 6), which demonstrated the inverse relation between recall and
precision, and the sequence of Smart tests exhibiting the value of direct text
utilization as a mode of natural language indexing, of matching producing
ranked output, and of iterative searching. Cleverdon finds that by the late
1960s, 'it was obvious that we had acquired the knowledge that would enable
mechanised systems to be designed that were both effective and economic.
Subsequent developments in computing technology permitted us to take
advantage of this knowledge.' In Cleverdon's opinion, 'it is clear that no
single research investigation made a major contribution to the present
position, and that most of the significant advances have come as a result of
setting up operational systems, from which developments flowed'. In fact,
even the widespread, though by no means exclusive, use of natural language
in operational systems may be as attributable to practical factors as to the
application of research findings; and the systematic exploitation of the
recall/precision relationship, of ranking, and of coherent interactive
procedures does not figure in operational systems. The natural language
text available for searching also tends to be limited.
Cleverdon's position, however, is that while he is enthusiastic about the