IRE
Information Retrieval Experiment
Introduction
Karen Sparck Jones
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
Introduction
Karen Sparck Jones
This book is about information retrieval experiment. Documentary information
systems have changed in many ways in the last 20 years. The rapid
growth of specialized literature has encouraged an intellectual development,
post-coordination, and a technological development, the use of computers.
Many questions about effective methods of document identification, and
about efficient methods of document management, have naturally followed.
These questions about the substantive and the economic aspects of retrieval
systems have provoked a whole range of studies. Some of these may be
described as investigations; others can properly be described as experiments.
They are sometimes associated with generalizations or sometimes, more
strongly, with theories or models.
These studies taken together have produced some moderately solid results,
and have to some extent enlarged our understanding of the way retrieval
systems work. For example, it appears that good retrieval performance lies in
the 40-60 per cent recall and precision area; and it seems that some
probabilistic models offer valuable insights into system behaviour. But
research progress has perhaps been less than might have been expected, and
information system practice has in essentials been extremely conservative.
The 1958 International Conference on Scientific Information, held in
Washington, was widely felt to mark the beginning of a new era in
information processing. The novel ideas and techniques to be developed
were symbolized by the 'auto-abstracts' of conference papers produced by
Luhn. By 1978 computers were firmly established in information work, but
primarily, and almost entirely, for clerical operations: the crucial information
processes of document and request characterization and matching are done
by human beings along conventional lines.
There is no very good reason to suppose that the conventional methods are
best, even in principle, let alone practice. There is in particular no good
reason to believe that they are based on any thorough understanding of the
nature of information systems. Information systems are manifestly complex.
It is by no means certain that information science exists: but it is clear that
information processes need scientific study. The requirement for, and role of,
experiments in such a study is clear. A good deal of experimental and
investigative work has been done in the last two decades; but while results