IRE
Information Retrieval Experiment
Laboratory tests: automatic systems
chapter
Robert N. Oddy
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
methodology is also a ruthless critic of ill-conceived or vacuous theories, of
which I suspect Information Science could boast a number.
I should draw attention at this point to the fact that there are some
information scientists who do not view computer modelling as a helpful
means to understanding, and regard it as somewhat unhealthy. Rosenberg67
writes: `If, as I believe, the nature of human information processing is
fundamentally different from machine information processing, then the
development of digital computer systems becomes an obstacle to the
understanding of information and its use' (p.266).
As with the more conventional automatic information retrieval laboratory
testing, computer modelling has its difficulties. First, programs can become
very complex, to the point where they are incomprehensible, as patches are
made to take care of unwanted behaviour. Simplification is then necessary if
the model is to have any scientific value, and this may be dependent upon
having a new insight. Second, evaluation of the effectiveness of the retrieval
program is, as usual in information retrieval, fraught with difficulty. If we
substitute a complex notion of relevance for a simple one in the theory, then
we may even deprive ourselves of the easy way out: evaluating the system on
its own terms. Finally, computer models can be criticized in the same way as
other laboratory systems, for unrealistic isolation of portions of the system
for study50.
What role can computer modelling in the laboratory play in the
development of information retrieval, and what is its relationship to
laboratory systems derived from mathematical theories? Clearly, modelling
offers us a mode of expression which is distinct from mathematics, and is
capable of coping with situations which do not appear to lend themselves to
mathematical treatment. Should we regard a model as a stop-gap: a means
of getting our ideas in order so that a mathematical theory can be evolved?
Or should we accept the belief expressed by Sloman68 that viewing complex
phenomena as computational processes should supersede older paradigms,
such as the paradigm which represents processes in terms of equations or
correlations between numerical variables' (p.3)? I do not think we have
enough experience of computer models to know the answer. In the meantime,
the two can fruitfully coexist. The mathematical theories that exist at present
refer to limited regions in the information retrieval domain. Perhaps models
can be developed to provide the appropriate contexts for the application of
the theories.
9.7 Program correctness
It is well known that a fault-free program is a rare thing. Some faults cause
the program to break down or behave in an obviously erroneous way; others
lurk in the program unnoticed for a considerable length of time. These latter
faults are responsible for deviations from correct behaviour which are
sufficiently small that the results still appear reasonable to the experimenter.
How `small' that is depends upon the expectations of the experimenter. I
shall not dwell on the obvious type of fault, but make a few comments about
the more subtle ones. It is clear that an experimenter must make every effort
to ensure that the results obtained from his computer programs can be relied