IRE Information Retrieval Experiment Laboratory tests: automatic systems chapter Robert N. Oddy Butterworth & Company Karen Sparck Jones All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

methodology is also a ruthless critic of ill-conceived or vacuous theories, of which I suspect Information Science could boast a number.

I should draw attention at this point to the fact that there are some information scientists who do not view computer modelling as a helpful means to understanding, and regard it as somewhat unhealthy. Rosenberg67 writes: 'If, as I believe, the nature of human information processing is fundamentally different from machine information processing, then the development of digital computer systems becomes an obstacle to the understanding of information and its use' (p.266).

As with the more conventional automatic information retrieval laboratory testing, computer modelling has its difficulties. First, programs can become very complex, to the point when they are incomprehensible, as patches are made to take care of unwanted behaviour. Simplification is then necessary if the model is to have any scientific value, and this may be dependent upon having a new insight. Second, evaluation of the effectiveness of the retrieval program is, as usual in information retrieval, fraught with difficulty. If we substitute a complex notion of relevance for a simple one in the theory, then we may even deprive ourselves of the easy way out: evaluating the system on its own terms.
Finally, computer models can be criticized in the same way as other laboratory systems, for unrealistic isolation of portions of the system 50 for study.

What role can computer modelling in the laboratory play in the development of information retrieval, and what is its relationship to laboratory systems derived from mathematical theories? Clearly, modelling offers us a mode of expression which is distinct from mathematics, and is capable of coping with situations which do not appear to lend themselves to mathematical treatment. Should we regard a model as a stop-gap: a means of getting our ideas in order so that a mathematical theory can be evolved? Or should we accept the belief expressed by Sloman68 that viewing complex phenomena as computational processes should supersede older paradigms, such as the paradigm which represents processes in terms of equations or correlations between numerical variables' (p.3)? I do not think we have enough experience of computer models to know the answer. In the meantime, the two can fruitfully coexist. The mathematical theories that exist at present refer to limited regions in the information retrieval domain. Perhaps models can be developed to provide the appropriate contexts for the application of the theories.

9.7 Program correctness

It is well known that a fault-free program is a rare thing. Some faults cause the program to break down or behave in an obviously erroneous way; others lurk in the program unnoticed for a considerable length of time. These latter faults are responsible for deviations from correct behaviour which are sufficiently small that the results still appear reasonable to the experimenter. How 'small' that is depends upon the expectations of the experimenter. I shall not dwell on the obvious type of fault, but make a few comments about the more subtle ones. It is clear that an experimenter must make every effort to ensure that the results obtained from his computer programs can be relied