Information Retrieval Experiment

Laboratory tests: automatic systems. Chapter by Robert N. Oddy. Edited by Karen Sparck Jones. Butterworth & Company.

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

upon. In recent years a considerable amount of research has been done on the problems of reliability in computer systems in general. A very useful collection of papers is to be found in Anderson and Randell [69]. Among the techniques for improving program reliability, those for fault avoidance, as opposed to tolerance, and those not requiring a special programming system are of most interest to the experimental information retrieval programmer. Disciplined use of a good programming methodology and a language which allows for a reasonably natural expression of the process and data structures are the most significant techniques.

Correctness of a program is judged with respect to a specification of the required function and behaviour of the program. The judgement can be made in two ways:

(1) The program can be tested: Gerhart [70] discusses the principles of testing. A program must always be tested. The testing of an information retrieval laboratory program is quite conventional, although constructing test data can be tedious. However, testing can never be exhaustive.
One may try to test program modules exhaustively, but modules have a habit of interacting with each other in unexpected ways when they are run together, so ultimately one is faced with the prospect of checking every conceivable output of the complete program. One compensates for the inevitably partial testing by constantly keeping an eye on the reasonableness of all output produced by the program, and by combining testing with the second method of judging correctness: reasoning about the program.

(2) The program may be proved to be consistent with the specification. This is extremely difficult and the proofs tend to be unwieldy, but very informal proofs can often be done for parts of the program, if it is well structured, which are convincing enough for most purposes.

With information retrieval programs, we must be clear what we mean by the specification. I am concerned at the moment with the technical problem of obtaining a correct program, and not the research problem of obtaining an ideal program design. Thus correctness is to be judged against the researcher's design, rather than against the system user's requirement. For this concept of correctness to have a straightforward meaning, the semantics of the researcher's system specification must be quite clear, that is, it must be possible to express it formally. There is no problem, in principle, if the system is a consequence of a mathematical theory. If, however, the program is the model, in the sense discussed in the previous section, then we are reduced to talking about validating the program against [...]. The researcher will probably have a detailed description of the program (perhaps similar in appearance to the extract given above of the description of Thomas), but this is not formal.
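
As a rough illustration of the two ideas above (testing a module against a formally expressible specification, and keeping an eye on the reasonableness of all output), the following sketch may help. It is not taken from any system discussed in this chapter: the coordination-level matching rule, the function names coordination_rank and check_reasonableness, and the tiny document collection are all assumptions made purely for the example.

```python
# Illustrative sketch only. Coordination-level matching is a stand-in
# specification chosen for this example; it is NOT the system described
# in the text, and every name below is hypothetical.

def coordination_rank(query, documents):
    """Rank documents by the number of distinct query terms they contain.

    Assumed specification:
      - score(d) = |set(query) & set(d)|
      - only documents with score > 0 are returned
      - results are in descending score order, ties broken by document id
    """
    query_terms = set(query)
    scored = []
    for doc_id, terms in documents.items():
        score = len(query_terms & set(terms))
        if score > 0:
            scored.append((doc_id, score))
    # Sort by descending score, then ascending document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


def check_reasonableness(query, documents, ranking):
    """Cheap checks that can be applied to every output, not only test cases."""
    ids = [doc_id for doc_id, _ in ranking]
    scores = [score for _, score in ranking]
    assert len(ids) == len(set(ids)), "a document appears twice"
    assert all(doc_id in documents for doc_id in ids), "unknown document id"
    assert all(1 <= s <= len(set(query)) for s in scores), "score out of range"
    assert scores == sorted(scores, reverse=True), "ranking not in score order"
    return True


# Hand-constructed test data, small enough for the expected output
# to be worked out by hand from the specification.
DOCS = {
    "d1": ["retrieval", "experiment", "test"],
    "d2": ["program", "test"],
    "d3": ["library", "catalogue"],
}
QUERY = ["test", "retrieval", "program"]

ranking = coordination_rank(QUERY, DOCS)
assert ranking == [("d1", 2), ("d2", 2)]   # conventional test against the spec
check_reasonableness(QUERY, DOCS, ranking)  # reasonableness check on the output
```

The test case checks one output exactly, which is only possible because the assumed specification is precise enough to work the answer out by hand. The reasonableness checks are weaker, but they can be left switched on for every run of the program, which is where the partial nature of testing is felt most.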
The meaning of the description is worked out in the program. It is therefore possible that the researcher will be experimenting with a model which differs from his original intention in some unknown way, and which he does not fully understand. (He will have the program text, in some form, but it does not follow that he understands the model.) Of course, many programs which have independent formal specifications for part of them also incorporate heuristics, and that fact, strictly speaking, puts them into the same category. The researcher would normally make the assumption that