IRE Information Retrieval Experiment Simulation, and simulation experiments chapter Michael D. Heine Butterworth & Company Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

Some previous work in simulation applied to information retrieval

month (from 1 to 100 000), and item input rate per month (from 1 to 100 000 also), to produce an estimated cost range of $188 000 to $558 000 for one system, and of $166 000 to $551 000 for a second system.) Baker and Nance24 report on a study in which the 'system' is defined more generally, so as to include both the users and the funders of the retrieval service; their point is that a more restricted view may lead to suboptimization. (To optimize in respect of system response time alone, or in respect of unit retrieval cost alone, may be to ignore the costs (or disutility) to the user entailed in (a) noisy (low-Precision) search output, and (b) actual usage of the system; such costs are, possibly, the main causes of low system usage or of a poor reputation amongst users. For a related, sceptical viewpoint, see W. S. Cooper31.) Baker and Nance assume, accordingly, that the funding and operating of a system must be seen as being influenced by user costs and convenience, or utility. The relationships of interest to them are portrayed in two detailed diagrams and a table of descriptive content. Although the first diagram is a general one, the second diagram and the table relate to a system having the form of a university departmental library, i.e. to a highly specific system only.
The model of the system that is given is moreover only indicative, and no tangible results of the study are given or appear to have been published since.

Reilly's report26 is unusual in that he was concerned with a single user and a single type of service, an approach that the Swets model13, 14 can also be interpreted as embodying. Reilly's study assumes that, before making a request to a document-delivery service, a user estimates both the delivery time of a document and the utility of the service to him. The user's subsequent behaviour is then determined by the truth-values of two inequalities: estimated service time ≤ need time, and actual service time ≤ need time, the estimated service time being modified with each decision and system response. Since the estimated service time is not an observable in an operational system (though it could be in an experimental environment), the model may not be acceptable to those who insist that simulation should deal only in observables. But non-observables are perhaps acceptable if, by assuming their existence and properties, one can successfully predict observable outcomes using them: the proof of the pudding. Reilly's approach would seem to bring retrieval work closer to the point where user/service interaction is properly heeded and accounted for, as a basis for the fuller system definition needed for the efficient management of information services. The point is in fact made by Reilly (and is also implied in the Baker and Nance paper) that integration of models of different areas of information supply is essential, although he does not attempt this himself. Three such areas or 'levels' are singled out by him in this connection: computer processing centre activities, determination of user behaviour (his main concern), and the delivery of documents. A further point in common between Reilly's report and Baker and Nance's study is that 'a library' should be treated as an information system.
Although this has been a commonplace idea in US writings for many years (and is a basis of Salton's recent monograph), there is still a regrettable reluctance in the UK to view libraries (document supply systems) in the same light as information retrieval services (document record supply systems), notwithstanding the common problems each has and the strong interactions that necessarily exist between them. Simulation studies, in offering an