IRE Information Retrieval Experiment
Chapter: Simulation, and simulation experiments
Michael D. Heine
Butterworth & Company
Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

suggested limitations on the validity of a simulation study when very severely limiting definitions are used. The third example showed how purely formal constructions can usefully be discussed and compared in a particular context using familiar information retrieval concepts, with no additional definition and dealing only in observables.

10.3 Some previous work in simulation applied to information retrieval

For reasons given in the introduction it is in principle impossible to delimit the literature on simulation applied to information retrieval in a satisfactory way. The modelling or representational element, so essential to simulation, is often present in general discussions on retrieval that do not specifically refer to simulation by name. Writers will frequently have used the `language' of simulation without necessarily having used simulation techniques in the narrower sense for exploring the relationships that they discuss, or without having been particularly concerned with optimization of, or intervention in, the process described. We shall adopt as our rather arbitrary criterion for inclusion that systems be formally represented and that relationships between system components be systematically explored using plausible or experimentally-obtained values for the variables involved, with the methodological emphasis on the manipulation of such data.
The works meeting this criterion appear to be few in number20-30. Gurk's paper20 is more an indicative description of a prototype of an information retrieval system than a description of a simulation of it. Useful comment directed at simulation work in the general information retrieval context has been offered by Chapman1 and Salton1 2, the latter's monograph reviewing the main models.

The paper by Bourne and Ford22 is concerned with the economics of information retrieval systems. The objective was to estimate the operating cost, and the amounts of equipment and personnel, needed over a given time period by several hardware information retrieval configurations. Their paper makes the point that on the basis of known data, and a knowledge of the gross characteristics of a proposed system, the costs that would be borne in the future can be arrived at by solely manipulative means much more cheaply than by actually building and testing the system, thus underlining one of the basic reasons for undertaking simulation studies. The `known data' is grouped by them under three headings: `Time and Cost Data' (wage rates, costs of materials, equipment purchase and maintenance costs, stationery, etc.), `Statements of Interrelationships' (e.g. item input rate per person, search time per request), and `Constants' (e.g. amortization period of purchased equipment, interest rate on borrowed capital). Bourne and Ford comment appropriately that the credibility of their type of analysis depends upon the accuracy and completeness of both the analysis of the proposed system and the basic time and cost data, but perhaps they do not sufficiently emphasize the vulnerability of such analyses to rapid technological obsolescence. A further useful point brought out by them is that the sensitivity of operating costs (say) or other measures of efficiency or effectiveness, can be explored in a simulation study.
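The kind of manipulation described above may be illustrated by a minimal sketch. It is not Bourne and Ford's model: the three groups of input data follow their headings, but every field name and numerical value, the choice of workload variables (items input per year and searches per year), and the use of the standard annuity formula to spread the equipment purchase over the amortization period are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostData:
    """'Time and Cost Data' (all values hypothetical)."""
    wage_per_hour: float = 10.0
    equipment_purchase: float = 50_000.0
    maintenance_per_year: float = 2_000.0


@dataclass(frozen=True)
class Interrelationships:
    """'Statements of Interrelationships' (all values hypothetical)."""
    items_per_person_hour: float = 4.0      # item input rate per person
    search_hours_per_request: float = 0.5   # search time per request


@dataclass(frozen=True)
class Constants:
    """'Constants' (all values hypothetical)."""
    amortization_years: int = 5
    interest_rate: float = 0.08


def annual_equipment_cost(cost: CostData, const: Constants) -> float:
    """Level annual payment that amortizes the purchase price (annuity formula)."""
    r, n = const.interest_rate, const.amortization_years
    if r == 0:
        return cost.equipment_purchase / n
    return cost.equipment_purchase * r / (1 - (1 + r) ** -n)


def annual_operating_cost(
    items_per_year: float,
    searches_per_year: float,
    cost: CostData = CostData(),
    rel: Interrelationships = Interrelationships(),
    const: Constants = Constants(),
) -> float:
    """Estimate one year's cost from workload figures, without building the system."""
    input_hours = items_per_year / rel.items_per_person_hour
    search_hours = searches_per_year * rel.search_hours_per_request
    labour = (input_hours + search_hours) * cost.wage_per_hour
    return labour + cost.maintenance_per_year + annual_equipment_cost(cost, const)


def sensitivity_table(item_loads, search_loads):
    """Tabulate annual cost over a grid of two workload variables."""
    return {
        (items, searches): annual_operating_cost(items, searches)
        for items in item_loads
        for searches in search_loads
    }


if __name__ == "__main__":
    table = sensitivity_table([5_000, 10_000], [1_000, 2_000, 4_000])
    for (items, searches), total in sorted(table.items()):
        print(f"items={items:>6}  searches={searches:>5}  annual cost={total:,.2f}")
```

Varying one workload variable while holding the other fixed shows how sensitive the estimated cost is to that variable, which is the sense in which sensitivity can be explored at little expense. Under these assumed figures the cost is linear in both variables; a more realistic model would presumably introduce step changes as additional staff or equipment become necessary.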
(They quote data for annual expenditure as a function of the two independent variables: number of searches per