6 Evaluation within the environment of an operating information service

F. Wilfrid Lancaster

This chapter deals with the problems of evaluating operating information services, the objective of such evaluation being to determine how successful the service is in satisfying the needs of its users. The major emphasis of the chapter is on the `human aspects' of information services, machine aspects being dealt with in the chapter by Barraclough. The chapter first introduces some basic concepts and definitions related to evaluation. The special problems involved in applying evaluation methods to operating services are then discussed.

There are a number of possible reasons why the managers of an information centre may wish to conduct an evaluation of the services provided. One is simply to establish a type of `benchmark' to show at what level of performance the service is now operating. If changes are subsequently made to the services, their effects can then be measured against the benchmark previously established.

A second, and probably less common, reason is to compare the performance of several information centres or services. Since a valid comparison of this type implies the use of an identical evaluation standard, the number of possible applications of this kind of study tends to be quite limited. Examples include studies of the coverage of different data bases (e.g. Ashmole et al.1, Davison and Matthews2), the comparative evaluation of the document delivery capabilities of several libraries (Orr et al.3), and the use of a standard set of questions to compare the performance of question-answering services (Crowley and Childers4).

A third reason for evaluation of an operating information service is simply to justify its existence. A justification study is really an analysis of the benefits of the service, or an analysis of the relationship between its benefits and its cost.

The fourth reason for evaluation is to identify possible sources of failure or inefficiency in the service, with a view to raising the level of performance at some future date. Using an analogy with the field of medicine, this type of evaluation can be regarded as diagnostic and therapeutic. In some ways it is the most important type: evaluation of an information service is a sterile exercise unless conducted with the specific objective of identifying means of improving its performance.

Evaluation of an information service may be subjective, based solely on the opinions of its users, or it may be objective. Subjective, opinion-based studies