Information Retrieval Experiment, edited by Karen Sparck Jones. Butterworth & Company.

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

15 The Smart environment for retrieval system evaluation - advantages and problem areas

Gerard Salton*

The Smart environment provides a test-bed for implementing and evaluating a large number of different automatic search and retrieval processes. In this chapter, the basic parameters underlying the Smart system design are briefly outlined, and a comparison is made with the characteristics of more conventional retrieval systems. The principal lessons learned from the Smart experiments are described, and some of the methodological problems raised by the system design are outlined. Finally, some comments are included about the disadvantages inherent in working in the laboratory, and the insights that can be gained in such a situation.

15.1 Retrieval system environment

Automatic, or semi-automatic, information search and retrieval systems have now been in existence for some twenty years. In the early years, only small collections could be searched, and the search requests received from the user population would be accumulated for some period of time, or 'batched', before actually being processed, with the result that several weeks would normally elapse before answers could be obtained to a given query.

At the present time, the role and importance of information retrieval have greatly increased for two main reasons. First, the coverage of the searchable collections is now extensive, and collection sizes may exceed several million documents. Second, the search results can now be obtained more or less instantaneously, using online procedures and computer terminal devices that provide interaction and communication between system and users. The large collection sizes make it plausible to the users that relevant information will in fact be retrieved as a result of a search operation, and the probability of obtaining the search output without delay creates a substantial user demand for the retrieval services. It is not surprising in these circumstances that several million search requests are currently submitted each year to a variety of automatic retrieval services.

* This study was supported in part by the National Science Foundation under grant DSI-77-04543.