IRE
Information Retrieval Experiment
The Smart environment for retrieval system evaluation-advantages and problem areas
chapter
Gerard Salton
Butterworth & Company
Karen Sparck Jones
15
The Smart environment for retrieval
system evaluation-advantages and
problem areas
Gerard Salton*
The Smart environment provides a test-bed for implementing and evaluating
a large number of different automatic search and retrieval processes. In this
chapter, the basic parameters underlying the Smart system design are briefly
outlined, and a comparison is made with the characteristics of more
conventional retrieval systems. The principal lessons learned from the Smart
experiments are described, and some of the methodological problems raised
by the system design are outlined. Finally, some comments are included
about the disadvantages inherent in working in the laboratory, and the
insights that can be gained in such a situation.
15.1 Retrieval system environment
Automatic, or semi-automatic information search and retrieval systems have
now been in existence for some twenty years. In the early years, only small
collections could be searched, and the search requests received from the user
population would be accumulated for some period of time, or 'batched'
before actually being processed, with the result that several weeks would
normally elapse before answers could be obtained to a given query.
At the present time, the role and importance of information retrieval have
greatly increased for two main reasons: the coverage of the searchable
collections is now extensive, and collection sizes may exceed several million
documents; furthermore, the search results can now be obtained more or less
instantaneously, using online procedures and computer terminal devices that
provide interaction and communication between system and users. The large
collection sizes make it plausible to the users that relevant information will
in fact be retrieved as a result of a search operation, and the probability of
obtaining the search output without delay creates a substantial user demand
for the retrieval services. It is not surprising in these circumstances that
several million search requests are currently submitted each year to a variety
of automatic retrieval services.
* This study was supported in part by the National Science Foundation under grant DSI-77-04543.