<DOC> 
<DOCNO> IRE </DOCNO>         
<TITLE> Information Retrieval Experiment </TITLE>         
<SUBTITLE> The Smart environment for retrieval system evaluation-advantages and problem areas </SUBTITLE>         
<TYPE> chapter </TYPE>         
<PAGE CHAPTER="15" NUMBER="326">                   
<AUTHOR1> Gerard Salton </AUTHOR1>  
<PUBLISHER> Butterworth & Company </PUBLISHER> 
<EDITOR1> Karen Sparck Jones </EDITOR1> 
<COPYRIGHT MTH="" DAY="" YEAR="1981" BY="Butterworth & Company">   
All rights reserved.  No part of this publication may be reproduced 
or transmitted in any form or by any means, including photocopying 
and recording, without the written permission of the copyright holder, 
application for which should be addressed to the Publishers.  Such 
written permission must also be obtained before any part of this 
publication is stored in a retrieval system of any nature. 
</COPYRIGHT> 
<BODY> 

  In summary, one concludes that many of the Smart procedures have
interesting theoretical properties, in addition to proving effective under
various experimental conditions. The intellectual framework under which
the Smart system operates makes it easy to add new procedures and to extend
operations in various directions. In quite a few cases it becomes possible to
prove the usefulness of the techniques formally as well as experimentally.
  It remains to examine the appropriateness of undertaking a long-term
project such as Smart in the retrieval area. This is done in the final section of
this report.


15.5 Concluding remarks

It is hardly necessary to point out that the Smart system design carries with
it great advantages if one aims at constructing a flexible environment for
retrieval system experimentation. Whereas in normal environments, it
becomes necessary to retool to begin each individual experiment, the Smart
system has made it possible to carry out hundreds of different experiments
without substantial overhead or expense in program modification or
collection preparation. Such a flexible environment is to some extent bigger
than the sum of its parts: after using the system for a while one sees things fall
into place, often one can anticipate the evaluation results before actually
seeing them, and one obtains an intuitive feeling for the operations of a
retrieval system. It is then possible to obtain substantial returns from a
continuing experimental project, in exchange for the considerable investment
that is necessary in building and maintaining the system over many years.
  Normally, an experimental system is considered useful because the
experimental results can help confirm a variety of formal theories and
abstract models for a given process or system of procedures. The Smart
system experiments have in fact been initiated in an attempt to confirm a
variety of theories about the content analysis problem. When an experimental
system is sufficiently flexible, it may also be useful in reverse. That is, the test
results can help in formulating theories, and formal proofs can sometimes be
generated to describe precisely the conditions under which a given
experimental process is expected to be useful. Formal results obtained after
the fact have thus helped in rendering the Smart test results plausible in areas
such as term frequency weighting, term precision weighting, document
clustering, and relevance feedback.
  In addition, the Smart system results have led at least to a rethinking of,
and sometimes to actual modifications of, existing retrieval procedures. Since
so many different methodologies were actually subjected to intensive tests in
areas such as document input, indexing, classification, document-query
comparison, output ranking and display, query reformulation, and so on, the
Smart system has something to say in most areas relating to information
system design. As a result, selected methods that are easy to implement and
apparently most productive (term weighting, relevance feedback, etc.) have
in fact found their way into a number of operating environments.
  What about the drawbacks of a large and continuing experimental project?
Obviously one must be careful about the initial design and about the claims
one makes about the results. It is easy to go off on a tangent and to get stuck

</BODY>                  
</PAGE>                  
</DOC>