Scientific Report No. IRS-13
Information Storage and Retrieval
Harvard University
Gerard Salton

Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

II. Evaluation Parameters

E. M. Keen

1. Introduction

Evaluation of the SMART system centers on techniques for the measurement of retrieval performance. Some reasons for concentrating on this evaluation criterion have been given in [1]. This section discusses many aspects of retrieval performance measurement in general, describes several of the measures used by SMART, and gives a detailed account of the way in which results of individual requests are processed in order to present averaged results. Several measures proposed by other researchers in the area are examined and evaluated, and some considerations relating to future testing are offered.

2. Purposes, Viewpoints and Properties of Performance Measures

Since performance measures are used for different purposes according to test objectives, a division into three types is suggested. First, there is the need for measures with which to make merit comparisons within a single test situation, that is, to make 'internal' comparisons only. In tests of this type the document collections, search requests, and relevance decisions are held constant while some system variable is altered; this procedure has been used for almost all of the SMART experiments. In terms of performance measurement, such situations are best characterized by saying that comparisons are made under constant generality, and a "generality number" may be computed in such cases [2]:

\[
G = \frac{\text{Total Relevant in Collection} \times 1000}{\text{Total Documents in Collection}}
\]
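As an illustration of the generality number defined above, the following minimal sketch computes G from two counts. The function name `generality_number`, its parameter names, and the example figures are assumptions introduced here for illustration; they do not come from the report.

```python
def generality_number(total_relevant: int, total_documents: int) -> float:
    """Return G = 1000 * (relevant documents) / (documents in collection).

    Hypothetical helper: the name and signature are illustrative only.
    """
    if total_documents <= 0:
        raise ValueError("the collection must contain at least one document")
    if not 0 <= total_relevant <= total_documents:
        raise ValueError("relevant count must lie between 0 and the collection size")
    return 1000 * total_relevant / total_documents


# Made-up example: 5 relevant documents in a collection of 200.
print(generality_number(5, 200))  # 25.0
```

Because the collection, the requests, and the relevance decisions are held fixed in an 'internal' comparison, both counts are unchanged from run to run, so G stays the same whichever system variable is altered.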