IRE Information Retrieval Experiment An experiment: search strategy variations in SDI profiles chapter Lynn Evans Butterworth & Company Karen Sparck Jones All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

overall most cost-effective search strategy. The discussion on p.307 suggests that the computer search costs for all search strategies except CRTW would not differ by more than 20 per cent because they all use the same basic set of search terms. However, strategy CRTW, which by definition always contains 20 search terms compared with an average of 46 terms for the other strategies, is thereby much more economic in terms of computer search costs. If its retrieval performance is on a par with the other strategies then strategy CRTW must be the most cost-effective of all the 10 strategies evaluated in the main experiment. The results presented on pp.299 et seq. show that this is in fact the case.

A more modest possibility with the available information is a comparison of the cost-effectiveness of all the search strategies in terms of information scientist cost only. A suitable measure of effectiveness is `relevant documents retrieved' and, certainly in the SDI situation, the number of relevant documents retrieved at cutoff points of, say, 15 or 25 documents would seem to be the most appropriate.
The basic `effectiveness' data included in the original report show, for example, that at a cutoff point of 15 items, strategy CT retrieved 164, 127 and 120 relevance R1 documents, respectively, on runs 1 (46 queries), 5 (45 queries) and 6 (46 queries). This gives an average figure of (164+127+120)/(46+45+46), i.e. 3.0 relevance R1 documents retrieved per query per run by strategy CT at a cutoff point of 15 items. Assuming 50 runs per year (weekly SDI service) this figure becomes 150 relevance R1 documents retrieved per query per year. Dividing this number by the one given for information scientist effort on strategy CT in Table 14.7 gives a figure for the cost-effectiveness of strategy CT at a cutoff point of 15 items, i.e. 150/1.5 or 100 relevance R1 documents retrieved per year per hour of information scientist effort. Repeating the calculation for all the strategies at cutoff points of 15 and 25 items gives the comparative cost-effectiveness figures shown in Table 14.8.

TABLE 14.8. Cost-effectiveness of search strategies (information scientist effort only)

Strategy cost-effectiveness (relevant documents retrieved/year/hour of information scientist time)

Order of   Relevance R1 documents          Relevance R1/2 documents
merit      Cutoff 15       Cutoff 25       Cutoff 15       Cutoff 25
1          CRTW (108.8)    CRTW (150.9)    CRTW (251.1)    CRTW (355.0)
2          CT   (100.0)    CT   (139.4)    CT   (233.6)    CT   (347.0)
3          TWC  (98.2)     TWC  (136.0)    TWC  (211.6)    TWC  (311.4)
4          CTW  (91.9)     CTW  (124.7)    CTW  (204.5)    CTW  (303.3)
5          CG   (89.0)     CG   (121.4)    CG   (202.8)    CG   (294.1)
6          GTWC (83.3)     GTWC (114.8)    CGW  (180.8)    CGW  (264.9)
7          CGW  (81.8)     GWC  (109.5)    GWC  (175.5)    GWC  (251.9)
8          GWC  (81.3)     CGW  (108.3)    GTWC (169.9)    GTWC (245.5)
9          BW   (62.8)     BW   (77.8)     BW   (128.0)    BW   (167.9)

Notes.
(i) The information scientist effort for strategy CRTW was assumed equal to that for strategy CT (see p.291).
(ii) The figures for Boolean strategy BW are not strictly comparable with the others since some of the Boolean outputs were less than 15/25 items.
Strategy B was omitted
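
The calculation described for strategy CT can be sketched in a few lines of Python. This is only an illustrative restatement of the arithmetic quoted in the text (relevance R1 documents, cutoff 15); the function name `cost_effectiveness` and its parameter names are hypothetical and do not come from the original report. The effort figure of 1.5 is the value the text takes from Table 14.7.

```python
# Illustrative sketch of the cost-effectiveness arithmetic for strategy CT.
# The function and parameter names are assumptions made for this example.

def cost_effectiveness(relevant_per_run, queries_per_run, runs_per_year, effort_hours):
    """Relevant documents retrieved per year per hour of information scientist effort."""
    # Average relevant documents retrieved per query per run, pooled over runs.
    per_query_per_run = sum(relevant_per_run) / sum(queries_per_run)
    # Scale to a year of SDI runs (the text assumes 50 weekly runs).
    per_query_per_year = per_query_per_run * runs_per_year
    # Divide by the information scientist effort figure for the strategy.
    return per_query_per_year / effort_hours

ct_cutoff15 = cost_effectiveness(
    relevant_per_run=[164, 127, 120],  # runs 1, 5 and 6
    queries_per_run=[46, 45, 46],
    runs_per_year=50,
    effort_hours=1.5,                  # strategy CT, as quoted from Table 14.7
)
print(round(ct_cutoff15, 1))  # 100.0
```

The result matches the CT entry for relevance R1 documents at cutoff 15 in Table 14.8. The other entries would follow from the same function given the corresponding retrieval counts and effort figures, which are not reproduced on this page.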