IRE
Information Retrieval Experiment
An experiment: search strategy variations in SDI profiles
chapter
Lynn Evans
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
TABLE 14.5. Run S data: sign test for significant difference

Search     Significantly better than    Not significantly different from
strategy

Relevance R1 documents
TWC        CG   (χ²=16.9, p<0.001)      GTWC (χ²=3.03), CRTW (χ²=3.03),
           CT   (χ²=14.4, p<0.001)      GWC (χ²=0.03)
           CTW  (χ²=10.0, p<0.01)
           CGW  (χ²=6.4, p<0.02)
GWC        CG   (χ²=12.1, p<0.001)      CTW (χ²=3.6), GTWC (χ²=2.03),
           CT   (χ²=11.0, p<0.001)      CRTW (χ²=2.03), TWC
           CGW  (χ²=9.0, p<0.01)
CRTW       CT   (χ²=8.1, p<0.01)        GTWC (χ²=0.03), CTW (χ²=0),
           CG   (χ²=4.9, p<0.05)        CGW (χ²=0), TWC, GWC
CTW        CT   (χ²=18.2, p<0.001)      CGW (χ²=0.03), GTWC (χ²=0.03),
           CG   (χ²=11.0, p<0.001)      GWC, CRTW
CGW        CG   (χ²=7.2, p<0.01)        GTWC (χ²=0.23), CRTW, CTW
           CT   (χ²=6.4, p<0.02)
GTWC       CT   (χ²=5.6, p<0.02)        TWC, GWC, CRTW, CTW, CGW
           CG   (χ²=5.6, p<0.02)
CG         -                            CT (χ²=1.6)
CT         -                            CG

Relevance R1/2 documents
TWC        CT   (χ²=12.8, p<0.001)      CRTW (χ²=3.76), CGW (χ²=2.69),
           GTWC (χ²=9.8, p<0.01)        CTW (χ²=1.8), GWC (χ²=0.02)
           CG   (χ²=7.2, p<0.01)
GWC        CRTW (χ²=9.8, p<0.01)        CTW (χ²=3.76), CGW (χ²=3.76),
           GTWC (χ²=8.9, p<0.01)        TWC
           CT   (χ²=8.0, p<0.01)
           CG   (χ²=6.4, p<0.02)
CRTW       -                            CT (χ²=1.09), CGW (χ²=1.09),
                                        GTWC (χ²=0.56), CG (χ²=0.36),
                                        CTW (χ²=0.36), TWC
CTW        CT   (χ²=21.4, p<0.001)      CGW (χ²=0.09), GTWC (χ²=0.02),
           CG   (χ²=6.4, p<0.02)        TWC, GWC, CRTW
CGW        CT   (χ²=11.8, p<0.001)      GTWC (χ²=1.8), TWC, GWC,
           CG   (χ²=5.7, p<0.02)        CRTW, CTW
GTWC       -                            CT (χ²=0.8), CG (χ²=0),
                                        CRTW, CTW, CGW
CG         CT   (χ²=4.4, p<0.05)        CRTW, GTWC
CT         -                            CRTW, GTWC

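
The kind of calculation behind the entries of Table 14.5 can be sketched as follows. This is a minimal illustration, not the procedure documented in the chapter: it assumes the sign test is evaluated with the usual chi-square approximation on one degree of freedom, that queries with a zero difference are dropped, and that no continuity correction is applied unless requested. The function names and the input data are hypothetical; the critical values are the standard ones for one degree of freedom.

```python
# Standard upper-tail critical values of chi-square with 1 degree of freedom,
# ordered from the strictest level to the most lenient.
CRITICAL_VALUES = [(0.001, 10.83), (0.01, 6.63), (0.02, 5.41), (0.05, 3.84)]


def sign_test_chi2(diffs, continuity=False):
    """Chi-square approximation to the sign test for paired differences.

    diffs: per-query differences in normalized recall (strategy A minus
    strategy B). Zero differences are dropped, which is an assumption here.
    """
    n_plus = sum(1 for d in diffs if d > 0)
    n_minus = sum(1 for d in diffs if d < 0)
    n = n_plus + n_minus
    if n == 0:
        return 0.0
    gap = abs(n_plus - n_minus)
    if continuity:
        # Optional continuity correction (assumed, not stated in the source).
        gap = max(gap - 1, 0)
    return gap * gap / n


def significance_label(chi2):
    """Return the strictest conventional level reached, in the table's style."""
    for p, critical in CRITICAL_VALUES:
        if chi2 >= critical:
            return f"p<{p}"
    return "not significant"


if __name__ == "__main__":
    # Hypothetical outcome: 33 queries favour A, 7 favour B, 3 are tied.
    diffs = [0.1] * 33 + [-0.1] * 7 + [0.0] * 3
    value = sign_test_chi2(diffs)
    print(round(value, 2), significance_label(value))
```

Only the direction of each difference enters the statistic, so a query contributes the same weight whether the two strategies differ by 0.01 or by 1.0 in normalized recall.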
It might be argued that a more powerful test, such as the Wilcoxon
matched-pairs signed-ranks test, could have been used, since the magnitude
of the difference in normalized recall between pairs of search strategies was
known for all the queries. This was not pursued because of unease concerning
the validity and overall effect of those queries with very few relevant
documents; as mentioned earlier, in the extreme case of a query with only 1
relevant document, a recall ratio of 0/1 could so easily be 1/1, and vice versa.
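
The concern about queries with very few relevant documents can be illustrated with a small sketch. The code below computes the two rank sums used by the Wilcoxon matched-pairs signed-ranks test for a set of per-query differences; the helper name and the six differences are hypothetical and are not taken from the experiment. The last query stands for one with a single relevant document, whose difference in recall can only be +1.0 or -1.0 once the two strategies disagree.

```python
def signed_rank_sums(diffs):
    """Return (W_plus, W_minus) for the Wilcoxon matched-pairs signed-ranks test.

    Zero differences are dropped; tied absolute values share their mean rank.
    """
    ordered = sorted((d for d in diffs if d != 0), key=abs)
    w_plus = 0.0
    w_minus = 0.0
    i = 0
    while i < len(ordered):
        # Find the run of differences sharing the same absolute value.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        mean_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += mean_rank
            else:
                w_minus += mean_rank
        i = j
    return w_plus, w_minus


if __name__ == "__main__":
    # Hypothetical differences in normalized recall for five ordinary queries.
    base = [0.10, 0.05, -0.20, 0.15, 0.02]
    # Sixth query: one relevant document, retrieved by one strategy only.
    print(signed_rank_sums(base + [1.0]))
    print(signed_rank_sums(base + [-1.0]))
```

With these hypothetical figures the rank sums are 16 against 5 when the single relevant document falls one way and 10 against 11 when it falls the other, whereas the sign test sees only a change from five queries against one to four against two. The single-document query always takes the highest rank, so one chance retrieval can dominate a magnitude-based test.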
Boolean comparison
The method used for this individual profile-by-profile comparison of all the
strategies was as follows: