IRE
Information Retrieval Experiment
An experiment: search strategy variations in SDI profiles
chapter
Lynn Evans
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
Table 14.2 shows that, in terms of information scientist effort, the simplest
strategy, CT, takes almost exactly half as long to compile as the most complex
strategy, BW.
Modification times
As mentioned on p.287 above, the profiles were analysed and (perhaps)
modified just once, viz. after completion of the first group of 4 SDI runs but
before starting the second group.
The initial analysis procedure adopted was standard for all profiles
irrespective of whether any relevance assessments had been received from
the users. Ten minutes were taken for an examination of the profile
performance after which time a decision was taken as to whether or not any
basic modifications were necessary. Twenty-two users' profiles were in fact
amended. It is emphasized that the time taken for any particular modification
is assigned in full to all the search strategy variations incorporating that
modification, e.g. if 20 min are spent on amending the boolean equations
then this time is allocated to both strategies B and BW.
Averaging the modification time data (including all profiles whether
modified or not) and adding the 10 min initial analysis time gave the
following average strategy modification times: CT = 13, CG = 13, CTW = 14,
CGW = 14, TWC = 15, GWC = 15, GTWC = 15, B = 21, and BW = 22 min.
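The allocation and averaging rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used in the experiment: the function name, the data layout and the sample minutes are hypothetical. Only the 10 min analysis time, the rule that a modification's time is charged in full to every strategy incorporating it, and the averaging over all profiles (modified or not) are taken from the text.

```python
from collections import defaultdict

ANALYSIS_MIN = 10  # fixed initial analysis time per profile (from the text)

STRATEGIES = ["CT", "CG", "CTW", "CGW", "TWC", "GWC", "GTWC", "B", "BW"]


def average_modification_times(profiles, strategies=STRATEGIES):
    """Return the average modification time (min) per strategy.

    `profiles` holds one entry per profile; each entry is a list of
    (minutes, affected_strategies) tuples, empty if nothing was modified.
    """
    totals = defaultdict(float)
    for modifications in profiles:
        for minutes, affected in modifications:
            # Charge the full time to every strategy incorporating the change.
            for strategy in affected:
                totals[strategy] += minutes
    n = len(profiles)  # unmodified profiles still count in the average
    return {s: ANALYSIS_MIN + totals[s] / n for s in strategies}


# Hypothetical data: three profiles, one of them unmodified.
sample = [
    [(20, ["B", "BW"])],             # boolean equations amended
    [],                              # no modification needed
    [(6, STRATEGIES), (3, ["BW"])],  # a change to all strategies, plus one to BW only
]
averages = average_modification_times(sample)
```

In this invented sample the 20 min spent on the boolean equations raises the averages for both B and BW, while the unmodified profile still counts in the divisor, so it pulls every average back towards the 10 min baseline.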
Discussion
Before leaving the profile compilation procedure it may be useful to discuss
the standard tasks in more detail, and in particular to consider some of the
conflicts that occurred in trying to achieve a balance between experimental
rigour and what common sense indicated should be done in a real situation.
It has already been stated that for a valid comparison of search strategies
it seemed essential that, for a particular user statement, the same basic set of
search terms should be used. In fact occasions arose when this was contrary
to the needs of particular strategies, e.g. in the use of negative weights, NOT
logic, and WITHIN logic, which facilities do not feature sensibly in the
co-ordination strategies CT, CG, CTW, CGW and CRTW. Examples of the
use of these facilities are detailed in the original report and the extent to
which they were used is indicated by the fact that, of the 55 statements
received, negative weights were included for 10 users, NOT logic was
included for 8 users, and WITHIN logic was included for 2 users.
Another general problem occurs when the original user statement really
covers more than one basic subject interest or question. With boolean
strategies, if nesting or sublogic facilities were available, there would be no
problem but with co-ordination strategies it seems nonsensical to mix search
terms from what are essentially different questions. In those cases where
obviously more than one subject interest was involved the user statement was
divided and treated as 2 (and once 3) completely separate questions. It is now
felt that this should have been done for more of the user statements than was
in fact the case, viz. 6 users.
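A small sketch may help to show why mixing terms from different questions is a problem for co-ordination strategies. It assumes that a co-ordination strategy scores a document simply by the number of profile terms it contains. That is a simplification and not necessarily how the strategies in this experiment were implemented. The function name, terms and documents are all invented for illustration.

```python
def coordination_level(profile_terms, document_terms):
    """Number of distinct profile terms present in the document."""
    return len(set(profile_terms) & set(document_terms))


# Hypothetical user statement covering two unrelated interests.
question_a = ["corrosion", "steel", "pipeline"]
question_b = ["enzyme", "kinetics", "yeast"]
mixed = question_a + question_b

doc_on_topic = ["corrosion", "steel", "coating"]       # clearly about question A
doc_off_topic = ["steel", "yeast", "brewing", "tank"]  # one stray term from each

# Mixed profile: both documents reach level 2 and cannot be told apart.
mixed_scores = (
    coordination_level(mixed, doc_on_topic),
    coordination_level(mixed, doc_off_topic),
)

# Separate profiles: the off-topic document never exceeds level 1.
split_scores = [
    (coordination_level(q, doc_on_topic), coordination_level(q, doc_off_topic))
    for q in (question_a, question_b)
]
```

Under these assumptions the mixed profile ranks the off-topic document level with the relevant one, whereas treating the statement as two separate questions keeps them apart.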
Some specific problems encountered when executing the individual
standard tasks were