IRE
Information Retrieval Experiment
An experiment: search strategy variations in SDI profiles
chapter
Lynn Evans
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
logical relevance and utility in an information retrieval context before
arguing in favour of the latter as the better basis for a measure of retrieval
effectiveness11,12. The point was succinctly made thus: 'the purpose of
retrieval systems is (or at least should be) to retrieve documents that are
useful, not merely relevant'. Usefulness is defined as the user's subjective
evaluation of the personal utility of a retrieval system's output to him.
Recognizing the difficulties of such a subjective evaluation, it is suggested
that more efficient compromise measures may be feasible, although no
ready-made solutions have been presented.
The subject still features in the literature13 and, on a practical level, a real
distinction certainly exists between relevance meaning 'aboutness' and
relevance meaning 'pertinence'. In the former usage a relevant document is
simply one which deals (to a greater or lesser extent) with the same subject
matter as that of the query whereas for a document to be pertinent it has to
contain information which is new and useful to the originator of the query in
the subject area of the query. Obviously knowledge of pertinent documents
is more important than knowledge of those which are only about the same
subject as the query. However, establishing the pertinence of documents
requires real users, with real queries, who have the inclination to peruse
entire documents. Such a committed user group is rarely available.
In the experiment reported here relevance was used with the meaning of
'aboutness'. Although the distinction was never spelled out to the users, the
fact that their assessments were based on less than the whole document
ensured this. Also, no attempt was made to establish the extent to which users
followed up the documents notified to them.
Search software
Software for the project was specially written by the INSPEC Systems
Development Department. A generalized search package was developed
rather than separate optimum programs specially tailored to the requirements
of each search strategy.
The various search facilities available in the package are detailed in the
original report5 and are not of major concern here. Suffice it to say that the
following were included: Boolean AND, OR, NOT, with (practically)
unlimited nesting, quorum logic, contextual logic, positive and negative
integer weights (decimal or 'powers of 2'), matching in upper and lower case
and in normal, inferior and superior alignment, left and right hand
truncation, universal character, etc.
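A few of these facilities can be illustrated with a minimal sketch in Python. It is not the INSPEC package, whose profile syntax is given only in the original report. The function names, the use of `*` for truncation and `?` for the universal character, the case-insensitive matching, and the reading of quorum logic as 'at least m of n terms present' are all assumptions made for the illustration.

```python
import re


def term_matches(pattern: str, word: str) -> bool:
    """Match one profile term against one word of a document record.

    Assumed notation (hypothetical, not the package's own):
    a leading '*' is left-hand truncation, a trailing '*' is right-hand
    truncation, and '?' is a universal character standing for any one
    character. Matching ignores case.
    """
    left = pattern.startswith("*")
    right = pattern.endswith("*")
    core = pattern.strip("*")
    # Build a regular expression character by character so that only
    # '?' is treated specially and everything else is matched literally.
    regex = "".join("." if ch == "?" else re.escape(ch) for ch in core)
    if left:
        regex = ".*" + regex
    if right:
        regex = regex + ".*"
    return re.fullmatch(regex, word, flags=re.IGNORECASE) is not None


def hits(pattern: str, words: list[str]) -> bool:
    """True if the term matches at least one word of the record."""
    return any(term_matches(pattern, w) for w in words)


def quorum(patterns: list[str], words: list[str], threshold: int) -> bool:
    """Quorum logic, read here as: at least `threshold` terms are present."""
    return sum(hits(p, words) for p in patterns) >= threshold


def weighted_score(weights: dict[str, int], words: list[str]) -> int:
    """Sum the (positive or negative) integer weights of the terms present."""
    return sum(w for p, w in weights.items() if hits(p, words))


# A made-up record title, split into words.
words = "Computer aided retrieval of chemical abstracts".split()

# Boolean logic with nesting maps directly onto and / or / not.
boolean_hit = hits("retriev*", words) and not (
    hits("physic*", words) or hits("biolog*", words)
)

# Quorum: any two of the three terms suffice.
quorum_hit = quorum(["retriev*", "index*", "chemi?al"], words, 2)

# Weighted profile: a record is notified if the score reaches a threshold.
score = weighted_score({"retriev*": 4, "chemical": 2, "physic*": -3}, words)
```

In a weighted profile of this kind a negative weight plays much the same role as a Boolean NOT, but it lowers the score rather than excluding the record outright.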
The most important point concerning the software was that with a
generalized search package rather than separate optimal programs, the
amount of information obtainable on computer costs for the different search
strategies was limited. This is discussed further on pp. 306 et seq.
14.3 Results
Retrieval performance
The doubts raised in the literature over the years concerning the rather
intangible nature of the 'relevance' concept have naturally been extended to