IRE
Information Retrieval Experiment
The pragmatics of information retrieval experimentation
chapter
Jean M. Tague
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced
or transmitted in any form or by any means, including photocopying
and recording, without the written permission of the copyright holder,
application for which should be addressed to the Publishers. Such
written permission must also be obtained before any part of this
publication is stored in a retrieval system of any nature.
user group comes. If all users in the test are scientists, results cannot be
extrapolated to laymen using the same collection, for example.
Sometimes the expertise of searchers or the degree of delegation of the
search are independent variables in the experiment. If not, they should either
be kept constant for all searches (best approach if the query set is small) or
varied randomly (if the query set is large). If search experience is to be held
constant, it should be at a high level. The variability that can result from
inexperienced searchers may be much greater than the variability resulting
from the different treatments under test. Again, this means obtaining the
cooperation of experienced systems personnel well before the experiment
begins and/or offering payment or other inducement.
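
The choice between holding the searcher constant and varying the searcher randomly can be sketched as follows. This is only an illustration: the function name `assign_searchers`, the `mode` values and the fixed `seed` are assumptions introduced here, not part of the original text.

```python
import random


def assign_searchers(queries, searchers, mode="random", seed=0):
    """Map each query to a searcher (hypothetical helper).

    mode="constant": every query goes to the same searcher
                     (suggested when the query set is small).
    mode="random":   each query gets a randomly chosen searcher
                     (suggested when the query set is large).
    """
    if not searchers:
        raise ValueError("at least one searcher is required")
    if mode == "constant":
        # Hold the searcher fixed so searcher skill cannot vary between queries.
        return {q: searchers[0] for q in queries}
    if mode == "random":
        # A seeded generator keeps the assignment reproducible.
        rng = random.Random(seed)
        return {q: rng.choice(searchers) for q in queries}
    raise ValueError("mode must be 'constant' or 'random'")


if __name__ == "__main__":
    plan = assign_searchers(["q1", "q2", "q3"], ["searcher_a", "searcher_b"])
    for query, searcher in plan.items():
        print(query, "->", searcher)
```

Recording the seed alongside the results allows the same assignment to be regenerated if the experiment has to be repeated.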
Also, it is known that the degree of subject competence affects relevance
judgements, i.e. a judge who is familiar with a subject is less likely to accept
a document as relevant than one who is not. It is a good idea to get some
measure of inter-judge consistency in relevance assessments if the real users
are not doing the assessments. Measures similar to those proposed for inter-
indexer consistency may be used.
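
As a minimal sketch of such a measure, the overlap ratio often used for inter-indexer consistency (items chosen by both, divided by items chosen by either) can be applied to the sets of documents two judges accept as relevant. The text does not name a specific measure, so this choice, the function name `judge_consistency` and the sample document identifiers are all assumptions.

```python
def judge_consistency(relevant_a, relevant_b):
    """Overlap between two judges' relevant sets (hypothetical helper).

    Returns |A and B| / |A or B|, a value between 0.0 and 1.0.
    """
    a, b = set(relevant_a), set(relevant_b)
    union = a | b
    if not union:
        # Neither judge accepted any document: treat as full agreement.
        return 1.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # Illustrative judgements only, not data from any experiment.
    judge_1 = {"d1", "d2", "d3"}
    judge_2 = {"d2", "d3", "d4"}
    print(judge_consistency(judge_1, judge_2))  # 2 shared / 4 total = 0.5
```

A value near 1 suggests the assessors largely agree; a low value suggests that results based on their judgements should be interpreted with caution.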
Queries must be clearly stated. If users are the source, they should provide
an initial statement of the query in written or taped form. Of course, this may
be modified during search strategy construction and/or interactive searching,
but the starting point should be clear. If selections are being made from a
repository of queries, those that are unclear on any count should be rejected.
5.6 Decision 6: How to process queries?
In comparative searching, it is essential that all things other than the
variables under test should be equal. This principle is easier to enforce in a
laboratory situation than in an operational one. In any test, searchers should be
provided with sets of instructions, either as a printed manual or online
tutorial. In addition, training and practice sessions for all searchers should be
held prior to the experiment. Frequently, problems which would have arisen
during the experiment can be spotted at this time. Decide before the
experiment what output format is needed and instruct all searchers to this
effect.
Unless their use is to be manipulated experimentally, all searchers should
have equal access to such search aids as lists of computer commands, index
language dictionaries and thesauri, and sample searches. Sometimes,
particularly in laboratory experiments, the investigator may wish to make
searches of the same query in different languages or systems as alike as
possible. Various methods of achieving this control have been used: putting
queries into an intermediate language; restricting the search by time,
number of retrieved documents or number of relevant documents; and using a
common threshold level with ranked document output.
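
One way to picture the last two controls is a shared cutoff applied to every system's ranked output before comparison. The sketch below assumes each system returns (document, score) pairs sorted by descending score; the function name `apply_cutoff`, its parameters and the sample data are hypothetical.

```python
def apply_cutoff(ranked, max_docs=None, min_score=None):
    """Apply the same cutoff to a ranked result list (hypothetical helper).

    ranked:    list of (doc_id, score) pairs, highest score first.
    max_docs:  keep at most this many documents, if given.
    min_score: keep only documents scoring at or above this level, if given.
    """
    kept = ranked
    if min_score is not None:
        kept = [(doc, score) for doc, score in kept if score >= min_score]
    if max_docs is not None:
        kept = kept[:max_docs]
    return [doc for doc, _ in kept]


if __name__ == "__main__":
    # Illustrative output from two systems for the same query.
    system_a = [("d1", 0.9), ("d2", 0.7), ("d3", 0.4)]
    system_b = [("d5", 0.8), ("d1", 0.6), ("d7", 0.3)]
    for name, results in (("A", system_a), ("B", system_b)):
        print(name, apply_cutoff(results, max_docs=2))
```

A score threshold is only comparable across systems if their scores are on the same scale, which is one reason a fixed number of retrieved documents is often the simpler control.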
Many extraneous sources of variation which can occur during computer
searching can be eliminated by careful prechecking of the search
environment. Are all terminals, data sets, printers, etc., in good operating condition?
Are all necessary supplies (paper, pencils, manuals) equally accessible to
all search personnel? Will someone, preferably the experimenter, be on the
scene to handle the inevitable problems and breakdowns which occur?