Information Retrieval Experiment
Gedanken experimentation: An alternative to traditional system testing?
William S. Cooper
Edited by Karen Sparck Jones
Butterworth & Company
little training. Some possible forms which such training might take have
been proposed elsewhere [9]. Fourth, for some purposes it would be desirable
to have the capability of testing an indexer's skill in making the requisite
guesses; this has also been discussed elsewhere. Finally, it might be objected
that since indexers cannot foretell the future they would be unable to make
the required probability estimates with any high degree of accuracy, either
with gedanken experimentation or without. But it was never claimed they
could. It was merely suggested that there is likely to be a tendency for the
numbers they come up with under the gedanken approach to be less
inaccurate as probability estimates than the numbers they would otherwise
come up with. And since in the last analysis an output ranking is always
either explicitly or implicitly a ranking by estimated probability, any
improvement in the accuracy of estimation is a step forward.
Example 2: The indexer as gedanken experimenter, unweighted indexing
Next consider an even simpler retrieval system in which the indexing is
unweighted (or 'binary'), where the searcher submits a single term as his
request, and where the system responds by retrieving for him as an unranked
set all documents indexed under the request term. The common subject card
catalogue is a system of this sort with minor elaborations.
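To make the setting concrete, here is a minimal sketch of such a system, assuming nothing beyond what the paragraph above describes; the data structure and function names are illustrative, not part of the chapter.

    from collections import defaultdict

    # Unweighted ('binary') indexing: a term is either assigned to a
    # document or it is not.
    index = defaultdict(set)          # term -> set of document identifiers

    def index_document(doc_id, terms):
        # Record each assigned term; no weights are involved.
        for term in terms:
            index[term].add(doc_id)

    def retrieve(request_term):
        # The request is a single term; the response is the unranked set
        # of all documents indexed under it.
        return index.get(request_term, set())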
To decide whether or not to assign a term to a document, an indexer
indexing documents for use in such a system can make use of what I have
elsewhere called the 'Odds-Payoff' decision rule [9, 10]. Three steps are required.
First the indexer estimates the odds against satisfaction after the fashion of
the mental experiments of the previous example. Second, he performs
another thought experiment whose result is a judgement of how many
unsatisfactory documents a typical requestor submitting the term under
consideration would be willing to examine and discard as the penalty to be
paid to obtain the document to be indexed. Finally, he compares these two
numbers and assigns the term if and only if the latter exceeds the former.
A variant of this procedure involves substituting a standard average value
for the figure obtained in the second step, thereby eliminating that step and
greatly simplifying the indexing process. The price of the simplification is
that variations in degrees of predicted usefulness among the documents are
ignored.
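A minimal sketch of the three-step rule and its simplified variant might look as follows; the function name, the numeric scale of the odds, and the standard average value are assumptions for illustration, with the two estimates supplied by the indexer's thought experiments.

    def assign_term(odds_against_satisfaction, acceptable_penalty=None,
                    standard_average_penalty=5.0):
        # Step 1 (supplied by the indexer): estimated odds against a
        # typical requestor of this term being satisfied by the document,
        # e.g. 9.0 for odds of 9 to 1 against.
        # Step 2 (supplied by the indexer): how many unsatisfactory
        # documents such a requestor would be willing to examine and
        # discard to obtain this document. Passing None invokes the
        # simplified variant, which substitutes a standard average value
        # and skips this step.
        if acceptable_penalty is None:
            acceptable_penalty = standard_average_penalty
        # Step 3: assign the term if and only if the acceptable penalty
        # exceeds the estimated odds against satisfaction.
        return acceptable_penalty > odds_against_satisfaction

For instance, estimated odds of 4 to 1 against combined with a willingness to examine and discard six unsatisfactory documents would lead to the term being assigned, since 6 exceeds 4.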
Example 3: The requestor as gedanken experimenter
Retrieval requests containing user-weighted terms have been in use for some
time, but the weights are usually regarded vaguely as indicators of
'importance' rather than as estimates of probabilities or functions of
probabilities. Moreover, the weights are not manipulated by the system as
though they had a probabilistic interpretation. Might it be possible to regard
the weights as probabilistic estimates of some kind, and reformulate the
retrieval rules so that the weights are treated as such and used to compute
explicit estimates of the final document probabilities?
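One hypothetical reading, offered purely as an illustration and not as the design the chapter goes on to sketch: treat each weight as the requestor's estimate of the probability that a document indexed under that term would satisfy him, assume the terms contribute independent evidence, and combine the weights of the matched terms through their log-odds.

    import math

    def log_odds(p):
        # Convert a probability estimate to log-odds.
        return math.log(p / (1.0 - p))

    def estimated_probability(matched_weights, prior=0.01):
        # matched_weights: the user-assigned weights of those request
        # terms under which this document is indexed, each read as a
        # probability estimate. The prior and the naive-independence
        # combination are assumptions made for this illustration.
        total = log_odds(prior)
        for w in matched_weights:
            total += log_odds(w) - log_odds(prior)
        return 1.0 / (1.0 + math.exp(-total))

Documents could then be ranked by this explicit estimate, in keeping with the earlier observation that an output ranking is in the last analysis a ranking by estimated probability.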
A crude system using ordinary unweighted document indexing but capable
of handling request-term weights probabilistically might be designed
somewhat as follows. A request consists as in most ordinary weighted-request