IRE Information Retrieval Experiment Gedanken experimentation: An alternative to traditional system testing? chapter William S. Cooper Butterworth & Company Karen Sparck Jones

11.3 Examples

A few simple examples of hypothetical retrieval systems based on the thought-experiment approach may help clarify what is involved.

Example 1: The indexer as gedanken experimenter

Consider a simple retrieval system capable of responding only to single-term requests, but in which the indexing of the documents is weighted: each term assigned to a document has an associated numeric value indicative of its suitability as a descriptor for the document. When a request is received, the system simply ranks for the requestor all the documents to which the request term has been assigned, in descending order of the weight of the assignment.

To make this system explicitly probabilistic one need only instruct the indexers to restrict the numbers they use as weights to the interval between 0 and 1, and to think of these weights as probabilities. Thus if the indexer thinks there is one chance in ten of the document at hand satisfying a user submitting the term under consideration, he should assign that term to the document with weight 0.1. The gedanken experiment he performs in order to arrive at such a figure might run somewhat as follows. The indexer imagines all future system users whose request is the term in question to be transported backward in time and gathered together into a room.
They are then in imagination asked to read or examine carefully copies of the document to be indexed and to raise their hand if it would satisfy at least partially the information need that caused them to submit their request. The proportion the indexer thinks would raise their hands is the weight to be assigned.

A variant of this mental experiment would have the indexer imagine a future searcher under the term to be drawn at random. The indexer would then ask himself, 'If forced to make a small wager, what odds would I be (barely) willing to give in a bet that this searcher would, when the time came, find the document I am about to index to be satisfactory, given that I index it in such a way that he is led to examine it?' It is a simple matter in probability theory to translate betting odds into an approximate subjective probability estimate (odds of a:b in favour correspond to a probability of a/(a+b)); in fact for unlikely events the odds and the corresponding probabilities are almost equal. Thus if the indexer found himself willing to give 1:10 odds for satisfaction (i.e. ten to one odds against satisfaction), he would again be led to attach to the term a weight of approximately 0.1 (strictly 1/11, or about 0.09).

Several points are worth noting about this example. First, for the sake of a simple procedure all considerations of degree of satisfactoriness (that is, all utility-theoretic considerations) have been omitted. There are elaborations of the foregoing gedanken experiments which could take such considerations into account, if they were deemed worthwhile. Second, the retrieval rule (to rank by the indexing weight of the term submitted as request) is so simple that no numeric computation whatsoever would have to be carried out by the system, which could in fact be easily implemented manually. Third, although one might expect an indexer to make more useful guesses under the suggested interpretation of the weights than under no interpretation at all or a vague one about term 'importance', it would be desirable to provide him with a