Information Retrieval Experiment

Chapter: Simulation, and simulation experiments
Michael D. Heine
Edited by Karen Sparck Jones
Butterworth & Company

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

[...] a context of information need before making his decision (which, if true, would create ambiguity itself: the arbiter's mind may roam over a whole spectrum of possible occasions on which the question could have been put)? Or is the arbiter to look for merely linguistic similarity between question and (say) document title? A better experiment would involve relevance judgements being made in a real context of information need, with a `question' appearing not as a term of reference of the judgement but as an articulant serving simply to explore a database in response to an information need. In the latter case the question is clearly a variable for a given instance of need. In the author's opinion, few phrases have done more damage to the advancement of information retrieval than `relevance to a question', in view both of the influence that phrase has had on experimental design and of its inhibiting our awareness of the non-verbal (primitive) basis of relevance decisions. In the writer's view this is simply a consequence of faulty system delimitation. This criticism is of course destructive, and more satisfactory methodologies are consequently required in place of the faulty one.
In principle the correct methodology would be to ask a user to examine all items in a database, and simply note whether each is relevant or non-relevant in relation to a fixed (non-verbal) notion of information need that he says he recognizes. Since this is impracticable for large databases, the two following experimental methodologies might usefully be considered in place of it. The first is to record behavioural evidence of relevance decision making, e.g. the records `used' (in some sense) by the arbiter, or the documents cited in a document written by him. The second methodology is what we term here the `Virtual Attribute Technique'. This would entail (1) masking from the search vocabulary a given term (or other attribute), so that that term becomes virtual or invisible, (2) partitioning the collection to be searched using that term (the relevant set then being the subset of the collection that bears the (virtual) term), and (3) using the remaining search vocabulary in the usual way to try to identify the relevant set so defined. This would have the advantages of objectivity and stability in the relevance assessments (made implicitly by the indexing staff involved in the creation of the database), as well as being consistent with the reality of search-vocabulary development, whereby new terms are introduced precisely in order to capture (novel) concepts of relevance. (For relevant stimulating discussion, see Jablonski [35].) There is no reason why this technique could not be applied to real (operational) databases. Techniques such as the Virtual Attribute Technique are urgently needed if simulation work in particular is to develop usefully beyond its present state. They may even serve to diminish our reliance on experimental work as the latter has tended to be construed: as a recording of relevance decision making et al. in a laboratory-like environment, rather than as work directed at data gathering in a real `user/database interaction' environment.
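As an illustration only, the three steps of the Virtual Attribute Technique described above might be sketched in Python as follows. The toy collection, the choice of `simulation` as the virtual term, the Boolean OR search, and all function names are assumptions introduced for this sketch; none of them is taken from the chapter, and any other search method could stand in for step (3).

```python
from typing import Dict, Set, Tuple

# Hypothetical toy collection: document id -> index terms assigned by indexers.
COLLECTION: Dict[str, Set[str]] = {
    "d1": {"simulation", "retrieval", "evaluation"},
    "d2": {"simulation", "queueing", "evaluation"},
    "d3": {"retrieval", "indexing"},
    "d4": {"simulation", "indexing", "retrieval"},
    "d5": {"queueing", "hardware"},
}


def virtual_attribute_split(
    collection: Dict[str, Set[str]], virtual_term: str
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Steps (1) and (2): mask the term and partition the collection by it."""
    # The relevant set is the subset of documents bearing the (virtual) term.
    relevant = {doc for doc, terms in collection.items() if virtual_term in terms}
    # The searcher sees every document with the virtual term removed.
    masked = {doc: terms - {virtual_term} for doc, terms in collection.items()}
    return relevant, masked


def boolean_or_search(masked: Dict[str, Set[str]], query: Set[str]) -> Set[str]:
    """Step (3): retrieve documents bearing any query term (assumed search method)."""
    return {doc for doc, terms in masked.items() if terms & query}


def recall_precision(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Score the search against the relevant set defined by the virtual term."""
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


relevant, masked = virtual_attribute_split(COLLECTION, "simulation")
retrieved = boolean_or_search(masked, {"evaluation", "queueing"})
recall, precision = recall_precision(retrieved, relevant)
print(sorted(relevant), sorted(retrieved), recall, precision)
```

In this sketch the relevance assessments come entirely from the indexing already present in the collection, so repeating the run with a different virtual term or a different query needs no further judgements from an arbiter.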
10.4 Conclusions
The work undertaken on the simulation of information retrieval systems appears to have five general features. First, it should not be seen in isolation from its natural context: that of information science. It is a mistake to see simulation work [...] from a narrow technical viewpoint, since it readily