Information Retrieval Experiment

Chapter: The pragmatics of information retrieval experimentation
Jean M. Tague
Karen Sparck Jones (editor)
Butterworth & Company

user group comes. If all users in the test are scientists, for example, results cannot be extrapolated to laymen using the same collection.

Sometimes the expertise of searchers or the degree of delegation of the search are independent variables in the experiment. If not, they should either be kept constant for all searches (the best approach if the query set is small) or varied randomly (if the query set is large). If search experience is to be held constant, it should be at a high level. The variability that can result from inexperienced searchers may be much greater than the variability resulting from the different treatments under test. Again, this means obtaining the co-operation of experienced systems personnel well before the experiment begins and/or offering payment or other inducement.

Also, it is known that the degree of subject competence affects relevance judgements, i.e. a judge who is familiar with a subject is less likely to accept a document as relevant than one who is not. It is a good idea to get some measure of inter-judge consistency in relevance assessments if the real users are not doing the assessments. Measures similar to those proposed for inter-indexer consistency may be used.

Queries must be clearly stated. If users are the source, they should provide an initial statement of the query in written or taped form.
Of course, this may be modified during search strategy construction and/or interactive searching, but the starting point should be clear. If selections are being made from a repository of queries, those that are unclear on any count should be rejected.

5.6 Decision 6: How to process queries?

In comparative searching, it is essential that all things other than the variables under test should be equal. This principle is easier to enforce in a laboratory situation than in an operational one.

In any test, searchers should be provided with sets of instructions, either as a printed manual or as an online tutorial. In addition, training and practice sessions for all searchers should be held prior to the experiment. Frequently, problems which would have arisen during the experiment can be spotted at this time. Decide before the experiment what output format is needed and instruct all searchers to this effect. Unless their use is to be manipulated experimentally, all searchers should have equal access to such search aids as lists of computer commands, index language dictionaries and thesauri, and sample searches.

Sometimes, particularly in laboratory experiments, the investigator may wish to make searches of the same query in different languages or systems as alike as possible. Various methods of achieving this control have been used: putting queries into an intermediate language; restricting the search as to time, number of retrieved documents or number of relevant documents; and using a common threshold level with ranked document output.

Many extraneous sources of variation which can occur during computer searching can be eliminated by careful prechecking of the search environment. Are all terminals, data sets, printers, etc., in good operating condition? Are all necessary supplies (paper, pencils, manuals) equally accessible to all search personnel? Will someone, preferably the experimenter, be on the scene to handle the inevitable problems and breakdowns which occur?
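
One of the control methods mentioned above, restricting every search to a common number of retrieved documents when output is ranked, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the chapter: the function names, the system labels, the document identifiers and the choice of precision as the score computed after truncation.

```python
from typing import Dict, List, Set


def truncate_to_common_cutoff(
    ranked_runs: Dict[str, List[str]], cutoff: int
) -> Dict[str, List[str]]:
    """Keep only the top `cutoff` documents of each system's ranked output."""
    if cutoff <= 0:
        raise ValueError("cutoff must be positive")
    return {system: ranking[:cutoff] for system, ranking in ranked_runs.items()}


def precision_at_cutoff(retrieved: List[str], relevant: Set[str]) -> float:
    """Fraction of the retrieved documents that were judged relevant."""
    if not retrieved:
        return 0.0
    hits = sum(1 for doc_id in retrieved if doc_id in relevant)
    return hits / len(retrieved)


# Hypothetical ranked output for one query from two systems under test.
runs = {
    "system_a": ["d3", "d7", "d1", "d9", "d4", "d8"],
    "system_b": ["d7", "d2", "d3"],
}
# Hypothetical relevance judgements for the same query.
relevant_docs = {"d3", "d7", "d4"}

# Apply the same cutoff to every system before scoring.
equalized = truncate_to_common_cutoff(runs, cutoff=3)
scores = {
    system: precision_at_cutoff(docs, relevant_docs)
    for system, docs in equalized.items()
}
```

Applying the cutoff before scoring means that a system which simply returns more documents gains no advantage from doing so; the same pattern could be adapted to the other restrictions listed, such as a limit on search time.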