Information Retrieval Experiment, edited by Karen Sparck Jones. Chapter: The pragmatics of information retrieval experimentation, by Jean M. Tague. Butterworth & Company.

there are very few programs which can analyse natural language responses. As far as possible, categorize and code responses.

A good discussion of the advantages, disadvantages and problems of the various techniques for collecting data about people will be found in Kerlinger20. Biases which are to be avoided include those caused by the observer's, interviewer's or subject's prejudices, inattention, and misunderstanding, and those related to the Hawthorne effect, i.e. the tendency of subjects under study to perform or respond in a manner different from normal.

Response rate is a problem with mail questionnaires. Some tricks which seem to help are: including a 'reward' with the questionnaire (pencils, notepads, lottery tickets, etc.); follow-up inquiries, particularly by telephone; a description of the purpose and sponsors of the research; and a promise of a summary of results. This last technique is especially useful with respondents in the same or allied fields.

Some of the methods mentioned above can also be used to obtain a record of a searching or indexing process. Observation by a person is limited by what he or she can see or hear and, at the same time, record. Automatic recording by camera or by tape recorder, for such aspects of searching as query negotiation, is more efficient and reliable.
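The advice to categorize and code responses before machine analysis can be illustrated with a minimal sketch. The questionnaire item, the category labels, and the numeric codes below are all hypothetical, invented only to show the mapping from free-text answers to a closed, machine-analysable code set.

```python
# Hypothetical coding scheme for a questionnaire item asking how often
# the respondent uses an online search service.  Mapping free-text
# answers onto a small closed set of numeric codes makes the responses
# easy to key, store, and analyse automatically.
FREQUENCY_CODES = {
    "daily": 1,
    "weekly": 2,
    "monthly": 3,
    "rarely": 4,
    "never": 5,
}

def code_response(raw: str) -> int:
    """Map a free-text response onto a numeric category code.
    Unrecognized answers receive code 9 ('other/unclassifiable')."""
    return FREQUENCY_CODES.get(raw.strip().lower(), 9)

responses = ["Weekly", "daily", "whenever I need to", "never"]
coded = [code_response(r) for r in responses]
print(coded)  # [2, 1, 9, 5]
```

Note that the uncodable answer ("whenever I need to") is not discarded but assigned a residual category, so the analysis can still report how often responses fell outside the scheme.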
In all cases, there are possible Hawthorne effects, unless the people involved are not informed that they are under observation. However, in many institutions this last approach would be considered a breach of privacy. The norm in present-day research practice seems to be to make a visual or audio tape only if the subjects have given their permission.

A record of an online search is obtained automatically from the search printout, which, in an experiment, should always be saved. It is also possible to dump this record onto a disk file for later printing or even automatic analysis. For other processes, subjects can be asked to keep a log or a diary, but this method is less reliable than the printout. Detailed instructions to all subjects can minimize the inconsistencies. Keen and Wheatley3 have described a useful form of 'index marking' used in the EPSILON tests of printed index searching.

The most intensive data collection usually occurs at the evaluation stage. Forms design and the coding of responses are important here too, if data are to be keyed into a machine-readable file. Mention has already been made of the desirability of supplying users with two output records, one for their own use and one to be returned with an evaluation. More general questions about user satisfaction and/or attitude can usually best be handled by questionnaire or interview.

The investigator should look into the possibility of using machine-readable instruments for data collection, such as mark-sense cards or optical character recognition (OCR) cards. Although these methods usually have a small error rate associated with them, this may be tolerable in view of the elimination of the input-keying stage. A cost comparison should be made.

Group as well as individual assessments of system effectiveness should be considered. Standard techniques are:
(1) A tape-recorded panel discussion by users, searchers, indexers, and others involved in the experiment.