IRE Information Retrieval Experiment The pragmatics of information retrieval experimentation chapter Jean M. Tague Butterworth & Company Karen Sparck Jones

All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

5.8 Decision 8: How to collect the data?

At each stage of a partial or complete information retrieval test, information about various aspects of the experimental process becomes available. What information is actually collected should depend almost entirely on the purpose of the experiment. Extraneous information should not be collected just because it is there. However, if unusual things seem to be happening with some aspect not originally intended as part of the investigation, the investigator may want to collect data as a pilot study for a later full scale investigation.

Data to be collected from information retrieval experiments divides into four categories:

(1) Data about the database: overall characteristics such as size, distribution of indexing term postings, distribution of number of terms per document, distribution in terms of medium, form, source, age, and subject.

(2) Data about people (users, indexers, searchers, authors, managers, etc.): sociodemographic characteristics, subject competence, experience, preferences, values, attitudes.

(3) Data about processes (indexing, searching, using documents, query negotiation): time, cost, number of steps, types of activities and interactions (people-system, people-people).

(4) Data about results: recall, precision, user satisfaction, efficiency, etc.
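As an illustration of the category (1) data, the following is a minimal sketch of how collection size, the distribution of index term postings, and the distribution of terms per document might be tallied for a machine-readable file. The toy `collection` and the function name `database_profile` are hypothetical and do not come from the original text; a real file would be read from the system under test.

```python
from collections import Counter

# Hypothetical toy collection: document id -> assigned index terms.
collection = {
    "d1": ["retrieval", "evaluation"],
    "d2": ["retrieval", "indexing", "thesaurus"],
    "d3": ["indexing"],
    "d4": ["retrieval", "evaluation", "recall", "precision"],
}


def database_profile(docs):
    """Summarise size, postings per term, and terms per document."""
    postings = Counter()
    for terms in docs.values():
        # Count each term at most once per document.
        postings.update(set(terms))
    # Map "number of distinct terms" -> "number of documents with that many".
    terms_per_doc = Counter(len(set(terms)) for terms in docs.values())
    return {
        "size": len(docs),
        "postings": dict(postings),
        "terms_per_document": dict(terms_per_doc),
    }


profile = database_profile(collection)
```

Here `profile["postings"]` gives the number of documents indexed by each term, and `profile["terms_per_document"]` gives how many documents carry one, two, three, or more terms. For a manual file the same tallies would be estimated from a sample of records rather than computed over the whole collection.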
Data about computerized files can be obtained by appropriate statistical processing. For manual files, the corresponding values may have to be estimated from samples. It is surprising how many operational systems, for example in libraries, keep virtually no statistics on collection size or distribution into different categories. Computer output from an analysis of the database should be in a form appropriate for incorporation in a report of the study. Clear print, capable of being reproduced, upper and lower case symbols, and some graphics capability should be obtainable from present-day computer installations. Graphics are important because trends or patterns can often be detected more readily in graphical than in numerical form.

Data on people involved in a study can be collected by observation, using a person or a recording device such as a camera or tape recorder; by interview, either in person or by telephone; or by questionnaire. In all cases, the instruments used to record the data should be designed well in advance, preprinted where appropriate, and pretested. The analysis of such data should also be planned in advance, so that the forms can be designed and coded in a manner that will expedite the analysis. This advice is particularly important if analysis is by computer. For example, the investigator should know whether the analysis programs can manipulate alphanumeric as well as numeric data (some SPSS implementations, for example, cannot). If alphanumeric data cannot be handled, then response categories such as 'excellent, good, average, fair, poor' should be coded with numbers, not letters.

Data from observation and interview records can be keyed in much more rapidly if they are always entered in the same position on the recording instrument. For example, the right-hand side of a questionnaire can contain boxes showing pairs of question number and response number. Remember that