IRE Information Retrieval Experiment
The pragmatics of information retrieval experimentation
Jean M. Tague
Butterworth & Company
Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

Decision 9: How to analyse the data?

variables, for example vocabulary size as a logarithmic function of collection size. Confidence intervals may be set up for predicted values; however, their accuracy and reliability depend upon an assumption of at least approximate normality.

Although superficially like the preceding problem, forecasting future values of some variable on the basis of past values is not really amenable to regression techniques. This is because regression is based on the assumption of independent observations. Time series, such as daily use of a system or monthly recall/precision figures for an SDI profile, are obviously dependent observations: one day's or month's value is related to previous ones. Time series analysis, which consists of analysing a series in terms of trends, periodic or seasonal components, and random fluctuations, is discussed in detail in a number of monographs, for example Gilchrist26 and Box and Jenkins27.

Implementation

Finally, there is the question of the medium for data analysis. There are two ways:

(1) Manual tabulations, possibly using hand calculators. This is convenient in the sense that it can be done internally, but may not in the long run be the least expensive method. It is necessary, of course, for the analysis of non-formatted data.
Manual tabulations have a very high probability of error, so that, to be sure of results, all calculations must be verified. This can be very tedious, particularly if results do not tally the first time.

(2) Computer-based statistical packages. The chance of error is much reduced here, though, of course, data input must still be verified. The best known statistical packages are SPSS (Statistical Package for the Social Sciences), SAS (Statistical Analysis System), and BMD (Biomedical Computer Programs), and it is probably best to use one of these if you are carrying out a wide range of different types of analysis on the same data. The actual tests available with these packages vary, to some extent, from installation to installation. For example, some installations have non-parametric tests not described in the SPSS Manual. A useful introduction to the three packages listed above will be found in Moore28.

It is important, however, to understand the function of the different tests in the packages. Their very comprehensiveness makes them susceptible to misuse. Anyone contemplating the use of statistical packages should study the manual carefully prior to data collection. Much time and expense at the data analysis stage can be saved by collecting data in a form that is amenable to entry into an SPSS or other package file.

Basically, data is entered case by case, each case consisting of several fields defining characteristics of the case. Sometimes there is a problem in deciding what a case is. For example, in a study of retrieval, is a case a searcher, a user, a query, a search, or a search statement? It all depends on the purpose of the analysis. A case should be the simplest, most atomic experimental unit to be examined in the study. If users have several queries and queries consist of a sequence of search statements, and if interest is in the effectiveness of various ways of structuring search statements, then a case is a single search statement.
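The case-by-case layout described above can be sketched in modern terms as follows. This is only an illustration, not the format of any particular package: the field names and values are invented, and the point is simply that when each case (here, a single search statement) is one record with fields identifying the user and query it belongs to, the file can be read directly into a statistical package.

```python
import csv
import io

# Hypothetical cases: one record per search statement, with fields
# linking each statement to its user and query, plus measured
# characteristics of interest (all names and values are invented).
rows = [
    {"user_id": 1, "query_id": 1, "statement_no": 1, "n_terms": 3, "n_retrieved": 120},
    {"user_id": 1, "query_id": 1, "statement_no": 2, "n_terms": 5, "n_retrieved": 14},
    {"user_id": 2, "query_id": 2, "statement_no": 1, "n_terms": 2, "n_retrieved": 560},
]

# Written out case by case, one record per line, the data is in a form
# a statistical package can load without further hand-editing.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```

Note that choosing the search statement, rather than the query or the user, as the case means per-query or per-user summaries can still be recovered later by grouping on the identifying fields, whereas the reverse aggregation cannot be undone.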