Information Retrieval Experiment. Chapter: The pragmatics of information retrieval experimentation, by Jean M. Tague. Edited by Karen Sparck Jones. Butterworth & Company.

Decision 3: How to operationalize the variables?

where $Q$ is the compactness of the collection:

$$Q = \frac{1}{m} \sum_{i=1}^{m} S(c, d_i)$$

and $Q_j$ is the compactness of the collection when term $j$ is deleted:

$$Q_j = \frac{1}{m} \sum_{i=1}^{m} S^*(c, d_i)$$

where

$$S^*(c, d_i) = \frac{\sum_{k \neq j} c_k d_{ik}}{\left( \sum_{k \neq j} c_k^2 \sum_{k \neq j} d_{ik}^2 \right)^{1/2}}$$

Thus, a term is discriminating to the extent that it decreases the average similarity of the document set.

(6) Degree of pre-coordination of index terms. Operational definition: number of index terms per index phrase. Averaging may take place either over all entries (types) in the dictionary of the language or over all tokens in the database.

(7) Degree of syntactic control (i.e. roles, links, relational operators). Operational definition: number of operators per document. Since this measure confounds indexing exhaustivity and syntactic control, a better measure might be the ratio: number of operator assignments/number of index term assignments.

(8) Accuracy of indexing. Operational definition: number of indexing errors, as determined by a judge or by reference to a standard set of term assignments. Two types of errors are distinguished: errors of omission (a term omitted) and errors of commission (an incorrect term added). Validity presents a problem here: why is the judge or standard more `correct'? It is difficult to assess indexing correctness validly without retrieval. Retrieval, however, is no real solution. Why should a particular set of queries be used to test indexing?
No indexer or judge can foretell all future uses of a document. The best one can do is assume that the best judges will be those who have worked with the users of the collection.

(9) Inter-indexer consistency. Operational definition: various ratios lying between 0 and 1 have been suggested, for example

$$\frac{N(A \cap B)}{N(A \cup B)} \quad \text{or} \quad \frac{N(A \cap B)}{\left( N(A)\, N(B) \right)^{1/2}}$$

where $N(A)$ and $N(B)$ are the numbers of index terms assigned by indexers A and B, respectively, $N(A \cap B)$ is the number of terms assigned by both A and B, and $N(A \cup B)$ is the number of terms assigned by either A or B.

Queries, search statements, and the search process

`Query' will be used here to mean the verbalized statement of a user's need. A `search statement' is a single string, expressed in the language of the system, which triggers a search of the database, i.e. causes a search algorithm to scan the database and output a response. A `search process' is a sequence of search statements, all relating to the same query.
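The compactness measures $Q$ and $Q_j$ defined earlier in this section can be sketched in code. This is a minimal illustration, not the text's implementation: the toy document vectors are invented, and the sign convention (a positive $Q_j - Q$ marking a discriminating term) follows the text's remark that a discriminating term decreases the average similarity of the document set.

```python
import math

def cosine(u, v):
    """Cosine similarity between two term vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u) * sum(b * b for b in v))
    return num / den if den else 0.0

def compactness(docs, skip=None):
    """Q: average similarity S(c, d_i) of each document to the
    collection centroid c.  If skip is a term index j, that term is
    deleted first, giving Q_j (the similarity S* in the text)."""
    keep = [k for k in range(len(docs[0])) if k != skip]
    reduced = [[d[k] for k in keep] for d in docs]
    m = len(reduced)
    centroid = [sum(d[i] for d in reduced) / m for i in range(len(keep))]
    return sum(cosine(centroid, d) for d in reduced) / m

def discrimination_value(docs, j):
    """Q_j - Q: positive when deleting term j makes the collection
    more compact, i.e. term j was holding documents apart -- a
    discriminating term."""
    return compactness(docs, skip=j) - compactness(docs)

# Toy collection (invented): term 0 occurs in every document, term 1
# in only two, so term 1 discriminates and term 0 does not.
docs = [[1, 1], [1, 1], [1, 0]]
```

On this toy collection, `discrimination_value(docs, 1)` is positive and `discrimination_value(docs, 0)` is negative, matching the intuition that a term present in every document cannot separate them.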
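The two inter-indexer consistency ratios in (9) translate directly into set operations. A minimal sketch; the function names and the indexers' term lists below are invented for illustration.

```python
import math

def consistency_jaccard(a, b):
    """N(A intersect B) / N(A union B): shared terms over all terms used."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def consistency_cosine(a, b):
    """N(A intersect B) / (N(A) * N(B)) ** 0.5: shared terms over the
    geometric mean of the two indexers' term counts."""
    a, b = set(a), set(b)
    den = math.sqrt(len(a) * len(b))
    return len(a & b) / den if den else 0.0

# Invented example: two indexers' term assignments for one document.
indexer_a = ["retrieval", "indexing", "evaluation"]
indexer_b = ["retrieval", "indexing", "queries", "users"]
```

Here `consistency_jaccard(indexer_a, indexer_b)` is 2/5 = 0.4 and the cosine form is 2/√12 ≈ 0.58; both lie between 0 and 1, as the operational definition requires.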