SP500207 NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Okapi at TREC (chapter)
S. Robertson, S. Walker, M. Hancock-Beaulieu, A. Gull, M. Lau
National Institute of Standards and Technology, Donna K. Harman

feedback in citym2). In accordance with the general philosophy of Okapi, the TREC experiments were used to test the use of simple statistical techniques, with minimal linguistic processing, minimal searcher knowledge of techniques of searching, and indeed minimal effort generally. The routing test was intended to address mainly the value of relevance feedback (term selection and weighting) in a routing context where relevance judgements are accumulated from earlier runs. Automatic ad-hoc queries tested the weighting scheme without relevance information. Manual ad-hoc queries tested the combination of human intelligence with a simple weighting scheme, with and without feedback.

5.1 Automatic processing of topics

The basic principle was to take specific section(s) of the topic and parse them in standard Okapi fashion, as if they had been typed in verbatim by a searcher. Thus stopwords were removed; a few phrases and/or members of synonym classes were identified; remaining words were stemmed; and all search terms (stems, phrases or synonym classes) were weighted using IDF (see section 2.1). No special account was taken of the negative phrases which appear in some of the TREC topics, so negated words would have been given positive weights by Okapi.

The selection of the topic sections was the subject of a very small amount of initial experimentation using the training set. The differences were small and not very consistent, and more testing would have been useful. However, marginally the best overall was Concepts only, and that was what we used for the returned results. The results for the above automatic analysis of ad-hoc queries are given in the official tables as cityal.
5.2 Routing queries

The principle on the routing queries was to assume that all the known relevant documents from the first document set were already available for a relevance feedback process. Thus any actual searches conducted on the first document set, and their actual outputs, played no direct part in the formulation of the routing queries, with one exception discussed below. However, the terms extracted from the topics took part in the relevance feedback process in the manner indicated in section 3.4, with a bias equivalent to 10 supposed relevant documents in all of which the topic terms were supposed to occur (10 out of 10 bias).

The exception to the above statement was that for some topics, some additional relevance assessments were made (that is, additional to those provided centrally). These were based on the top-ranked documents retrieved in automatic searches on the first document set. (See section 6.1 for a discussion of the local relevance judgements and the reasons for this decision.) The results for the above analysis of routing queries are given in the official tables as cityri.

5.3 Manual searching and feedback

The central idea behind these experiments was to approach as closely as possible the situation of a naive or inexperienced user. In other words, we wanted to gain some idea of how the system would perform if searched by an end-user with little or no knowledge of information retrieval. This intention reflects the design principles of the interactive Okapi, as discussed in section 2 above. To some degree, however, both the design of the TREC experiment in general and the constraints of the distributed system described in section 3.1 forced deviations from that ideal.

5.3.1 Searchers

The first constraint is of course that we had no access to end-users (and more particularly, no access to end-users with the specific characteristics of the TREC analysts).
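The "10 out of 10" bias described above can be sketched with the standard Robertson/Sparck Jones relevance weight. This is an assumed reading, not the paper's code: the weight formula and the way the supposed documents are folded into the counts (adding 10 to r, R, n and N for topic terms) are one plausible interpretation of "10 supposed relevant documents in all of which the topic terms occur"; section 3.4 of the paper gives the actual scheme.

```python
import math

def rsj_weight(r, R, n, N):
    # Robertson/Sparck Jones relevance weight with 0.5 smoothing:
    # r = relevant docs containing the term, R = relevant docs,
    # n = docs containing the term, N = collection size.
    return math.log(((r + 0.5) / (R - r + 0.5)) /
                    ((n - r + 0.5) / (N - n - R + r + 0.5)))

def routing_weight(r, R, n, N, topic_term, bias=10):
    """Weight a term from accumulated relevance judgements.

    Terms taken from the topic get a bias equivalent to `bias` extra
    relevant documents, all supposed to contain the term (the paper's
    '10 out of 10' bias); how n and N are adjusted is an assumption.
    """
    if topic_term:
        r += bias
        R += bias
        n += bias   # the supposed documents also contain the term
        N += bias
    return rsj_weight(r, R, n, N)

# A topic term keeps a positive weight even before any real
# relevance evidence for it has accumulated.
w_no_evidence = routing_weight(0, 0, 100, 10_000, topic_term=True)
w_some_evidence = routing_weight(5, 5, 100, 10_000, topic_term=True)
```

The bias thus acts as a prior anchoring topic terms in the query, which real judgements can then reinforce or dilute as they accumulate across runs.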
We used a panel of searchers, mainly information science students, who could be said to have some knowledge of searching in general, limited domain knowledge (depending on the topic), and no particular knowledge of the system. (For reasons to do with the very limited time available for these searches, it was necessary to use project staff for a few searches; these staff obviously had more knowledge of the system.) The somewhat limited interface to the BSS which was used for this experiment required some training of the searchers.