Figure 1. The TREC Task. [Diagram showing the training topics (1-100) and test topics (101-150), the training documents (disks 1 and 2) and test documents (disk 3), and the query sets Q1, Q2, and Q3 built from them.]

The test design was based on traditional information retrieval models, and evaluation used traditional recall and precision measures. The diagram of the test design shows the various components of TREC (fig. 1). This diagram reflects the four data sets (two sets of topics and two sets of documents) that were provided to participants. These data sets (along with a set of sample relevance judgments for the 100 training topics) were used to construct three sets of queries. Q1 is the set of queries (probably multiple sets) created to help in adjusting a system to this task, to create better weighting algorithms, and in general to train the system for testing. The results of this research were used to create Q2, the routing queries to be used against the test documents. Q3 is the set of queries created from the test topics as adhoc queries for searching against the training documents. The results from searches using Q2 and Q3 were the official test results sent to NIST.

2.1 Specific Task Guidelines

Because the TREC participants used a wide variety of indexing/knowledge base building techniques, and a wide variety of approaches to generate search queries, it was important to establish clear guidelines for the evaluation task. The guidelines deal with the methods of indexing/knowledge base construction, and with the methods of generating the queries from the supplied topics. In general, they were constructed to reflect an actual operational environment, and to allow as fair as possible a separation among the diverse query construction approaches.

There were guidelines for constructing and manipulating the system data structures. These structures were defined to consist of the original documents, any new structures built automatically from the documents (such as inverted files, thesauri, conceptual networks, etc.), and any new structures built manually from the documents (such as thesauri, synonym lists, knowledge bases, rules, etc.). The following guidelines were developed for the TREC task.

1. System data structures should be built using the initial training set (documents from disks 1 and 2, training topics 1-100, and the relevance judgments). They may be modified based on the test documents from disk 3, but not based on the test topics.

2. There are parts of the test collection, such as the Wall Street Journal and the Ziff material, that contain manually assigned controlled or uncontrolled index terms. These fields are delimited by SGML tags, as specified in the documentation files included with the data. Since the primary focus is on retrieval and routing of naturally occurring text, these manually indexed terms should not be used.

3. Special care should be used in handling the routing task. In a true routing situation, a single document would be indexed and compared against the routing topics. Since the test documents are generally indexed as a complete set, routing should be simulated by not using any information based on the full set of test documents (such as weighting based on the test collection, total frequency based on the test collection, etc.) in the searching. It is permissible, however, to use collection information from the training set, as sketched below.
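To make guideline 3 concrete, the following Python sketch shows one way a routing run could avoid test-collection statistics: all term weights (here, simple idf weights) are computed once from the training documents on disks 1 and 2, and each disk 3 document is then scored independently against a routing query. The function names, tokenization, and idf formula are illustrative assumptions, not part of the TREC guidelines.

    import math
    from collections import Counter

    def training_idf(training_docs):
        """Compute idf weights from the training collection only.
        training_docs: iterable of documents, each given as a list of tokens."""
        doc_freq = Counter()
        n_docs = 0
        for tokens in training_docs:
            n_docs += 1
            doc_freq.update(set(tokens))
        return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

    def route_score(query_terms, doc_tokens, idf):
        """Score one incoming test document against a routing query.
        Only the precomputed training-set idf is consulted, so no statistic
        derived from the full test collection enters the score."""
        present = set(doc_tokens)
        return sum(idf.get(term, 0.0) for term in query_terms if term in present)

    # Hypothetical usage: weights are frozen from the training documents,
    # then each test document is scored as if it arrived one at a time.
    idf = training_idf([["stock", "market", "crash"], ["merger", "stock", "offer"]])
    print(route_score(["stock", "merger"], ["merger", "talks", "stock"], idf))

In an actual routing run the weights would come from a full indexing of disks 1 and 2; the point of the sketch is only that nothing in the per-document scoring depends on the test collection as a whole.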
Additionally, there were guidelines for constructing the queries from the provided topics. These guidelines were considered of great importance for fair system comparison and were therefore carefully constructed. Three generic categories were defined, based on the amount and kind of manual intervention used.

1. AUTOMATIC (completely automatic initial query construction)

adhoc queries -- The system will automatically extract information from the topic to construct the query. The query will then be submitted to the system (with no manual modifications), and the results from the system will be the results submitted to NIST. There should be no manual intervention that would affect the results.

routing queries -- The queries should be