NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman

System Summary and Timing
Australian Computing and Communications Institute

General Comments

The timings should be the time to replicate runs from scratch, not including trial runs, etc. The times should also be reasonably accurate. This sometimes will be difficult, such as getting total time for document indexing of huge text sections, or manually building a knowledge base. Please do your best.

I. Construction of indices, knowledge bases, and other data structures (please describe all data structures that your system needs for searching)

The software does not invert the text. It inverts the queries (or filters) and passes the text through the combined index formed from the queries.

II. Query construction (please fill out a section for each query construction method used)

D. Automatically built queries (routing)
   1. topic fields used
      Concept field used
   2. total computer time to build query (cpu seconds)
      < 5 seconds
   3. which of the following were used in building the query?
      a. terms selected from
         (1) topic
      b. term weighting
         (3) with weights based on terms from documents with relevance judgments
             Terms weighted with weights based on terms from documents with relevance judgments, and dynamically modified through the training set and the test set.
      c. phrase extraction
         (1) from topics
      j. automatic addition of Boolean connectors or proximity operators
         (1) using information from the topics

E. Manually constructed queries (routing)
   1. topic fields used
      All topic fields used
   2. average time to build query (minutes)
      30 minutes
   3. type of query builder
      b. system expert
   4. data used for building query
      a. from training topic
   6. which of the following were used?
      b. Boolean connectors (AND, OR, NOT)
      c. proximity operators

III. Searching

A. Total computer time to search (cpu seconds)
   One message through 200 filters per second. This includes searching and ranking.
   1. retrieval time (total cpu seconds between when a query enters the system until a list of document numbers are obtained)
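
The approach described in Section I (inverting the queries rather than the text, then passing each message through the combined index) can be illustrated with a minimal Python sketch. This is not the participants' software. The function names, the whitespace tokenization, the example filters, and the additive weighting are all assumptions made for illustration, and the Boolean connectors and proximity operators mentioned in Section II are left out.

```python
from collections import defaultdict


def build_filter_index(filters):
    """Invert the filters: map each term to the (filter_id, weight) pairs that use it."""
    index = defaultdict(list)
    for filter_id, weighted_terms in filters.items():
        for term, weight in weighted_terms.items():
            index[term].append((filter_id, weight))
    return index


def score_message(index, text):
    """Pass one message through the combined index and rank the matching filters."""
    scores = defaultdict(float)
    # Assumption: lowercase whitespace tokens, each distinct term counted once.
    for term in set(text.lower().split()):
        for filter_id, weight in index.get(term, ()):
            scores[filter_id] += weight
    # Highest score first; ties broken by filter id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical filters with hypothetical term weights.
filters = {
    "topic_a": {"merger": 2.0, "acquisition": 1.5},
    "topic_b": {"merger": 0.5, "oil": 2.0},
}
index = build_filter_index(filters)
ranked = score_message(index, "Oil company announces merger")
print(ranked)
```

In this arrangement the index is built once from the filters, and the work per message depends on the terms in that message and the filters sharing those terms, not on the size of the text collection.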