NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman

g. total manual time to build (approximate number of hours)  16
h. use of manual labor
   (4) other (describe)  Search for WSJ terminology in library and from topics.
2. externally-built auxiliary file  no

II. Query construction (please fill out a section for each query construction method used)

A. Automatically built queries (ad hoc)
1. topic fields used  <TITLE>, <DESC>, <NARR>, <CON>
2. total computer time to build query (cpu seconds)  5 (average for each query)
3. which of the following were used?
   a. term weighting with weights based on terms in topics  yes + others
   b. expansion of queries using previously-constructed data structure (from part I)  yes
      (1) which structure?  word-pair phrase file

B. Manually constructed queries (ad hoc)
1. topic fields used  <TITLE>, <DESC>, <NARR>, <CON>
2. average time to build query (minutes)  300 minutes for 25 queries
3. type of query builder
   b. computer system expert
4. tools used to build query
   a. word frequency list  sometimes
   b. knowledge base browser (knowledge base described in part I)  no
   c. other lexical tools (identify)  no
5. which of the following were used?
   a. term weighting
   b. Boolean connectors (AND, OR, NOT)
   d. addition of terms not included in topic
      (1) source of terms  word-pair phrase file

C. Feedback (ad hoc)
1. initial query built by method 1 or method 2?  method 1
2. type of person doing feedback
   b. system expert
3. average time to do complete feedback
   a. cpu time (total cpu seconds for all iterations)  12 per query per iteration (no expansion); … (with expansion)
   b. clock time from initial construction of query to completion of final query (minutes)  60 per query to do relevance judgments
4. average number of iterations  1
   a. average number of documents examined per iteration  10
5. minimum number of iterations  1
6. maximum number of iterations  1
7. what determines the end of an iteration?  deadline + lack of manpower
8. feedback methods used
   a. automatic term reweighting from relevant documents
   b. automatic query expansion from relevant documents
      (2) only top X terms added (what is X)  Top 20 most "activated" terms that have document frequency < 2000 were used. Because many were already in the query, about 12 on average were new and added per query.
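
The expansion rule in II.C.8.b (take the top 20 most "activated" terms with document frequency below 2000, then add only those not already in the query) can be illustrated with a short sketch. The Python below is a minimal, hypothetical rendering of that selection logic: the appendix does not define the "activation" score, so a simple count of relevant documents containing each term stands in for it, and all function and parameter names are assumptions rather than the system's actual code.

    from collections import Counter

    def expand_query(query_terms, relevant_docs, doc_freq,
                     top_x=20, max_df=2000):
        """Add up to top_x new 'activated' terms to a feedback query.

        query_terms   -- terms in the current query (list of str)
        relevant_docs -- token lists of judged-relevant documents
        doc_freq      -- dict mapping term -> collection document frequency
        """
        # Stand-in "activation" score: the number of relevant documents
        # containing the term (the real measure is unspecified here).
        activation = Counter()
        for doc in relevant_docs:
            for term in set(doc):
                activation[term] += 1

        # Rank candidates by activation, keeping only terms under the
        # document-frequency cutoff (df < 2000 in the appendix).
        candidates = [t for t, _ in activation.most_common()
                      if doc_freq.get(t, 0) < max_df]

        # Take the top 20; terms already in the query are skipped, which
        # is why only about 12 new terms were added per query on average.
        existing = set(query_terms)
        new_terms = [t for t in candidates[:top_x] if t not in existing]
        return query_terms + new_terms

Note that the sketch selects the top 20 terms before removing duplicates, rather than topping up to 20 new terms; this matches the appendix's observation that only about 12 of the 20 were new on average.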