NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman
g. total manual time to build (approximate number of hours) 16
h. use of manual labor
(4) other (describe) Search for WSJ terminology in library and from topics.
2. externally-built auxiliary file no
II. Query construction
(please fill out a section for each query construction method used)
A. Automatically built queries (ad hoc)
1. topic fields used <TITLE>, <DESC>, <NARR>, <CON>
2. total computer time to build query (cpu seconds) 5 (average for each query).
3. which of the following were used?
a. term weighting with weights based on terms in topics yes + others
b. expansion of queries using previously-constructed data structure (from part I) yes
(1) which structure? word-pair phrase file (sketched below)
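The automatic method above (topic-term weighting plus expansion from the word-pair phrase file) can be pictured with a short sketch. This is a minimal sketch in Python, assuming a whitespace tokenizer, an in-memory phrase file, and an arbitrary 0.5 weight for expansion terms; none of these details are taken from the actual system.

    # Hypothetical sketch: weight terms from the topic fields, then expand
    # the query with entries from a word-pair phrase file.
    from collections import Counter

    def build_query(topic_fields, phrase_file):
        """topic_fields: strings from <TITLE>, <DESC>, <NARR>, <CON>.
        phrase_file: dict mapping a topic term to related word-pair phrases."""
        weights = Counter()
        for field in topic_fields:
            for term in field.lower().split():
                weights[term] += 1            # weight based on terms in topics
        for term in list(weights):            # expansion via the phrase file
            for phrase in phrase_file.get(term, ()):
                weights[phrase] += 0.5        # assumed lower expansion weight
        return weights

    q = build_query(["pension fund regulation"], {"pension": ["pension fund"]})
    # Counter({'pension': 1, 'fund': 1, 'regulation': 1, 'pension fund': 0.5})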
B. Manually constructed queries (ad hoc)
1. topic fields used <TITLE>, <DESC>, <NARR>, <CON>
2. average time to build query (minutes) 300 minutes for 25 queries
3. type of query builder
b. computer system expert
4. tools used to build query
a. word frequency list sometimes
b. knowledge base browser (knowledge base described in part I) no
c. other lexical tools (identify) no
5. which of the following were used?
a. term weighting
b. Boolean connectors (AND, OR, NOT)
d. addition of terms not included in topic
(1) source of terms word-pair phrase file (illustrated below)
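A manually built query of the kind described in section B (Boolean connectors over weighted topic terms, plus terms added from the phrase file) is illustrated below. This is a minimal sketch, assuming a dict-based query tree and a set-of-terms document model; the representation is hypothetical, not the system's actual query language.

    # Hypothetical Boolean query tree: a leaf is a (term, weight) tuple,
    # an inner node is {connector: [children]}.
    def matches(node, doc_terms):
        if isinstance(node, tuple):                 # leaf: (term, weight)
            return node[0] in doc_terms
        (op, children), = node.items()
        hits = (matches(child, doc_terms) for child in children)
        if op == "AND":
            return all(hits)
        if op == "OR":
            return any(hits)
        if op == "NOT":
            return not any(hits)
        raise ValueError("unknown connector: " + op)

    # Topic terms plus a term added from the word-pair phrase file.
    query = {"AND": [{"OR": [("pension", 2.0), ("pension fund", 1.0)]},
                     {"NOT": [("lottery", 1.0)]}]}
    matches(query, {"pension", "reform"})           # True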
C. Feedback (ad hoc)
1. initial query built by method 1 or method 2? method 1
2. type of person doing feedback
b. system expert
3. average time to do complete feedback
a. cpu time (total cpu seconds for all iterations)
12 per query per iteration--no expansion
`I"" " --with expansion
b. clock time from initial construction of query to completion of final query (minutes)
60 per query to do relevance judgment
4. average number of iterations 1
a. average number of documents examined per iteration 10
5. minimum number of iterations 1
6. maximum number of iterations 1
7. what determines the end of an iteration? deadline + lack of manpower
8. feedback methods used
a. automatic term reweighting from relevant documents (sketched below)
b. automatic query expansion from relevant documents
(2) only top X terms added (what is X)
Top 20 most 'activated' terms that have document frequency < 2000 were used. Because many were already in the query, about 12 on the average were new and added per query.
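The two feedback methods answered above can be pictured with a short sketch: reweight query terms that occur in relevant documents, and add the top 20 activated terms whose document frequency is under 2000, skipping terms already in the query. This is a minimal sketch, assuming precomputed activation scores and document frequencies; the function names, the boost factor, and the score representation are assumptions, not the system's code.

    # Hypothetical reweighting: boost query terms seen in relevant documents.
    def reweight_from_feedback(weights, relevant_doc_terms, boost=1.5):
        return {t: (w * boost if t in relevant_doc_terms else w)
                for t, w in weights.items()}

    # Hypothetical expansion: top 20 activated terms with df < 2000, about
    # 12 of which are new to the query on average (per the answer above).
    def expand_from_feedback(query_terms, activation, doc_freq,
                             top_x=20, df_cutoff=2000):
        """query_terms: set of current query terms; activation and doc_freq:
        dicts mapping term -> score and term -> document frequency."""
        ranked = sorted(activation, key=activation.get, reverse=True)
        kept = [t for t in ranked if doc_freq.get(t, 0) < df_cutoff][:top_x]
        return query_terms | {t for t in kept if t not in query_terms}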