NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)

Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman

System Summary and Timing
GE Research and Development Center

General Comments
The timings should be the time to replicate runs from scratch, not including trial runs, etc. The times should also be reasonably accurate. This sometimes will be difficult, such as getting total time for document indexing of huge text sections, or manually building a knowledge base. Please do your best.

I. Construction of indices, knowledge bases, and other data structures (please describe all data structures that your system needs for searching)
   We did no pre-indexing of the data.
   B. Statistics on data structures built from TREC text
      No data provided.
   C. Data built from sources other than the input text
      No.

II. Query construction (please fill out a section for each query construction method used)
   B. Manually constructed queries (ad hoc)
      1. topic fields used
         Mostly description, narrative, and concepts.
      2. average time to build query (minutes)
         About 20 minutes for initial query.
      3. type of query builder
         b. computer system expert
      4. tools used to build query
         b. knowledge base browser (knowledge base described in part I)
            (1) which structure from part I
                Inverted samples of corpus.
      5. which of the following were used?
         b. Boolean connectors (AND, OR, NOT)
         c. proximity operators
         d. addition of terms not included in topic
            (1) source of terms
                System lexicon, statistical analysis of samples matched by initial queries.
   C. Feedback (ad hoc)
      We did not do feedback, but we did query refinement.
   E. Manually constructed queries (routing)
      Ad hoc and routing were done using the same method.
      1. topic fields used
      2. average time to build query (minutes)
         About 20 minutes for initial query.
      3. type of query builder
         b. system expert
      4. data used for building query
         b. from all training documents
            Statistical analysis of samples retrieved.
         c. from documents with relevance judgments
            Used for training, testing, and word frequency analysis.
      5. tools used to build query
         d. machine analysis of training documents
            (1) describe
                Word weighting analysis to determine what terms to add to queries.
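
The word weighting analysis named in the last item is not described further in this summary. As an illustration only, the sketch below shows one common way such an analysis can be done: terms are ranked by a smoothed log-odds weight comparing their document frequency in judged-relevant and judged-non-relevant training documents, and the top-ranked terms not already in the query are proposed as additions. The weighting formula, the tokenizer, and the names `tokenize`, `document_frequencies`, and `candidate_terms` are assumptions made for this example; they are not taken from the GE system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequencies(docs):
    """Count, for each term, how many documents contain it at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    return counts


def candidate_terms(relevant, nonrelevant, query_terms, top_k=5):
    """Rank terms absent from the query by a smoothed log-odds weight.

    Assumption: the weight compares how often a term occurs in relevant
    versus non-relevant training documents. The source does not state
    which formula was actually used.
    """
    rel_df = document_frequencies(relevant)
    non_df = document_frequencies(nonrelevant)
    n_rel, n_non = len(relevant), len(nonrelevant)
    exclude = {t.lower() for t in query_terms}

    scored = []
    for term, r in rel_df.items():
        if term in exclude:
            continue
        n = non_df.get(term, 0)
        # Add 0.5 to each count so the odds are defined when a count is zero.
        p_rel = (r + 0.5) / (n_rel + 1.0)
        p_non = (n + 0.5) / (n_non + 1.0)
        weight = math.log(p_rel / (1 - p_rel)) - math.log(p_non / (1 - p_non))
        scored.append((term, weight))

    # Highest weight first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]
```

Called with a handful of judged training documents and the current query terms, `candidate_terms` returns `(term, weight)` pairs. In a manual workflow like the one reported here, a system expert would review the suggestions before adding any of them to a query.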