NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman
System Summary and Timing
TRW
General Comments
The timings should be the time to replicate runs from scratch, not including trial runs, etc. The times should also
be reasonably accurate. This sometimes will be difficult, such as getting total time for document indexing of huge
text sections, or manually building a knowledge base. Please do your best.
I. Construction of indices, knowledge bases, and other data structures (please describe all data structures that
your system needs for searching)
A. Which of the following were used to build your data structures?
None--we used a free text scanning approach. CD-ROM data was decompressed and loaded
onto magnetic disk in raw form.
B. Statistics on data structures built from TREC text (please fill out each applicable section) --none
C. Data built from sources other than the input text --none
II. Query construction
(please fill out a section for each query construction method used)
A. Automatically built queries (ad hoc)
We performed some initial trials with building queries based on word frequency taken from
documents from the initial relevance judgments supplied by NIST. Unfortunately, this
appeared to lead us down a blind alley, perhaps because the initial judgments were not all
that good. We are planning to try this again with the new judgments.
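The word-frequency approach described above can be sketched roughly as follows. This is a minimal illustration, not the TRW implementation: the stopword list, length threshold, function name, and sample documents are all assumptions added for the example.

```python
from collections import Counter
import re

# Illustrative stopword list (an assumption, not TRW's actual list).
STOPWORDS = {"the", "of", "and", "a", "to", "in", "is", "for", "on", "that"}

def frequency_query_terms(relevant_docs, n_terms=10):
    """Select the most frequent non-stopword terms from known-relevant
    documents as candidate query terms."""
    counts = Counter()
    for doc in relevant_docs:
        for token in re.findall(r"[a-z]+", doc.lower()):
            # Skip stopwords and very short tokens.
            if token not in STOPWORDS and len(token) > 2:
                counts[token] += 1
    return [term for term, _ in counts.most_common(n_terms)]

# Hypothetical "judged relevant" documents for one topic.
docs = [
    "Oil spill cleanup crews contained the oil slick near the coast.",
    "The tanker leaked crude oil, and the spill spread along the coast.",
]
print(frequency_query_terms(docs, n_terms=3))  # ['oil', 'spill', 'coast']
```

As the passage notes, terms chosen this way are only as good as the relevance judgments they come from: if the judged documents are off-topic, the frequent terms will be too.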
B. Manually constructed queries (ad hoc) We primarily used this method for the TREC queries.
1. topic fields used
2. average time to build query (minutes)
The initial query would take a couple of minutes to manually form by cutting and
pasting from the topic descriptions with a text editor. "Feedback" was the human
looking at the retrieved documents, comparing with the sample good documents
supplied by NIST, making independent judgments on document relevance, and
refining the query in an iterative manner.
3. type of query builder
b. computer system expert
4. tools used to build query no special tools
a. word frequency list
b. knowledge base browser (knowledge base described in part I)
(1) which structure from part I
c. other lexical tools (identify)
5. which of the following were used?
b. Boolean connectors (AND, OR, NOT)
c. proximity operators
d. addition of terms not included in topic
(1) source of terms
Additional terms were supplied by the human based on outside
knowledge or from reading the text.