NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman
g. total manual time to build (approximate number of hours) 16
h. use of manual labor
(4) other (describe) Search for WSJ terminology in library and from topics.
2. externally-built auxiliary file no
II. Query construction
(please fill out a section for each query construction method used)
A. Automatically built queries (ad hoc)
1. topic fields used <TITLE>, <DESC>, <NARR>, <CON>
2. total computer time to build query (cpu seconds) 5 (average for each query).
3. which of the following were used?
a. term weighting with weights based on terms in topics yes + others
b. expansion of queries using previously-constructed data structure (from part I) yes
(1) which structure? word-pair phrase file (sketched below)
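The automatic method above (topic-term weighting plus expansion from the word-pair phrase file) can be pictured with a short sketch. This is a minimal sketch in Python, assuming a whitespace tokenizer, an in-memory phrase file, and an arbitrary 0.5 weight for expansion terms; none of these details are taken from the actual system.

    # Hypothetical sketch: weight terms from the topic fields, then expand
    # the query with entries from a word-pair phrase file.
    from collections import Counter

    def build_query(topic_fields, phrase_file):
        """topic_fields: strings from <TITLE>, <DESC>, <NARR>, <CON>.
        phrase_file: dict mapping a topic term to related word-pair phrases."""
        weights = Counter()
        for field in topic_fields:
            for term in field.lower().split():
                weights[term] += 1            # weight based on terms in topics
        for term in list(weights):            # expansion via the phrase file
            for phrase in phrase_file.get(term, ()):
                weights[phrase] += 0.5        # assumed lower expansion weight
        return weights

    q = build_query(["pension fund regulation"], {"pension": ["pension fund"]})
    # Counter({'pension': 1, 'fund': 1, 'regulation': 1, 'pension fund': 0.5})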
B. Manually constructed queries (ad hoc)
1. topic fields used <TITLE>, <DESC>, <NARR>, <CON>
2. average time to build query (minutes) 300 minutes for 25 queries
3. type of query builder
b. computer system expert
4. tools used to build query
a. word frequency list sometimes
b. knowledge base browser (knowledge base described in part I) no
c. other lexical tools (identify) no
5. which of the following were used?
a. term weighting
b. Boolean connectors (AND, OR, NOT)
d. addition of terms not included in topic
(1) source of terms word-pair phrase file (illustrated below)
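A manually built query of the kind described in section B (Boolean connectors over weighted topic terms, plus terms added from the phrase file) is illustrated below. This is a minimal sketch, assuming a dict-based query tree and a set-of-terms document model; the representation is hypothetical, not the system's actual query language.

    # Hypothetical Boolean query tree: a leaf is a (term, weight) tuple,
    # an inner node is {connector: [children]}.
    def matches(node, doc_terms):
        if isinstance(node, tuple):                 # leaf: (term, weight)
            return node[0] in doc_terms
        (op, children), = node.items()
        hits = (matches(child, doc_terms) for child in children)
        if op == "AND":
            return all(hits)
        if op == "OR":
            return any(hits)
        if op == "NOT":
            return not any(hits)
        raise ValueError("unknown connector: " + op)

    # Topic terms plus a term added from the word-pair phrase file.
    query = {"AND": [{"OR": [("pension", 2.0), ("pension fund", 1.0)]},
                     {"NOT": [("lottery", 1.0)]}]}
    matches(query, {"pension", "reform"})           # True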
C. Feedback (ad hoc)
1. initial query built by method 1 or method 2? method 1
2. type of person doing feedback
b. system expert
3. average time to do complete feedback
a. cpu time (total cpu seconds for all iterations)
12 per query per iteration--no expansion
`I"" " --with expansion
b. clock time from initial construction of query to completion of final query (minutes)
60 per query to do relevance judgment
4. average number of iterations 1
a. average number of documents examined per iteration 10
5. minimum number of iterations 1
6. maximum number of iterations 1
7. what determines the end of an iteration? deadline + lack of manpower
8. feedback methods used
a. automatic term reweighting from relevant documents (sketched below)
b. automatic query expansion from relevant documents
(2) only top X terms added (what is X)
Top 20 most 'activated' terms that have document frequency < 2000 were used. Because many were already in the query, about 12 on the average were new and added per query.
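The two feedback methods answered above can be pictured with a short sketch: reweight query terms that occur in relevant documents, and add the top 20 activated terms whose document frequency is under 2000, skipping terms already in the query. This is a minimal sketch, assuming precomputed activation scores and document frequencies; the function names, the boost factor, and the score representation are assumptions, not the system's code.

    # Hypothetical reweighting: boost query terms seen in relevant documents.
    def reweight_from_feedback(weights, relevant_doc_terms, boost=1.5):
        return {t: (w * boost if t in relevant_doc_terms else w)
                for t, w in weights.items()}

    # Hypothetical expansion: top 20 activated terms with df < 2000, about
    # 12 of which are new to the query on average (per the answer above).
    def expand_from_feedback(query_terms, activation, doc_freq,
                             top_x=20, df_cutoff=2000):
        """query_terms: set of current query terms; activation and doc_freq:
        dicts mapping term -> score and term -> document frequency."""
        ranked = sorted(activation, key=activation.get, reverse=True)
        kept = [t for t in ranked if doc_freq.get(t, 0) < df_cutoff][:top_x]
        return query_terms | {t for t in kept if t not in query_terms}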