SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
appendix
National Institute of Standards and Technology
Donna K. Harman
C. Data built from sources other than the input text -- no
II. Query construction
(please fill out a section for each query construction method used)
A. Automatically built queries (ad hoc) yes
1. topic fields used <num> <title> <desc> and <narr>
2. total computer time to build query (cpu seconds) 3.0
3. which of the following were used?
b. phrase extraction from topics
c. syntactic parsing of topics
e. proper noun identification algorithm partial
g. heuristic associations to add terms
h. expansion of queries using previously-constructed data structure (from part I)
(1) which structure? term clusters
j. other (describe) syntactic phrases
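The form records only that ad hoc queries were expanded using term clusters; it does not say how. As a rough illustration, the sketch below adds every cluster-mate of each topic term to the query. The function name `expand_query`, the cluster contents, and the "add all cluster members" policy are assumptions made for the example, not details reported by this system.

```python
from typing import List, Set


def expand_query(terms: List[str], clusters: List[Set[str]]) -> List[str]:
    """Return the query terms followed by any cluster-mates not already present.

    Hypothetical policy: every member of a cluster that contains a query
    term is added. A real system might weight or filter the added terms.
    """
    expanded = list(dict.fromkeys(terms))  # de-duplicate, keep original order
    seen = set(expanded)
    for term in terms:
        for cluster in clusters:
            if term in cluster:
                # Sort so the output order is deterministic.
                for mate in sorted(cluster):
                    if mate not in seen:
                        seen.add(mate)
                        expanded.append(mate)
    return expanded


# Illustrative clusters only; not taken from the system described here.
clusters = [{"aircraft", "airplane", "jet"}, {"bank", "lender"}]
expanded = expand_query(["airplane", "crash"], clusters)
```

Terms that belong to no cluster (here "crash") pass through unchanged, so expansion can only lengthen a query, never drop a topic term.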
B. Automatically built queries (routing)
1. topic fields used same as in ad hoc
2. total computer time to build query (cpu seconds) 3.2
3. which of the following were used in building the query?
a. terms selected from
(2) all training documents
b. term weighting
(2) with weights based on terms in
c. phrase extraction
(1) from topics
(2) from all training documents
d. syntactic parsing
(1) of topics
(2) of all training documents
f. proper noun identification algorithm
(1) from topics partial
(2) from all training documents partial
h. heuristic associations to add terms
(2) from all training documents
i. expansion of queries using previously-constructed data structure (from part I)
(1) which structure? clusters from training data
all training documents
III. Searching
A. Total computer time to search (cpu seconds)
1. retrieval time (total cpu seconds between when a query enters the system until a list of
document numbers are obtained)
TOTAL TIME (CPU + I/O) search and ranking is about 6([OCRerr] minutes per query
2. ranking time (total cpu seconds to sort document list)
B. Which methods best describe your machine searching methods?
1. vector space model
C. What factors are included in your ranking?
1. term frequency
2. inverse document frequency
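The answers above name a vector space model ranked by term frequency and inverse document frequency, without giving the exact weighting. The following is a minimal sketch of one common variant, assuming raw term counts, idf = log(N / df), and cosine similarity. Those three choices, the function names, and the toy documents are all assumptions for illustration, not the formula this system reported.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def tfidf_vectors(docs: List[List[str]]) -> Tuple[List[Vector], Dict[str, float]]:
    """Build tf*idf vectors with raw tf and idf = log(N / df) (assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n / count) for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either has zero length."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query: List[str], docs: List[List[str]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted by descending similarity."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms unseen in the collection get weight 0.
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}
    scores = [(i, cosine(q, v)) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy collection, invented for the example.
docs = [
    "oil prices rise".split(),
    "oil spill cleanup".split(),
    "stock prices fall".split(),
]
ranking = rank("oil spill".split(), docs)
```

In this toy collection the document sharing both query terms ranks first, the one sharing only the common term "oil" ranks second, and the unrelated document scores zero, which is the behavior the idf factor is meant to produce: rarer matching terms count for more.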