SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Retrieval Experiments with a Large Collection using PIRCS
chapter
K. Kwok
L. Papadopoulos
K. Kwan
National Institute of Standards and Technology
Donna K. Harman
b. term weighting
(1) with weights based on terms in topics YES
(2) with weights based on terms in all training documents YES
(3) with weights based on terms from documents with relevance judgments YES
c. phrase extraction NO
(1) from topics
(2) from all training documents
(3) from documents with relevance judgments
d. syntactic parsing NO
(1) of topics
(2) of all training documents
(3) of documents with relevance judgments
e. word sense disambiguation NO
(1) using topic
(2) using all training documents
(3) using documents with relevance judgments
f. proper noun identification algorithm NO
(1) from topics
(2) from all training documents
(3) from documents with relevance judgments
g. tokenizer (recognizes dates, phone numbers, common patterns) NO
(1) which patterns are tokenized?
(2) from topics
(3) from all training documents
(4) from documents with relevance judgments
h. heuristic associations to add terms NO
(1) from topics
(2) from all training documents
(3) from documents with relevance judgments
i. expansion of queries using previously-constructed data structure
(from part I) YES
(1) which structure? WORD-PAIR PHRASE FILE
j. automatic addition of Boolean connectors or proximity operators NO
(1) using information from the topics
(2) using information from all training documents
(3) using information from documents with relevance judgments
k. other (brief description) NO
E. Manually constructed queries (routing)
1. topic fields used <TITLE>, <DESC>, <NARR>, <CON>
2. average time to build query (minutes) 300 MINUTES FOR 25 QUERIES
3. type of query builder
a. domain expert NO
b. system expert YES
4. data used for building query
a. from training topic YES
b. from all training documents NO
c. from documents with relevance judgments NO
d. from other sources (what?) NO