SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Retrieval Experiments with a Large Collection using PIRCS
chapter
K. Kwok
L. Papadopoulos
K. Kwan
National Institute of Standards and Technology
Donna K. Harman
4. tools used to build query SOMETIMES
a. word frequency list
b. knowledge base browser (knowledge base described in part I)
(1) which structure from part I
c. other lexical tools (identify)
d. machine analysis of training documents
(1) describe
5. which of the following were used?
a. term weighting YES
b. Boolean connectors (AND, OR, NOT) YES
c. proximity operators NO
d. addition of terms not included in topic NO
(1) source of terms
e. other (brief description) NO
III. Searching
A. Total computer time to search (cpu seconds)
1. retrieval time (total cpu seconds from when a query enters
the system until a list of document numbers is obtained)
18-30 PER QUERY, NO SOFT-BOOLEAN (COMBINE 2 METHODS).
30-70 PER QUERY, WITH SOFT-BOOLEAN (COMBINE 3 METHODS).
2. ranking time (total cpu seconds to sort document list)
4 - 12 PER QUERY.
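The answers above mention combining two or three methods per query, followed by a separate sort of the document list. The form does not say how the methods are combined, so the following is only a minimal sketch, assuming a weighted sum of per-method scores. The function name `combine_scores`, the weights, and the document ids are hypothetical and do not come from the questionnaire.

```python
from collections import defaultdict


def combine_scores(runs, weights):
    """Linearly combine per-document scores from several retrieval methods.

    runs    -- list of dicts mapping document id -> score (one dict per method)
    weights -- list of floats, one weight per method (hypothetical values)
    """
    if len(runs) != len(weights):
        raise ValueError("need exactly one weight per run")
    combined = defaultdict(float)
    for run, weight in zip(runs, weights):
        for doc_id, score in run.items():
            # A document missing from a run simply contributes nothing for it.
            combined[doc_id] += weight * score
    # Ranking step: sort by descending score, ties broken by document id.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    method_a = {"d1": 0.2, "d2": 0.8}
    method_b = {"d1": 0.9, "d3": 0.5}
    for doc_id, score in combine_scores([method_a, method_b], [0.5, 0.5]):
        print(doc_id, round(score, 3))
```

Keeping the sort as the final step mirrors the split the form makes between retrieval time and ranking time.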
B. Which methods best describe your machine searching methods
1. vector space model NO
2. probabilistic model YES
3. cluster searching NO
4. ngram matching NO
5. Boolean matching NO
6. fuzzy logic (include your definition) YES, SOFT-BOOLEAN
7. free text scanning NO
8. neural networks YES
9. conceptual graph matching NO
10. other (describe) NONE
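The fuzzy logic entry above is answered with "soft-Boolean" but gives no definition. As an illustration only, the sketch below evaluates a nested Boolean query with the textbook fuzzy-set operators (minimum for AND, maximum for OR, complement for NOT). This is an assumed stand-in, not the system's own soft-Boolean definition, and the query structure and term degrees are hypothetical.

```python
def evaluate(node, degrees):
    """Return a degree of match in [0, 1] for a nested Boolean query.

    node    -- a term string, or a tuple ("AND" | "OR" | "NOT", child, ...)
    degrees -- dict mapping term -> degree in [0, 1] for one document
    """
    if isinstance(node, str):
        # Terms absent from the document match with degree 0.
        return degrees.get(node, 0.0)
    op, *children = node
    values = [evaluate(child, degrees) for child in children]
    if not values:
        raise ValueError("operator needs at least one operand")
    if op == "AND":
        return min(values)
    if op == "OR":
        return max(values)
    if op == "NOT":
        if len(values) != 1:
            raise ValueError("NOT takes exactly one operand")
        return 1.0 - values[0]
    raise ValueError(f"unknown operator: {op}")


if __name__ == "__main__":
    query = ("AND", "retrieval", ("OR", "probabilistic", "neural"))
    doc = {"retrieval": 0.8, "probabilistic": 0.3, "neural": 0.6}
    print(evaluate(query, doc))
```

Unlike strict Boolean matching, every document receives a graded score, so the result can still be ranked.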
C. What factors are included in your ranking?
1. term frequency YES
2. inverse document frequency NO
3. other term weights (where do they come from?)
INVERSE COLLECTION TERM FREQUENCY
TOTAL WORD OCCURRENCES.
4. semantic closeness (as in semantic net distance) NO
5. position in document NO
6. syntactic clues (state how) NO
7. proximity of terms NO
8. information theoretic weights NO
9. document length YES
10. completeness (what % of the query terms are present) NO
11. ngram frequency NO
12. word specificity (i.e., animal vs. dog vs. poodle) NO
13. word sense frequency NO
14. cluster distance NO
15. other (specify) NONE
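The ranking factors answered YES above are term frequency, document length, and an inverse collection term frequency based on total word occurrences (rather than inverse document frequency). The form gives no formula, so the sketch below is only one plausible reading: it assumes a log-odds form for the collection weight and a simple length-normalized term frequency. Both function names and all numbers are hypothetical.

```python
import math


def ictf(collection_freq, total_words):
    """Inverse collection term frequency in an assumed log-odds form.

    collection_freq -- occurrences of the term in the whole collection
    total_words     -- total word occurrences in the whole collection
    """
    if not 0 < collection_freq < total_words:
        raise ValueError("need 0 < collection_freq < total_words")
    p = collection_freq / total_words
    return math.log((1.0 - p) / p)


def term_weight(tf, doc_length, collection_freq, total_words):
    """Length-normalized term frequency scaled by ICTF (illustrative only)."""
    if doc_length <= 0:
        raise ValueError("doc_length must be positive")
    return (tf / doc_length) * ictf(collection_freq, total_words)


if __name__ == "__main__":
    # A term occurring 10 times among 1000 words outweighs one occurring 100 times.
    print(term_weight(2, 100, 10, 1000))
    print(term_weight(2, 100, 100, 1000))
```

The collection weight here counts word occurrences, whereas inverse document frequency would count the documents that contain the term.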