SP500207 NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Retrieval Experiments with a Large Collection using PIRCS
K. Kwok, L. Papadopoulos, K. Kwan
National Institute of Standards and Technology
Donna K. Harman

      5. tools used to build query: SOMETIMES
         a. word frequency list
         b. knowledge base browser (knowledge base described in part I)
            (1) which structure from part I
         c. other lexical tools (identify)
         d. machine analysis of training documents
            (1) describe
      5. which of the following were used?
         a. term weighting: YES
         b. Boolean connectors (AND, OR, NOT): YES
         c. proximity operators: NO
         d. addition of terms not included in topic: NO
            (1) source of terms
         e. other
            (1) (brief description): NO

III. Searching

   A. Total computer time to search (cpu seconds)
      1. retrieval time (total cpu seconds between when a query enters the system until a list of document numbers is obtained):
         18-30 PER QUERY, NO SOFT-BOOLEAN (COMBINE 2 METHODS).
         30-70 PER QUERY, WITH SOFT-BOOLEAN (COMBINE 3 METHODS).
      2. ranking time (total cpu seconds to sort document list): 4-12 PER QUERY.

   B. Which methods best describe your machine searching methods?
      1. vector space model: NO
      2. probabilistic model: YES
      3. cluster searching: NO
      4. ngram matching: NO
      5. Boolean matching: NO
      6. fuzzy logic (include your definition): YES, SOFT-BOOLEAN
      7. free text scanning: NO
      8. neural networks: YES
      9. conceptual graph matching: NO
      10. other (describe): NONE

   C. What factors are included in your ranking?
      1. term frequency: YES
      2. inverse document frequency: NO
      3. other term weights (where do they come from?): INVERSE COLLECTION TERM FREQUENCY TOTAL WORD OCCURRENCES.
      4. semantic closeness (as in semantic net distance): NO
      5. position in document: NO
      6. syntactic clues (state how): NO
      7. proximity of terms: NO
      8. information theoretic weights: NO
      9. document length: YES
      10. completeness (what % of the query terms are present): NO
      11. ngram frequency: NO
      12. word specificity (i.e., animal vs. dog vs. poodle): NO
      13. word sense frequency: NO
      14. cluster distance: NO
      15. other (specify): NONE
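
The ranking factors answered YES above (term frequency, inverse collection term frequency, document length) can be illustrated with a small sketch. This is not the PIRCS ranking formula, which the form does not give. It assumes that inverse collection term frequency is the log of total word occurrences divided by a term's collection-wide frequency, and that document length is used to normalize term frequency. The function names `ictf_weights` and `rank` are hypothetical.

```python
import math
from collections import Counter


def ictf_weights(docs):
    """Assumed ICTF: log(total word occurrences / term's collection frequency)."""
    collection_tf = Counter()
    for tokens in docs.values():
        collection_tf.update(tokens)
    total = sum(collection_tf.values())
    return {term: math.log(total / cf) for term, cf in collection_tf.items()}


def rank(query_terms, docs):
    """Score each document by length-normalized term frequency times ICTF."""
    weights = ictf_weights(docs)
    scores = {}
    for doc_id, tokens in docs.items():
        if not tokens:
            continue  # skip empty documents to avoid division by zero
        tf = Counter(tokens)
        scores[doc_id] = sum(
            (tf[term] / len(tokens)) * weights.get(term, 0.0)
            for term in query_terms
        )
    # Highest score first
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy collection (illustrative data only, not from the TREC-1 experiments)
docs = {
    "d1": ["oil", "price", "oil"],
    "d2": ["price", "index", "report", "index"],
}
ranking = rank(["oil"], docs)
```

Here `ranking` lists `d1` first, since only `d1` contains the query term. Real probabilistic systems combine these factors in more elaborate ways, so the sketch only shows how the three inputs might enter a score.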