NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Retrieval Experiments with a Large Collection using PIRCS
K. Kwok, L. Papadopoulos, K. Kwan
National Institute of Standards and Technology
Donna K. Harman

      if not, approximately how many hours of manual labor?
      0.5
   d. are term positions within documents stored?
      NO, BUT SENTENCE YES.
      YES, EXCEPT FOR I.A.14.
      NO
   f. single terms only?
2. clusters
   a. total amount of storage (megabytes)
   b. total computer time to build (approximate number of hours)
   c. brief description of clustering method
   d. is the process completely automatic?
      if not, approximately how many hours of manual labor?
3. n-grams, suffix arrays, signature files
      NO
   a. total amount of storage (megabytes)
   b. total computer time to build (approximate number of hours)
   c. brief description of methods used
   d. is the process completely automatic?
      if not, approximately how many hours of manual labor?
4. knowledge bases
   a. total amount of storage (megabytes)
   b. total number of concepts represented
   c. type of representation (frames, semantic nets, rules, etc.)
   d. total computer time to build (approximate number of hours)
   e. total manual time to build (approximate number of hours)
   f. use of manual labor
      (1) mostly manually built using special interface
      (2) mostly machine built with manual correction
      (3) initial core manually built to "bootstrap" for completely machine-built completion
      (4) other (describe)
   g. auxiliary files needed for machine use
      (1) machine-readable dictionary (which one?)
      (2) other (identify)
5. special routing structures (what?)
      SEE I.B.6 NETWORK NODE, EDGE FILES.
      ROUTING USING NETWORK NODE AND EDGE FILES IS STRAIGHTFORWARD.
      NO
   a. total amount of storage (megabytes)
      NODE FILE: 4x7.5
      EDGE FILE: 4x4
      NETWORK SEGMENTED INTO 4, BECAUSE OF INSUFFICIENT RAM.
   b. total computer time to build (approximate number of hours)
      40+5+1+4x0.2=46.8, STARTING FROM TEXT FILE.
   c. is the process completely automatic?
      YES, IF SUFFICIENT RAM AND DISK SPACE.
   d. brief description of methods used
      1. PROCESS (OLD) COLLECTION A.
      2. PROCESS QUERIES AGAINST COLLECTION A.
      3. PROCESS NEW COLLECTION B AS IF THEY WERE QUERIES TO MAKE USE OF COLLECTION A STATISTICS.
      4. COMBINE QUERIES, (OLD) DICTIONARY AND COLLECTION B INTO NETWORK FOR RETRIEVAL.
6. other data structures built from TREC text (what?)
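
The four routing steps listed under item 5.d can be illustrated with a small sketch. It is not the PIRCS implementation: it assumes whitespace tokenization and a smoothed IDF weight in place of the PIRCS network weights, and every function name and sample text below is hypothetical. The one point it is meant to show is that documents from the new collection B are weighted with statistics gathered from the old collection A only.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: plain lowercase whitespace tokens, no stemming or stopwords.
    return text.lower().split()

def collection_stats(docs):
    """Step 1: document frequencies and size of the old collection A."""
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return df, len(docs)

def idf(term, df, n_docs):
    # Smoothed IDF (an assumption) so terms unseen in A get a finite weight.
    return math.log((n_docs + 1) / (df.get(term, 0) + 0.5))

def term_weights(text, df, n_docs):
    """Steps 2 and 3: weight a query, or a new document treated like a
    query, using the statistics of collection A."""
    tf = Counter(tokenize(text))
    return {t: f * idf(t, df, n_docs) for t, f in tf.items()}

def route(queries, new_docs, df, n_docs):
    """Step 4: connect queries and new documents through shared terms and
    score each pair by the sum of weight products on those terms."""
    scores = {}
    for qid, q in queries.items():
        qw = term_weights(q, df, n_docs)
        for did, d in new_docs.items():
            dw = term_weights(d, df, n_docs)
            shared = qw.keys() & dw.keys()
            scores[(qid, did)] = sum(qw[t] * dw[t] for t in shared)
    return scores

# Hypothetical toy data standing in for collections A and B.
collection_a = ["oil prices rise", "oil exports fall", "court rules on merger"]
queries = {"q1": "oil prices"}
collection_b = {"b1": "crude oil prices climb", "b2": "merger talks resume"}

df, n = collection_stats(collection_a)
scores = route(queries, collection_b, df, n)
```

In this toy run, document b1 shares two terms with query q1 and receives a positive score, while b2 shares none and scores zero. Collection B never contributes to the document frequencies, which mirrors step 3 above.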