NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
National Institute of Standards and Technology
Donna K. Harman

      a. term weighting: yes
      b. Boolean connectors (AND, OR, NOT): Available. Not used.
      c. proximity operators: Automatic
      d. addition of terms not included in topic: yes, based on user judgment
         (1) source of terms
      e. other (describe)
   C. Feedback (ad hoc): Available. Not submitted in TREC.
   D. Automatically built queries (routing): Available. Not submitted in TREC.
   E. Manually constructed queries (routing): Available. Not submitted in TREC.

III. Searching
   A. Total computer time to search (CPU seconds): 2-10 seconds, depending on query
      1. retrieval time (total CPU seconds between when a query enters the system until a list of document numbers is obtained): see above
      2. ranking time (total CPU seconds to sort document list): Included in number above
   B. Which methods best describe your machine searching methods?
      1. vector space model: Some techniques used
      2. probabilistic model: Some probability used in ranking
      5. Boolean matching: Available. Not used in TREC.
      6. fuzzy logic (include your definition): Fuzzy semantic net links used in term explosion.
      8. neural networks: No. See 6.
      9. conceptual graph matching: Yes, query concept created by explosion
   C. What factors are included in your ranking?
      1. term frequency
      2. inverse document frequency: Available. Not used in TREC.
      3. other term weights (where do they come from?): Manual
      4. semantic closeness (as in semantic net distance): yes
      5. position in document: Available. Not used in TREC.
      6. syntactic clues (state how): Available. Not used in TREC.
      7. proximity of terms
      9. document length
      10. completeness (what % of the query terms are present)
      15. other (specify): User chooses, programmable

IV.
   What machine did you conduct the TREC experiment on? Sun SPARC II
   How much RAM did it have? 64 Mbytes
   What was the clock rate of the CPU? 50 MHz

V. Some systems are research prototypes and others are commercial. To help compare these systems:
   1. How much "software engineering" went into the development of your system?
      The underlying "engine" used for TREC is also used in a commercial product (ConQuest); hence, lots of S/W engineering is behind it.
   2. Given appropriate resources, could your system be made to run faster? By how much (estimate)?
      Yes, at least a factor of 2.
   3. What features is your system missing that it would benefit by if it had them?
      Subject domain add-on dictionary.