SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
appendix
National Institute of Standards and Technology
Donna K. Harman
a. term weighting yes
b. Boolean Connectors (AND, OR, NOT) Available. Not used.
c. proximity operators Automatic
d. addition of terms not included in topic yes--based on user judgment
(1) source of terms
e. other (describe)
C. Feedback (ad hoc) Available. Not submitted in TREC.
D. Automatically built queries (routing) Available. Not submitted in TREC.
E. Manually constructed queries (routing) Available. Not submitted in TREC.
III. Searching
A. Total computer time to search (cpu seconds) 2-10 seconds, dep. on query
1. retrieval time (total CPU seconds between when a query enters the system until a list of
document numbers are obtained) see above
2. ranking time (total CPU seconds to sort document list) Included in number above
B. Which methods best describe your machine searching methods?
1. vector space model Some techniques used
2. probabilistic model Some probability used in ranking
5. Boolean matching Available. Not used in TREC.
6. fuzzy logic (include your definition) Fuzzy semantic net links used in term explosion.
8. neural networks No--See 6
9. conceptual graph matching Yes--query concept created by explosion
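The answers to items 6 and 9 mention term explosion over fuzzy semantic net links. The form gives no algorithmic detail, so the following is only a minimal sketch of one way such an expansion could work, not a description of the respondent's system. The toy network `SEMANTIC_NET`, the function `explode_terms`, and the parameters `max_depth` and `min_weight` are all hypothetical names introduced for illustration; the rule that a term's weight is the strongest product of link strengths along a path is likewise an assumption.

```python
from collections import deque

# Hypothetical toy semantic net: term -> {related term: link strength in (0, 1]}.
SEMANTIC_NET = {
    "car": {"automobile": 0.9, "vehicle": 0.7},
    "automobile": {"sedan": 0.8},
    "vehicle": {"truck": 0.6},
}

def explode_terms(query_terms, net, max_depth=2, min_weight=0.5):
    """Expand query terms along weighted links (assumed scheme).

    Each expanded term gets the strongest product of link strengths found
    on any path of at most max_depth links from an original query term.
    Terms whose weight falls below min_weight are dropped.
    """
    weights = {term: 1.0 for term in query_terms}
    queue = deque((term, 1.0, 0) for term in query_terms)
    while queue:
        term, weight, depth = queue.popleft()
        if depth == max_depth:
            continue
        for neighbor, strength in net.get(term, {}).items():
            candidate = weight * strength
            # Keep only the strongest weight seen so far for each term.
            if candidate >= min_weight and candidate > weights.get(neighbor, 0.0):
                weights[neighbor] = candidate
                queue.append((neighbor, candidate, depth + 1))
    return weights

if __name__ == "__main__":
    for term, weight in sorted(explode_terms(["car"], SEMANTIC_NET).items()):
        print(f"{term}: {weight:.2f}")
```

With the toy network above, a query on "car" would pick up "automobile", "vehicle", and "sedan", while "truck" falls below the assumed cutoff.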
C. What factors are included in your ranking?
1. term frequency
2. inverse document frequency Available. Not used in TREC.
3. other term weights (where do they come from?) Manual
4. semantic closeness (as in semantic net distance) yes
5. position in document Available. Not used in TREC.
6. syntactic clues (state how) Available. Not used in TREC.
7. proximity of terms
9. document length
10. completeness (what % of the query terms are present)
15. other (specify) User chooses--programmable
IV. What machine did you conduct the TREC experiment on? Sun SPARC II
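Several of the ranking factors listed above (term frequency, document length, completeness) have simple arithmetic definitions. The sketch below shows one possible way to compute and combine them; it is an illustration only. The form does not say how, or whether, the respondent's system combined these factors. The function names `completeness`, `score`, and `rank`, the weights `tf_weight` and `completeness_weight`, and the additive combination are all assumptions.

```python
def completeness(query_terms, doc_tokens):
    """Fraction of distinct query terms that occur in the document."""
    distinct = set(query_terms)
    if not distinct:
        return 0.0
    return len(distinct & set(doc_tokens)) / len(distinct)

def score(query_terms, doc_tokens, tf_weight=1.0, completeness_weight=1.0):
    """Assumed additive score: length-normalized term frequency plus completeness."""
    if not doc_tokens:
        return 0.0
    # Raw term frequency: total occurrences of the distinct query terms.
    tf = sum(doc_tokens.count(term) for term in set(query_terms))
    # Dividing by document length keeps long documents from dominating.
    normalized_tf = tf / len(doc_tokens)
    return (tf_weight * normalized_tf
            + completeness_weight * completeness(query_terms, doc_tokens))

def rank(query_terms, docs):
    """Return document ids sorted by descending score."""
    return sorted(docs, key=lambda doc_id: score(query_terms, docs[doc_id]),
                  reverse=True)

if __name__ == "__main__":
    docs = {
        "d1": ["fuzzy", "logic", "net"],
        "d2": ["fuzzy", "fuzzy", "query", "net"],
        "d3": ["boolean", "match"],
    }
    print(rank(["fuzzy", "net"], docs))
```

Making the weights parameters loosely mirrors the "User chooses--programmable" answer, in that the balance between factors is left to the caller.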
How much RAM did it have? 64 Mbytes
What was the clock rate of the CPU? 50 MHz
V. Some systems are research prototypes and others are commercial.
To help compare these systems:
1. How much "software engineering" went into the development of your system?
The underlying "engine" used for TREC is also used in a commercial product
(ConQuest)--hence, lots of S/W engineering is behind it.
2. Given appropriate resources, could your system be made to run faster? By how much
(estimate)? Yes--at least a factor of 2
3. What features is your system missing that it would benefit by if it had them?
Subject domain add-on dictionary.