SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Appendix B: System Features
Appendix
National Institute of Standards and Technology
D. K. Harman
V. SYSTEM COMPARISON
NAME | Dortmund | Cornell | Berkeley | Rutgers | Siemens | UMASS | VPI
Question (left column truncated in source): ...ch ...mg" went ...lent?
Dortmund: Several years.
Cornell: Several years.
Berkeley: None except for the probabilistic logic. The Berkeley system is an experimental prototype only, programmed as a minimal modification of the SMART system. Modifications made by the SMART group at Cornell for last year's TREC were used in these runs.
Rutgers: For the data fusion part, approximately 60 hours; for the query combination parts, approximately 150 hours.
Siemens: "Our" system is essentially SMART; SMART has been well-engineered with a primary goal of flexibility, not raw speed.
UMASS: INQUERY is a research system. About 10 person years went into its development prior to these experiments.
VPI (right column truncated in source): Basic system w... version of SMA... many enhance... pnorm query p... added from pre... before TRE... TREC consist... of outside prog... merging combir... results from in... SMART retrie... adding support for query and inde... during a single ...
Question (left column truncated in source): ...)propriate ...S, could ...tem be ...) Run ...By How ...
Dortmund: If the feature vectors for the query terms were stored in a cache, query regression would take 20-30% less time.
Cornell: Of course.
Berkeley: Yes, see discussion in SMART's documentation: SMART is "not strongly optimized for any one particular use." The Berkeley system has roughly the same efficiency characteristics as SMART.
Rutgers: For the data fusion part, by a factor of 8; for the other parts, unknown.
Siemens or UMASS [column unclear]: Yes, at least a factor of 2.
VPI (right column truncated in source): Use of inverted ... enough disk sp... have significan... the retrieval ti... multiple retrie... added to SMA... restricted by th... SMART code ... have been ma... implemented a... retrieval syste... being fitted int... code.
Question (left column truncated in source): ...atures are ...that would ...our system?
Dortmund or Cornell [column unclear]: No feature recognition (e.g., company names, geographical locations, dates, amounts of money).
Berkeley: Might benefit from a conflator, thesaurus, disambiguator, and the use of many other clue types.
Rutgers: For the data fusion part, a lookup procedure to convert raw scores to ranks, based on the training set. This is necessary for true routing as opposed to batch scoring of "routing" queries.
Siemens or UMASS [column unclear]: Word finder. (An on-line concept association database).
VPI (right column truncated in source): Phrase identifi... matching; pro... identification.
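
The lookup procedure mentioned above, which converts raw scores to ranks based on the training set, might be sketched as follows. This is only an illustration under stated assumptions, not the procedure any participant actually used: the function names `build_rank_lookup` and `score_to_rank` and the sample scores are hypothetical, and the rank is assumed to be one plus the number of training-set scores strictly greater than the incoming score.

```python
from bisect import bisect_right


def build_rank_lookup(training_scores):
    """Return the training-set scores sorted ascending (hypothetical helper)."""
    return sorted(training_scores)


def score_to_rank(score, lookup):
    """Estimate the 1-based rank a raw score would have had in the training set.

    Assumption: rank = 1 + number of training scores strictly greater
    than `score`, so tied scores share the same rank.
    """
    # bisect_right counts the training scores that are <= score
    not_greater = bisect_right(lookup, score)
    strictly_greater = len(lookup) - not_greater
    return strictly_greater + 1


# Hypothetical training-set scores for one routing query
lookup = build_rank_lookup([0.9, 0.5, 0.7, 0.2])

# A single incoming document can be assigned a rank without
# waiting for the rest of the batch to be scored and sorted.
print(score_to_rank(0.8, lookup))
```

Because the table is built once from the training set, each arriving document can be given a rank on its own, which matches the distinction drawn above between true routing and batch scoring.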