SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
CLARIT TREC Design, Experiments, and Results
chapter
D. Evans
R. Lefferts
G. Grefenstette
S. Handerson
W. Hersh
A. Archbold
National Institute of Standards and Technology
Donna K. Harman
Step:  Input:       Process:              Output:
1      Document(s)  NLP                   TermsD0 ("Parsed-Doc")
2      Topic(s)     NLP                   TermsT0
3      TermsT0      Hand Filter:          Weighted-TermsT0 ("Source-Query")
                    1. "Eliminate"
                    2. "Weight"-3/2/1
Figure 6: Schematic Representation of Data Preparation
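Step 3 of Figure 6 can be illustrated with a minimal Python sketch. The figure names only the two hand-filter operations ("Eliminate" and "Weight" on a 3/2/1 scale); the function name `apply_hand_filter`, the dictionary of reviewer decisions, and the placeholder terms are all hypothetical and are not taken from the CLARIT system.

```python
from typing import Dict, Iterable, Optional

VALID_WEIGHTS = {1, 2, 3}  # the 3/2/1 scale named in Figure 6


def apply_hand_filter(
    topic_terms: Iterable[str],
    decisions: Dict[str, Optional[int]],
) -> Dict[str, int]:
    """Turn NLP-extracted topic terms into weighted terms.

    `decisions` stands in for the human reviewer's choices (an assumption):
    a term mapped to 1, 2, or 3 is kept with that weight; a term mapped to
    None, or missing from `decisions`, is treated as eliminated.
    """
    weighted: Dict[str, int] = {}
    for term in topic_terms:
        weight = decisions.get(term)
        if weight is None:
            continue  # "Eliminate"
        if weight not in VALID_WEIGHTS:
            raise ValueError(f"weight for {term!r} must be 1, 2, or 3")
        weighted[term] = weight  # "Weight"
    return weighted


# Placeholder terms, for illustration only.
source_query = apply_hand_filter(
    ["term a", "term b", "term c"],
    {"term a": 3, "term b": None, "term c": 1},
)
```

Here `source_query` keeps "term a" and "term c" with their assigned weights and drops "term b".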
WSJ891102-0187
"mcdermott" na mcdermott
"international" adj international
"inc." ukw? inc.
"said" vt-past say vt-pastprt say
"its" gen its
"babcock" na babcock
"\&" *and* and
"wilcox" na wilcox
"unit" sn unit
"completed" vt-past complete vt-pastprt complete
"the" d the
"sale" sn sale sn sell
"of" prep of
"its" gen its
"bailey" sn bailey
"controls" vt-pressg3 control pn control vt-pressg3 control
"operations" pn operation
"of" prep of
"about" prep about
"$370" ukw? $370
"million" quant million
"\." *period* \.
Figure 7: Sample of Data-Document After Morphological Analysis
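The listing in Figure 7 can be read mechanically, as the following minimal Python sketch shows. It assumes each line is a quoted surface token followed by alternating tag/lemma pairs, which is inferred from the sample alone and is not a documented CLARIT format. The function name `parse_analysis_line` is hypothetical.

```python
import re
from typing import List, Tuple

# Assumed layout: "token" tag lemma [tag lemma ...]
LINE_RE = re.compile(r'^"(?P<token>[^"]*)"\s+(?P<rest>.+)$')


def parse_analysis_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split one analysis line into its token and (tag, lemma) readings."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an analysis line: {line!r}")
    fields = match.group("rest").split()
    if len(fields) % 2 != 0:
        raise ValueError(f"expected tag/lemma pairs: {line!r}")
    # Pair each tag (even positions) with the lemma that follows it.
    readings = list(zip(fields[0::2], fields[1::2]))
    return match.group("token"), readings


token, readings = parse_analysis_line('"sale" sn sale sn sell')
```

For the ambiguous token "sale", `readings` holds two entries, one per candidate lemma ("sale" and "sell"), so any later disambiguation step can choose between them.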