SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
CLARIT TREC Design, Experiments, and Results
chapter
D. Evans
R. Lefferts
G. Grefenstette
S. Handerson
W. Hersh
A. Archbold
National Institute of Standards and Technology
Donna K. Harman
[Figure 1: `Standard' CLARIT Indexing Overview. Diagram labels: Formatting; Text Prep; NLP (Morph, Parsing); Candidate NPs; Term/Doc Statistics; NP Scoring; Matching; Thesaurus; Scoring; Filtering; General Set of Terms; "Exact"; "Novel"; "General"; Specific Set of Terms. Lexicon: `Core' Lex (100,000 items), Optional Sub-Domain Lex. `Heuristic' Grammar: `Simplex' NP, Optional "Complex" NP, Optional "Full Sentence" Constituents. "1st-Order" Thesaurus: Flat List of Terms, Implicit Compositional, Hierarchical Structure.]
[Figure 2: Modified CLARIT Indexing in TREC. Diagram labels: Formatting; Text Prep; NLP (Morph, Parsing); Candidate NPs; Term/Doc Statistics; Scoring; NPs & Words; General Set of Terms. Lexicon: `Core' Lex (100,000 items). `Heuristic' Grammar: `Simplex' NP.]