SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Incorporating Semantics Within a Connectionist Model and a Vector Processing Model
chapter
R. Boyd
J. Driscoll
National Institute of Standards and Technology
D. K. Harman
Document #1
Locomotives pull the trains.

Document #2
People meet people under the canopy and within trains.

Document #3
Trains carry freight from the station.

Document #4
Trains leave the station hourly until noon.

Query
When do trains depart the station?

Figure 6. Four Documents and a Query.

word           number of documents the word is in (df)
and            1
canopy         1
carry          1
do             0
depart         0
freight        1
from           1
hourly         1
leave          1
locomotives    1
meet           1
noon           1
people         1
pull           1
station        2
the            4
trains         4
under          1
until          1
when           0
within         1

Figure 7. List of Words in the Documents and Query.

word           idf of the word, log10(N/df)
and            .6
canopy         .6
carry          .6
depart         undefined
do             undefined
freight        .6
from           .6
hourly         .6
leave          .6
locomotives    .6
meet           .6
noon           .6
people         .6
pull           .6
station        .3
the            0
trains         0
under          .6
until          .6
when           undefined
within         .6

Figure 8. The idf of Each Word.

word      frequency   category     probability
depart    1           AMDR         1/4
                      TA[OCRerr]   1/8
do        1           AUSE         1/21
                      ATh{P        1/21
                      TCSE         1/21
                      TCNV         2/21
                      TRES         1/21
                      TSRC         1/21
station   1           APOS         3/16
                      AORD         1/8
                      TAMF         1/16
                      TCND         1/8
                      TDGR         1/16
                      TSPL         3/16
the       1
trains    1           AORD         7/24
                      AMDR         1/12
                      AMFR         1/12
                      TACM         1/24
                      TCNV         1/12
when      1           TAMT         1/3
                      TTIM         2/3

Figure 9. Words in the Query.
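The document frequencies of Figure 7 and the idf values of Figure 8 can be reproduced from the text of Figure 6. The following is a minimal sketch, not the authors' implementation. The helper names (tokenize, document_frequency, idf) and the tokenization rule (lowercase, strip trailing punctuation) are assumptions made for illustration. A word that occurs in no document has df = 0, so its idf is reported as undefined, as in Figure 8.

```python
import math

# The four documents and the query from Figure 6.
documents = {
    1: "Locomotives pull the trains.",
    2: "People meet people under the canopy and within trains.",
    3: "Trains carry freight from the station.",
    4: "Trains leave the station hourly until noon.",
}
query = "When do trains depart the station?"


def tokenize(text):
    """Assumed tokenization: split on whitespace, strip punctuation, lowercase."""
    return [word.strip(".,?!").lower() for word in text.split()]


def document_frequency(docs, vocabulary):
    """Count, for each word, the number of documents that contain it (df)."""
    doc_sets = [set(tokenize(text)) for text in docs.values()]
    return {word: sum(word in s for s in doc_sets) for word in vocabulary}


def idf(df, n_docs):
    """idf = log10(N / df); None stands for 'undefined' when df is 0."""
    return {
        word: (math.log10(n_docs / count) if count > 0 else None)
        for word, count in df.items()
    }


# Vocabulary covers both the documents and the query, as in Figure 7.
vocabulary = sorted(
    {w for text in documents.values() for w in tokenize(text)}
    | set(tokenize(query))
)
df = document_frequency(documents, vocabulary)
idf_values = idf(df, len(documents))

for word in vocabulary:
    value = idf_values[word]
    shown = "undefined" if value is None else format(value, ".1f")
    print(f"{word:12s} {df[word]}  {shown}")
```

With N = 4, a word found in one document gets log10(4/1), about .6, "station" (two documents) gets log10(4/2), about .3, and "the" and "trains" (all four documents) get log10(4/4) = 0.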