SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Proximity-Correlation for Document Ranking: The PARA Group's TREC Experiment
chapter
M. Zimmerman
National Institute of Standards and Technology
Donna K. Harman
# TRECscores.Q.21-30.gawk
# score documents delimited by <DOCNO> lines from TREC data
# for third 10 questions on TREC list
# ^z 920527-29, 0601, 0602
# usage: gawk -f TRECscores.Q.21-30.gawk
# typically will want to do something like:
#   zcat wsjIl9=/= .Z | tr a-z A-Z | gawk -f TRECscores.Q.21-30.gawk >Q.21-30.TRECscores.out
#
# (maybe prefixed by nohup)
# then typically will want to do something like:
#   sort -n +1 TRECscores.out | tail -1000 >Q21.best
# this program reads from stdin and outputs scores for each document to stdout
# in format:
#   [<DOCNO>] [score1] [score2] ... [score10]
BEGIN { m1 = 0;
        m2 = 0;
        m3 = 0;
        m4 = 0;
        m5 = 0;
        m6 = 0;
        m7 = 0;
        m8 = 0;
        m9 = 0;
        m10 = 0;
        s1a = 0;
        s1b = 0;
        s1c = 0;
        s2a = 0;
        s2b = 0;
        s3a = 0;
        s3b = 0;
        s3c = 0;
        s4a = 0;
        s4b = 0;
        s4c = 0;
        s5a = 0;
        s5b = 0;
        s5c = 0;
        s6a = 0;
        s6b = 0;
        s6c = 0;
        s7a = 0;
        s7b = 0;
        s8a = 0;
        s8b = 0;
        s8c = 0;
        s9a = 0;
        s9b = 0;
        s9c = 0;
        s9d = 0;
        s10a = 0;
        s10b = 0;
        docno = "null";
}
/<DOCNO>/ { printf ("%-20s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n", \
                    docno, m1, m2, m3, m4, m5, m6, m7, m8, m9, m10);
            docno = $2;
            m1 = 0;
            m2 = 0;
            m3 = 0;
            m4 = 0;
            m5 = 0;
            m6 = 0;
            m7 = 0;
            m8 = 0;
            m9 = 0;
            m10 = 0;
            s1a = 0;
            s1b = 0;
            s1c = 0;
            s2a = 0;
            s2b = 0;
            s3a = 0;
            s3b = 0;
            s3c = 0;
            s4a = 0;
            s4b = 0;
            s4c = 0;
            s5a = 0;
            s5b = 0;
            s5c = 0;
            s6a = 0;
            s6b = 0;
            s6c = 0;
            s7a = 0;
            s7b = 0;
            s8a = 0;
            s8b = 0;
            s8c = 0;
            s9a = 0;
            s9b = 0;
            s9c = 0;
            s9d = 0;
            s10a = 0;
            s10b = 0;
}
# topic 021 --- superconductivity breakthrough with commercial application
/SUPERCONDUCT/ { s1a += 5; }
/DISCOVER|BREAKTHR|FIRST|ADVANC/ { s1b += 5; }
/COMMERC|APPLI|PRACTIC/ { s1c += 5; }
{ s1 = s1a * s1b * s1c;
  if (s1 > m1) m1 = s1;
  s1a *= .9;
  s1b *= .9;
  s1c *= .9;
}
# topic 022 --- counternarcotics
/DRUG|NARCOTIC|COCAINE|HEROIN|OPIUM|MARIJUANA/ { s2a += 5; }
/Z[OCRerr][OCRerr]|SMUGGL|CARTEL|TRAFFIC/ { s2b += 5; }
{ s2 = s2a * s2b;
  if (s2 > m2) m2 = s2;
  s2a *= .9;
  s2b *= .9;
}
# topic 023 --- legal repercussions of agrochemical use
[OCRerr] { s3a += 5; }
[OCRerr] { s3b += 5; }
/w[OCRerr]g[OCRerr]sE|VICTIM|ACCIDENT|TRAGEDY|TRAGIC|DISASTER/ { s3c += 5; }