SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Proximity-Correlation for Document Ranking: The PARA Group's TREC Experiment
chapter
M. Zimmerman
National Institute of Standards and Technology
Donna K. Harman
# score documents delimited by <DOCNO> lines from TREC data
# for first 10 questions on TREC list
# z 920527-29
# usage: gawk -f TRECscore.Q.1-10.gawk
# typically will want to do something like:
#   zcat wsjll9=/=.Z | tr a-z A-Z | gawk -f TRECscore.Q.1-10.gawk > [OCRerr].1-10.TRECscores.out
# reads from stdin and outputs scores for each document to stdout
# in format:
#   [<DOCNO>] [score1] [score2] ... [score10]
BEGIN {
  S1 = 0;
  S2 = 0;
  S3 = 0;
  S4 = 0;
  S5 = 0;
  S6 = 0;
  S7 = 0;
  S8 = 0;
  S9 = 0;
  S10 = 0;
  s1a = 0;
  s1b = 0;
  s1c = 0;
  s2a = 0;
  s2b = 0;
  s2c = 0;
  s3a = 0;
  s3b = 0;
  s3c = 0;
  s4a = 0;
  s4b = 0;
  s4c = 0;
  s4d = 0;
  s5a = 0;
  s5b = 0;
  s5c = 0;
  s6a = 0;
  s6b = 0;
  s6c = 0;
  s6d = 0;
  s7a = 0;
  s7b = 0;
  s7c = 0;
  s7d = 0;
  s8a = 0;
  s8b = 0;
  s8c = 0;
  s8d = 0;
  s9a = 0;
  s9b = 0;
  s9c = 0;
  s10a = 0;
  s10b = 0;
  docno = "<null>";
}
/<DOCNO>/ { printf("%-20s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n",
    docno, S1, S2, S3, S4, S5, S6, S7, S8, S9, S10);
  docno = $2;
  S1 = 0;
  S2 = 0;
  S3 = 0;
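  S4 = 0;
  S5 = 0;
  S6 = 0;
  S7 = 0;
  S8 = 0;
  S9 = 0;
  S10 = 0;
  s1a = 0;
  s1b = 0;
  s1c = 0;
  s2a = 0;
  s2b = 0;
  s2c = 0;
  s3a = 0;
  s3b = 0;
  s3c = 0;
  s4a = 0;
  s4b = 0;
  s4c = 0;
  s4d = 0;
  s5a = 0;
  s5b = 0;
  s5c = 0;
  s6a = 0;
  s6b = 0;
  s6c = 0;
  s6d = 0;
  s7a = 0;
  s7b = 0;
  s7c = 0;
  s7d = 0;
  s8a = 0;
  s8b = 0;
  s8c = 0;
  s8d = 0;
  s9a = 0;
  s9b = 0;
  s9c = 0;
  s10a = 0;
  s10b = 0;
}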
# topic 001 --- antitrust cases pending
/ANTITRUST/ { s1a += 5; }
/CASE/ { s1b += 5; }
[OCRerr] { s1c += 5; }
{ s1 = s1a * s1b * s1c;
  if (s1 > S1) S1 = s1;
  s1a *= .9;
  s1b *= .9;
  s1c *= .9;
}
# topic 002 --- acquisition/merger/etc. involving US & foreign companies
/ACQUISITION|BUYOUT|MERGER|TAKEOVER/ { s2a += 5; }
/US|U\.S\.|AMERICAN/ { s2b += 5; }
/FOREIGN/ { s2c += 5; }
{ s2 = s2a * s2b * s2c;
  if (s2 > S2) S2 = s2;
  s2a *= .9;
  s2b *= .9;