NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Proximity-Correlation for Document Ranking: The PARA Group's TREC Experiment
M. Zimmerman
National Institute of Standards and Technology
Donna K. Harman

# TRECscore.Q.11-20.gawk
#
# score documents delimited by <DOCNO> lines from TREC data
# for second 10 questions on TREC list
# ^z 920527-29, 0601
#
# usage: gawk -f TRECscore.Q.11-20.gawk
# typically will want to do something like:
#   zcat [path illegible]/*.Z | tr a-z A-Z | gawk -f TRECscore.Q.11-20.gawk >Q.11-20.TRECscores.o
#
# reads from stdin and outputs scores for each document to stdout in format:
#   [<DOCNO>] [score1] [score2] ... [score10]

BEGIN {
    m1 = 0; m2 = 0; m3 = 0; m4 = 0; m5 = 0;
    m6 = 0; m7 = 0; m8 = 0; m9 = 0; m10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0; s4c = 0; s4d = 0;
    s5a = 0; s5b = 0; s5c = 0;
    s6a = 0; s6b = 0; s6c = 0;
    s7a = 0; s7b = 0; s7c = 0;
    s8a = 0; s8b = 0; s8c = 0;
    s9a = 0; s9b = 0; s9c = 0;
    s10a = 0; s10b = 0; s10c = 0;
    docno = "<null>";
}

# topic 011 --- space program
/SPACE/ { s1a += 5; }
/PROGRAM|PROJECT/ { s1b += 5; }
/GOAL|PLAN/ { s1c += 5; }
# s1 = product of the term scores (operators illegible in the scan; * and *= are inferred)
{ s1 = s1a * s1b * s1c;
  if (s1 > m1) m1 = s1;
  s1a *= .9;
  s1b *= .9;
  s1c *= .9; }

# topic 012 --- water pollution
/WATER/ { s2a += 5; }
/POLLUTION/ { s2b += 5; }
{ s2 = s2a * s2b;
  if (s2 > m2) m2 = s2;
  s2a *= .9;
  s2b *= .9; }

# topic 013 --- Mitsubishi Heavy Industries Ltd.
/MITSUBISHI/ { s3a += 5; }
/HEAVY/ { s3b += 5; }
/INDUSTR/ { s3c += 5; }

/<DOCNO>/ {
    printf("%-20s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n",
        docno, m1, m2, m3, m4, m5, m6, m7, m8, m9, m10);
    docno = $2;
    m1 = 0;
    m2 = 0;
    m3 = 0;
    m4 = 0;
    m5 = 0;