SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Proximity-Correlation for Document Ranking: The PARA Group's TREC Experiment
M. Zimmerman
National Institute of Standards and Technology
Donna K. Harman

TRECscore.Q.01-10.gawk

# score documents delimited by <DOCNO> lines from TREC data
# for first 10 questions on TREC list
# z 920527-29
# usage: gawk -f TRECscore.Q.1-10.gawk
# typically will want to do something like:
#   zcat wsjll9*/*.Z | tr a-z A-Z | gawk -f TRECscore.Q.1-10.gawk > [illegible].1-10.TRECscores.out
# reads from stdin and outputs scores for each document to stdout
# in format: [<DOCNO>] [score1] [score2] ... [score10]

BEGIN {
    S1 = 0; S2 = 0; S3 = 0; S4 = 0; S5 = 0;
    S6 = 0; S7 = 0; S8 = 0; S9 = 0; S10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0; s2c = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0; s4c = 0; s4d = 0;
    s5a = 0; s5b = 0; s5c = 0;
    s6a = 0; s6b = 0; s6c = 0; s6d = 0;
    s7a = 0; s7b = 0; s7c = 0; s7d = 0;
    s8a = 0; s8b = 0; s8c = 0; s8d = 0;
    s9a = 0; s9b = 0; s9c = 0;
    s10a = 0; s10b = 0;
    docno = "<null>";
}

/<DOCNO>/ {
    printf("%-20s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n",
        docno, S1, S2, S3, S4, S5, S6, S7, S8, S9, S10);
    docno = $2;
    S1 = 0; S2 = 0; S3 = 0; S4 = 0; S5 = 0;
    S6 = 0; S7 = 0; S8 = 0; S9 = 0; S10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0; s2c = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0; s4c = 0; s4d = 0;
    s5a = 0; s5b = 0; s5c = 0;
    s6a = 0; s6b = 0; s6c = 0; s6d = 0;
    s7a = 0; s7b = 0; s7c = 0; s7d = 0;
    s8a = 0; s8b = 0; s8c = 0; s8d = 0;
    s9a = 0; s9b = 0; s9c = 0;
    s10a = 0; s10b = 0;
}

# topic 001 --- antitrust cases pending
/ANTITRUST/ { s1a += 5; }
/CASE/ { s1b += 5; }
/[pattern illegible]/ { s1c += 5; }
{
    s1 = s1a * s1b * s1c;
    if (s1 > S1) S1 = s1;
    s1a *= .9; s1b *= .9; s1c *= .9;
}

# topic 002 --- acquisition/merger/etc. involving US & foreign companies
/ACQUISITION|BUYOUT|MERGER|TAKEOVER/ { s2a += 5; }
/US|U\.S\.|AMERICAN/ { s2b += 5; }
/FOREIGN/ { s2c += 5; }
{
    s2 = s2a * s2b * s2c;
    if (s2 > S2) S2 = s2;
    s2a *= .9; s2b *= .9;