NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Proximity-Correlation for Document Ranking: The PARA Group's TREC Experiment
M. Zimmerman
National Institute of Standards and Technology
Donna K. Harman

# TRECscore.Q.21-30.gawk
#
# score documents delimited by <DOCNO> lines from TREC data
# for third 10 questions on TREC list
# ^z 920527-29, 0601, 0602
#
# usage: gawk -f TRECscore.Q.21-30.gawk
#
# typically will want to do something like:
#   zcat wsj/19*/*.Z | tr a-z A-Z | gawk -f TRECscore.Q.21-30.gawk >Q.21-30.TRECscores.out
# (maybe prefixed by nohup)
#
# then typically will want to do something like:
#   sort -n +1 TRECscores.out | tail -1000 >Q21.best
#
# this program reads from stdin and outputs scores for each document
# to stdout in format:
#   [<DOCNO>] [score1] [score2] ... [score10]

BEGIN {
    m1 = 0; m2 = 0; m3 = 0; m4 = 0; m5 = 0;
    m6 = 0; m7 = 0; m8 = 0; m9 = 0; m10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0; s4c = 0;
    s5a = 0; s5b = 0; s5c = 0;
    s6a = 0; s6b = 0; s6c = 0;
    s7a = 0; s7b = 0;
    s8a = 0; s8b = 0; s8c = 0;
    s9a = 0; s9b = 0; s9c = 0; s9d = 0;
    s10a = 0; s10b = 0;
    docno = "[OCRerr]null[OCRerr]";
}

/<DOCNO>/ {
    printf ("%-20s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n", \
        docno, m1, m2, m3, m4, m5, m6, m7, m8, m9, m10);
    docno = $2;
    m1 = 0; m2 = 0; m3 = 0; m4 = 0; m5 = 0;
    m6 = 0; m7 = 0; m8 = 0; m9 = 0; m10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0; s4c = 0;
    s5a = 0; s5b = 0; s5c = 0;
    s6a = 0; s6b = 0; s6c = 0;
    s7a = 0; s7b = 0;
    s8a = 0; s8b = 0; s8c = 0;
    s9a = 0; s9b = 0; s9c = 0; s9d = 0;
    s10a = 0; s10b = 0;
}

# topic 021 --- superconductivity breakthrough with commercial application
/SUPERCONDUCT/ { s1a += 5; }
/DISCOVER|BREAKTHR|FIRST|ADVANC/ { s1b += 5; }
/COMMERC|APPLI|PRACTIC/ { s1c += 5; }
{ s1 = s1a * s1b * s1c; if (s1 > m1) m1 = s1;
  s1a *= .9; s1b *= .9; s1c *= .9; }

# topic 022 --- counternarcotics
/DRUG|NARCOTIC|COCAINE|HEROIN|OPIUM|MARIJUANA/ { s2a += 5; }
/I[OCRerr][OCRerr]|SMUGGL|CARTEL|TRAFFIC/ { s2b += 5; }
{ s2 = s2a * s2b; if (s2 > m2) m2 = s2;
  s2a *= .9; s2b *= .9; }

# topic 023 --- legal repercussions of agrochemical use
/[OCRerr]/ { s3a += 5; }
/[OCRerr]/ { s3b += 5; }
/[OCRerr]w[OCRerr]g[OCRerr]sE|VICTIM|ACCIDENT|TRAGEDY|TRAGIC|DISASTER/ { s3c += 5; }