Version 8.1, Added infAP, minor bug fixes 7/24/06 Improved infAP comments (implementation verified by Yilmaz). trec_eval_help.c: allow longer measure explanations. 6/27/06 get_opt.c Fixed error message 6/22/06 Added measure infAP (Aslam et al) to allow judging only sample of pools. -1 in qrels file interpreted as pool doc not judged. 6/22/06 trvec_teval.c: fixed bugs in calculation of bpref if multiple relevance levels were used and a non-default relevance level was given. (Eg. A doc with rel level of 2 was counted as unjudged rather than judged nonrel if a relevance level of 3 was needed to consider relevant.) 4/5/06 Changed comments in README, trec_eval.c, trec_eval_help.c files which incorrectly claimed queries with no relevant docs are ignored (this was true with very old versions of trec_eval). Now reads that queries with no relevance information are ignored. Giorgio Di Nunzio and Nicola Ferro, ------------------------------------------------------------------------------ Version 8.0, full bpref bug fix, see file bpref_bug. I decided to up the version number since bpref results are incompatible with previous results (though the changes are small). 11/8/05: Bpref_bug: New file explaining bug and impact (conclusions after rerunning all of SIGIR 2004 bpref paper experiments). 11/5/05: Added new measures: micro_prec, micro_recall, micro_bpref. I thought I had an application for micro_bpref averaging (summing components of measure over all docs (ignoring topics) and then computing measure), but micro_bpref still proved a rotten measure. Left code in case someone ever actually finds an application for valid micro averaging. 11/5/05: Added new measures: old_bpref, old_bpref_top10pRnonrel. These are the old buggy measures included only for backward comparisons. 11/5/05: trvec_teval.c: Broke apart old trvec_trec_eval to calculate different types of measures separately. Very hard to decipher old code (though still difficult with new code) since parts of the calculations for a measure were so far apart. ------------------------------------------------------------------------------ Version 7.4, minor changes from 7.3 11/4/05: trvec_teval.c: fixed bpref bug if very low (< R) numbers of non-rel judgements available (divided by num_nonrel_ret instead of num_nonrel). (pointed out by Ian Soboroff). 11/3/05: trvec_teval.c: bpref_10, bpref_5 had zero division problems if no rel docs were retrieved. (pointed out by Ian Soboroff). 10/23/05: form_trvec.c: Added check for duplicate docno's in results and qrels. (pointed out by Shlomo Geva. Default behavior used to be that duplicate result docno's were always non-rel, but that changed in later versions, so had better test explicitly for it and complain). 10/23/05: README: sample invocation of trec_eval had arguments reversed. (pointed out by Carol Peters). 10/23/05: moved gm_ap to be a major measure (always printed). changed measures.c, test/out*, README ------------------------------------------------------------------------------ Version 7.3, a reasonably major rewrite from earlier versions in terms of internal structure and default output format (now relational), but the input format and measures calculated remain the same (or at least upward compatible).