Experiment:
Given three independent sets of judgments for each of 48 TREC-4 topics
Rank the TREC-4 runs by mean average precision, evaluating each run with different combinations of judgments
Compute the correlation among the resulting run rankings
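
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the original experimental code. The slide does not name the correlation measure, so Kendall's tau is an assumption here. The run names, topic IDs, document IDs, and the two judgment sets (`qrels_1`, `qrels_2`) are hypothetical toy data, and ties in mean average precision are not handled.

```python
from itertools import combinations


def average_precision(ranked_docs, relevant):
    """Average precision of one ranked list against one set of relevant docs."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant document
    return total / len(relevant)


def mean_average_precision(run, qrels):
    """Mean of average precision over the topics judged in qrels."""
    scores = [average_precision(run.get(topic, []), rel) for topic, rel in qrels.items()]
    return sum(scores) / len(scores)


def rank_runs(runs, qrels):
    """Run names ordered from highest to lowest mean average precision."""
    return sorted(runs, key=lambda name: mean_average_precision(runs[name], qrels), reverse=True)


def kendall_tau(ranking_a, ranking_b):
    """Kendall's tau between two rankings of the same items (assumes no ties)."""
    pos_b = {name: i for i, name in enumerate(ranking_b)}
    concordant = discordant = 0
    for x, y in combinations(ranking_a, 2):  # x precedes y in ranking_a
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical toy data: three runs, two topics, two alternative judgment sets.
runs = {
    "runA": {"t1": ["d1", "d2", "d3"], "t2": ["d4", "d5", "d6"]},
    "runB": {"t1": ["d3", "d1", "d2"], "t2": ["d6", "d5", "d4"]},
    "runC": {"t1": ["d2", "d3", "d1"], "t2": ["d5", "d6", "d4"]},
}
qrels_1 = {"t1": {"d1"}, "t2": {"d4"}}
qrels_2 = {"t1": {"d3"}, "t2": {"d6"}}

ranking_1 = rank_runs(runs, qrels_1)
ranking_2 = rank_runs(runs, qrels_2)
tau = kendall_tau(ranking_1, ranking_2)
print(ranking_1, ranking_2, tau)
```

A tau near 1 would mean the two judgment sets order the runs almost identically, while a value near -1 would mean the orderings are nearly reversed. In the experiment itself, each judgment set would be one combination of the three assessors' judgments across the 48 topics.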