Validity of TREC Collections
Consistency
- assessor judgments do differ
- but comparative evaluation (the relative ranking of systems) is stable
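
One common way to quantify "stable" here is a rank correlation between the system orderings produced by two assessors' judgments. The following is a minimal sketch, assuming Kendall's tau-a (no tie correction) as the measure; the system names and scores are hypothetical and only illustrate the computation.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between two score dicts keyed by system name.

    Assumes both dicts hold the same systems; ties count as neither
    concordant nor discordant.
    """
    systems = sorted(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        # Positive product: both judgment sets order the pair the same way.
        sign = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical effectiveness scores for four systems, computed once with
# each assessor's relevance judgments (illustrative numbers only).
map_assessor_1 = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.22, "sysD": 0.15}
map_assessor_2 = {"sysA": 0.26, "sysB": 0.28, "sysC": 0.19, "sysD": 0.12}

tau = kendall_tau(map_assessor_1, map_assessor_2)
print(f"Kendall tau between the two system rankings: {tau:.3f}")
```

In this toy data the absolute scores change under the second assessor, yet only one of the six system pairs swaps order, which is the sense in which the comparison stays stable.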
Completeness
- some relevant documents never enter the pools, so they are never judged
- topics with many known relevant documents tend to have even more unjudged ones
- systems that do not contribute to the pool can still be fairly evaluated
- lack of bias in pools is crucial
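
The completeness points can be made concrete with a leave-one-out check: build the pool with and without one system's run, then score that system against both sets of judgments. This is a minimal sketch for a single topic, assuming depth-k pooling and precision at k as the measure; the runs, document IDs, and the "true" relevant set are all hypothetical.

```python
def build_pool(runs, depth):
    """Union of the top-`depth` documents from each run (single topic)."""
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    return pool


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k documents that are judged relevant.

    Documents outside `relevant` (including unjudged ones) count as
    non-relevant.
    """
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


# Hypothetical ranked runs for one topic.
runs = {
    "sysA": ["d1", "d2", "d3", "d4"],
    "sysB": ["d2", "d5", "d1", "d6"],
    "sysC": ["d7", "d1", "d2", "d8"],
}
# Hypothetical ground truth; in practice this set is unknown.
true_relevant = {"d1", "d2", "d7", "d8"}

# Pool from all runs: only pooled documents are shown to the assessor.
full_pool = build_pool(runs, depth=2)
judged_relevant = full_pool & true_relevant
missed = true_relevant - full_pool  # relevant documents that were never judged

# Leave sysC out of the pool, then evaluate it anyway.
runs_without_c = {name: r for name, r in runs.items() if name != "sysC"}
qrels_without_c = build_pool(runs_without_c, depth=2) & true_relevant

p_full = precision_at_k(runs["sysC"], judged_relevant, k=4)
p_loo = precision_at_k(runs["sysC"], qrels_without_c, k=4)
print(f"missed relevant: {sorted(missed)}")
print(f"sysC P@4 with its own contribution: {p_full:.2f}, without: {p_loo:.2f}")
```

With only three runs the gap between the two scores is exaggerated. The claim on the slide concerns pools built from many diverse runs, where removing one contributor changes little, and that is why an unbiased pool matters.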