Accuracy on Document Input
Input: 1 Document
Output: Topic labels
Measured against Primary Source Media human annotation
- Top-choice unsupervised label matches the human label 46% of the time (but this criterion is too harsh)
- Top-choice label seems a reasonable label 69% of the time (but this is subjective)
- Top few labels are understandable most of the time (but this requires user evaluation)
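
The strict match rate above can be sketched as a top-k comparison against the human labels. This is only an illustration, assuming case-insensitive exact string matching; the function name `top_k_match_rate` and the toy data are hypothetical and are not the Primary Source Media annotations or the system that produced the figures above. The "reasonable" and "understandable" judgments need human raters and are not covered here.

```python
def top_k_match_rate(predicted, human, k=1):
    """Fraction of documents whose human label appears among the
    top-k predicted labels (case-insensitive exact match)."""
    if len(predicted) != len(human):
        raise ValueError("predicted and human must have the same length")
    if not predicted:
        return 0.0
    hits = 0
    for labels, gold in zip(predicted, human):
        top_k = [label.lower() for label in labels[:k]]
        if gold.lower() in top_k:
            hits += 1
    return hits / len(predicted)


# Toy data (made up for illustration): ranked labels per document,
# best first, and one human label per document.
predicted = [
    ["economy", "trade", "banking"],
    ["sports", "football", "olympics"],
    ["health", "medicine", "disease"],
    ["politics", "election", "government"],
]
human = ["trade", "sports", "medicine", "election"]

top1 = top_k_match_rate(predicted, human, k=1)
top3 = top_k_match_rate(predicted, human, k=3)
print(f"top-1 match rate: {top1:.2f}")
print(f"top-3 match rate: {top3:.2f}")
```

On the toy data, the top-1 rate is much lower than the top-3 rate, which is one way to see why a strict top-choice match can understate label quality.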
For the Document Understanding Conference (DUC)
- Apply topic labeling to folders (rather than to documents)