Baseline Evaluation

Split TDT2 topic 89 into dev/test

For comparison, developed a simple sentence-extraction multi-document summarizer (uses document position, timeliness, clustering)

Also compared against two human authors

Each system/author produced four summaries of different lengths and emphases

All summaries graded on A-F scale by four grad students / professionals; component scores given for content, organization, readability
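
One way to tabulate grades like these is sketched below. The slide does not say how letter grades were combined, so the 4-point mapping, the plain averaging across graders, the "overall" key, and the function names are all assumptions for illustration. The example grades are made up and are not results from the evaluation.

```python
from statistics import mean

# Assumed 4-point mapping; the slide does not state how letters were converted.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def mean_grade(letters):
    """Average a list of letter grades on the assumed 4-point scale."""
    return mean(GRADE_POINTS[g] for g in letters)


def summarize_scores(grades):
    """Map each component name to the mean of its graders' letter grades."""
    return {component: mean_grade(letters) for component, letters in grades.items()}


# Hypothetical grades from four graders for one summary (not real data).
example = {
    "overall": ["B", "B", "A", "C"],
    "content": ["A", "B", "B", "B"],
    "organization": ["C", "B", "B", "C"],
    "readability": ["B", "A", "B", "B"],
}

print(summarize_scores(example))
```

Averaging treats the letter scale as evenly spaced, which is itself an assumption; reporting the median grade or the full grade distribution per component would avoid it.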
