

Procedure for human comparison of model (reference) and peer (system-generated and other) abstracts

  • For each document set (in randomized order):
    • For each summary type (probably in fixed order of increasing compression: single-document, 200-, 100-, 50-, and 10-word multi-document summary (abstract)):
      • For each peer summary (in randomized order) - composed of peer units (PUs), which will be sentences:
        • If the peer target size is greater than 10 words, the evaluator reads the peer summary and then makes overall judgments about its quality, independent of the model. Questions 1-5 are within-sentence judgments; the rest are within- or across-sentence judgments. In every case the answer is chosen from the following set of 4 ordered categories: {0, 1-5, 6-10, more than 10}

          1. About how many gross capitalization errors are there?
             Examples:
             - [new sentence] the new drugs proved beneficial.
          2. About how many sentences have incorrect word order?
             Examples:
             - John before Mary the park visited
          3. About how many times does the subject fail to agree in number with the verb?
             Examples:
             - The student see a teacher with a telescope
             - It was clear, that the reporters agrees with the idea.
          4. About how many of the sentences are missing important components (e.g. the subject, main verb, direct object, modifier) - causing the sentence to be ungrammatical, unclear, or misleading?
             Examples:
             - The agreement, signed in London yesterday, .
             - Mr. Smith to Washington where he met the senator.
             - The exchange rate is %.
             - Stewart, the builder.
          5. About how many times are unrelated fragments joined into one sentence?
             Examples:
             - They run a refinery; two apples would be enough.
          6. About how many times are articles (a, an, the) missing or used incorrectly?
             Examples:
             - Men saw woman with the telescope
             - He picked up a book.  A book looked interesting.
             - El Paso owns and operates refinery.
          7. About how many pronouns are there whose antecedents are incorrect, unclear, missing, or come only later?
             Examples:
             - [opening sentence] Their agreement was signed in Oslo in 1933.
             - Many Presidents were targets of assassins.  Pres. Reagan was wounded.  He was shot in the Ford's Theater in 1865.
          8. For about how many nouns is it impossible to determine clearly who or what they refer to?
             Examples:
             - The company agreed to negotiate.  [Which company?]
          9. About how many times should a noun or noun phrase have been replaced with a pronoun?
             Examples:
             - Mr. John Smith went to DC.  Mr. John Smith saw the senator.
          10. About how many dangling conjunctions are there ("and", "however"...)?
              Examples:
              - [opening sentence] However, they came to a good agreement.
          11. About how many instances of unnecessarily repeated information are there?
              Examples:
              - Yesterday's estimate does not include any projection for claims in Louisiana, which was also affected by the storm, although less severely than Florida.  But on the Florida losses alone, Hurricane Andrew becomes the most costly insured catastrophe in the US.  Louisiana was also affected by the storm.  With Florida's Hurricane Andrew losses added in, the total rises to Dollars 11.2bn.  This does not include claims in Louisiana.
          12. About how many sentences strike you as being in the wrong place because they indicate a strange time sequence, suggest a wrong cause-effect relationship, or just don't fit in topically with neighboring sentences?
        • View the model summary - composed of model units (MUs), which are human-corrected chunks of a type to be determined
        • Evaluator steps through the MUs.  For each MU s/he:
          • marks any/all PU(s) sharing content with the current MU
          • indicates whether the marked PUs, taken together, express about 0%, 20%, 40%, 60%, 80%, or 100% of the content in the current MU.
        • Evaluator reviews unmarked PUs and indicates once for the entire peer summary that:
          • About 0%, 20%, 40%, 60%, 80%, or 100% of the unmarked PUs are related but needn't be included in the model summary
      • (Evaluators will be allowed to review and revise earlier peer summary judgments before moving to the next document set - to mitigate learning effects.)
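The judgment scales above can be sketched in code as a minimal illustration. The function names and the averaging step are mine, not part of the DUC protocol; the evaluation tool itself is not specified in this document.

```python
def count_category(n):
    """Map a raw error count to one of the 4 ordered answer
    categories used for quality questions 1-12."""
    if n == 0:
        return "0"
    elif n <= 5:
        return "1-5"
    elif n <= 10:
        return "6-10"
    else:
        return "more than 10"

def coverage_step(fraction):
    """Snap an estimated content-coverage fraction (0.0-1.0) to the
    nearest allowed judgment step: 0, 20, 40, 60, 80, or 100 percent."""
    steps = [0, 20, 40, 60, 80, 100]
    pct = fraction * 100
    return min(steps, key=lambda s: abs(s - pct))

def mean_coverage(mu_judgments):
    """Average the per-MU coverage judgments (in percent) for one peer
    summary - a simple summary statistic, not a DUC-defined score."""
    return sum(mu_judgments) / len(mu_judgments)
```

For example, an evaluator who counts 7 capitalization errors would answer `count_category(7)`, i.e. "6-10", and a peer summary whose marked PUs express roughly 45% of an MU's content would be judged `coverage_step(0.45)`, i.e. 40.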

For data, past results, mailing list or other general information
contact: Lori Buckland
For other questions contact: Paul Over
Last updated: Monday, 02-Dec-2002 11:18:25 MST
Date created: Friday, 26-July-02