Nancy A. Chinchor
Science Applications International Corporation
10260 Campus Pt. Dr.
San Diego, CA 92121
The <ENAMEX TYPE="LOCATION">U.K.</ENAMEX> satellite television broadcaster said its subscriber base grew <NUMEX TYPE="PERCENT">17.5 percent</NUMEX> during <TIMEX TYPE="DATE">the past year</TIMEX> to 5.35 million
The Named Entity task was carried out in Chinese and Japanese (MET-2) concurrently with English (MUC-7).
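As an illustration only (our own code, not part of the evaluation materials), the following Python sketch pulls the ENAMEX, NUMEX, and TIMEX expressions out of a sentence marked up as in the example above; the function name and regular expression are assumptions of this sketch:

import re

# Matches MUC-style inline SGML tags, e.g. <ENAMEX TYPE="LOCATION">U.K.</ENAMEX>
TAG_PATTERN = re.compile(r'<(ENAMEX|NUMEX|TIMEX)\s+TYPE="([^"]+)">(.*?)</\1>')

def extract_named_entities(marked_up_text):
    """Return (tag, type, text) triples for each tagged expression."""
    return [(m.group(1), m.group(2), m.group(3))
            for m in TAG_PATTERN.finditer(marked_up_text)]

sentence = ('The <ENAMEX TYPE="LOCATION">U.K.</ENAMEX> satellite television '
            'broadcaster said its subscriber base grew '
            '<NUMEX TYPE="PERCENT">17.5 percent</NUMEX> during '
            '<TIMEX TYPE="DATE">the past year</TIMEX> to 5.35 million')

for tag, entity_type, text in extract_named_entities(sentence):
    print(tag, entity_type, text)
# ENAMEX LOCATION U.K.
# NUMEX PERCENT 17.5 percent
# TIMEX DATE the past year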
[The U.K. satellite television broadcaster] said [[its] subscriber base] grew 17.5 percent during the past year to [5.35 million] (brackets mark the markables in the coreference annotation)
The coreference task is a bridge between the NE task and the TE task.
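One way of picturing the annotation is sketched below in our own representation; the character-offset spans and the grouping of "its" with the broadcaster are illustrative assumptions rather than the answer key:

from dataclasses import dataclass

@dataclass(frozen=True)
class Mention:
    start: int   # offset of the first character of the markable
    end: int     # offset one past the last character
    text: str

sentence = ("The U.K. satellite television broadcaster said its subscriber "
            "base grew 17.5 percent during the past year to 5.35 million")

def mention(substring):
    """Build a Mention for the first occurrence of substring in the sentence."""
    start = sentence.index(substring)
    return Mention(start, start + len(substring), substring)

# One plausible chain: the pronoun "its" refers back to the broadcaster.
chain = [mention("The U.K. satellite television broadcaster"), mention("its")]
for m in chain:
    print(m)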
All aliases are put in the NAME slot. Persons, organizations, artifacts, and locations are all TYPEs of Template Elements. All substantial descriptors used in the text appear in the DESCRIPTOR slot. The CATEGORY slot contains categories dependent on the element involved: persons can be civilian, military, or other; organizations can be government, company, or other; artifacts are limited to vehicles and can be for traveling on land, on water, or in the air; locations can be city, province, country, region, body of water, airport, or unknown. An example of a Template Element from MUC-7 follows:
<ENTITY-9602040136-11> :=
ENT_NAME: "Dennis Gillespie"
ENT_TYPE: PERSON
ENT_DESCRIPTOR: "Capt."
/ "the commander of Carrier Air Wing 11"
ENT_CATEGORY: PER_MIL
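A minimal sketch, in our own Python representation rather than the official template notation, of the slots just described, filled with the values from the example entity:

from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TemplateElement:
    ent_name: Optional[str] = None                             # NAME slot (names and aliases)
    ent_type: str = "PERSON"                                   # PERSON, ORGANIZATION, ARTIFACT, or LOCATION
    ent_descriptors: List[str] = field(default_factory=list)   # DESCRIPTOR slot
    ent_category: Optional[str] = None                         # e.g. PER_MIL, ORG_GOVT

gillespie = TemplateElement(
    ent_name="Dennis Gillespie",
    ent_type="PERSON",
    ent_descriptors=["Capt.", "the commander of Carrier Air Wing 11"],
    ent_category="PER_MIL",
)
print(gillespie)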
In MUC-7, we limited TR to relationships with organizations: employee_of, product_of, location_of. However, the task is easily expandable to all logical combinations and relations between entity types. An example of Template Relations from MUC-7 follows:
<EMPLOYEE_OF-9602040136-5> :=
PERSON: <ENTITY-9602040136-11>
ORGANIZATION: <ENTITY-9602040136-1>
<ENTITY-9602040136-11> :=
ENT_NAME: "Dennis Gillespie"
ENT_TYPE: PERSON
ENT_DESCRIPTOR: "Capt."
/ "the commander of Carrier Air Wing 11"
ENT_CATEGORY: PER_MIL
<ENTITY-9602040136-1> :=
ENT_NAME: "NAVY"
ENT_TYPE: ORGANIZATION
ENT_CATEGORY: ORG_GOVT
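Again as a sketch in our own representation rather than the official template notation, the EMPLOYEE_OF relation above can be pictured as a pair of pointers from the relation to its two Template Elements:

from dataclasses import dataclass

@dataclass
class Entity:
    ent_id: str
    ent_name: str
    ent_type: str
    ent_category: str

@dataclass
class EmployeeOf:
    person: Entity        # the PERSON slot points at a person Template Element
    organization: Entity  # the ORGANIZATION slot points at an organization Template Element

gillespie = Entity("ENTITY-9602040136-11", "Dennis Gillespie", "PERSON", "PER_MIL")
navy = Entity("ENTITY-9602040136-1", "NAVY", "ORGANIZATION", "ORG_GOVT")

relation = EmployeeOf(person=gillespie, organization=navy)
print(relation.person.ent_name, "is an employee of", relation.organization.ent_name)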
The task definition for ST required relevancy and fill rules. The choice of the domain was dependent to some extent on the evaluation epoch. The structure of the template and the task definition tended to be dependent on the author of the task, but the richness of the templates also served to illustrate the utility of information extraction to users most effectively.
The filling of the slots in the scenario template was generally a difficult task for systems, and a relatively large effort was required to produce ground truth. Reasonable agreement (>80%) between annotators was possible, but sometimes required ornate refinement of the task definition based on the data encountered.
In MUC-7, there were more international sites participating than ever before. The papers reflect interesting observations by system developers who were non-native speakers of the language of their system and system developers who were native speakers of the language of their system.
In MUC-7, more data was provided for training and the dry run, and it was maintained through all of the updates to the guidelines during the evaluation cycle. The markup will be publicly available on the MUC website at http://www.muc.saic.com in the form of offsets from the beginning of each document. The rights to the documents themselves can be purchased from the Linguistic Data Consortium (LDC).
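The exact file format of the distributed offsets is not described here, but the following sketch (with an assumed annotation tuple format) shows how standoff offsets could be re-applied to the purchased document text to recover inline markup:

def apply_standoff(text, annotations):
    """annotations: (start_offset, end_offset, open_tag, close_tag) tuples."""
    # Insert from the end of the document backwards so earlier offsets stay valid.
    for start, end, open_tag, close_tag in sorted(annotations, reverse=True):
        text = text[:start] + open_tag + text[start:end] + close_tag + text[end:]
    return text

doc = "The U.K. satellite television broadcaster said ..."
annotations = [(4, 8, '<ENAMEX TYPE="LOCATION">', '</ENAMEX>')]
print(apply_standoff(doc, annotations))
# The <ENAMEX TYPE="LOCATION">U.K.</ENAMEX> satellite television broadcaster said ...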
The task definitions for MUC-7 were improved by having authors other than the original authors revise each of the guidelines for internal consistency and to dovetail with the other tasks evaluated. The communal effort in polishing the guidelines and the data markup noticeably improved the evaluation.
Evaluation / Tasks | Named Entity | Coreference | Template Element | Template Relation | Scenario Template | Multilingual
-------------------|--------------|-------------|------------------|-------------------|-------------------|-------------
MUC-3              |              |             |                  |                   | YES               |
MUC-4              |              |             |                  |                   | YES               |
MUC-5              |              |             |                  |                   | YES               | YES
MUC-6              | YES          | YES         | YES              |                   | YES               |
MUC-7              | YES          | YES         | YES              | YES               | YES               |
MET-1              | YES          |             |                  |                   |                   | YES
MET-2              | YES          |             |                  |                   |                   | YES
Evaluation / Tasks | Named Entity                    | Coreference      | Template Element | Template Relation | Scenario Template        | Multilingual
-------------------|---------------------------------|------------------|------------------|-------------------|--------------------------|-------------------------
MUC-3              |                                 |                  |                  |                   | R < 50%, P < 70%         |
MUC-4              |                                 |                  |                  |                   | F < 56%                  |
MUC-5              |                                 |                  |                  |                   | EJV F < 53%, EME F < 50% | JJV F < 64%, JME F < 57%
MUC-6              | F < 97%                         | R < 63%, P < 72% | F < 80%          |                   | F < 57%                  |
MUC-7              | F < 94%                         | F < 62%          | F < 87%          | F < 76%           | F < 51%                  |
Multilingual       |                                 |                  |                  |                   |                          |
MET-1              | C F < 85%, J F < 93%, S F < 94% |                  |                  |                   |                          |
MET-2              | C F < 91%, J F < 87%            |                  |                  |                   |                          |
Legend:
R = Recall    P = Precision    F = F-Measure with Recall and Precision Weighted Equally
E = English    C = Chinese    J = Japanese    S = Spanish
JV = Joint Venture    ME = Microelectronics
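For reference, the F-measure reported above is the harmonic mean of recall and precision when the two are weighted equally; a small sketch with made-up counts:

def recall(correct, possible):
    return correct / possible

def precision(correct, actual):
    return correct / actual

def f_measure(r, p):
    """F-measure with recall and precision weighted equally."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

r, p = recall(80, 100), precision(80, 90)   # hypothetical counts
print("R = %.1f%%  P = %.1f%%  F = %.1f%%" % (100 * r, 100 * p, 100 * f_measure(r, p)))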
The appendices to the proceedings contain test materials and other supporting materials that augment the papers in the proceedings. For each of the tasks, a walkthrough article was chosen to allow all of the sites participating in that task to discuss their system response for a common article. The walkthrough articles and the answer keys for each task appear first.
Following the walkthroughs are the formal task definitions provided to all of the participating sites. The datasets discussed were all marked up by human annotators following these guidelines. Next are the score reports output by the automatic scorer, which compared the system responses on each task to the human-generated answer keys for the formal run test articles. The statistical results represent the significance groupings of the sites for each task based on an approximate randomization algorithm run on the document-by-document scores for each pair of sites. For Named Entity in English, the human annotators' scores are given and included in the statistical significance testing because the systems can achieve scores that are close to human performance. The annotators were significantly better than the systems. Finally, there is the User's Manual for the automated scorer, which is in the public domain.
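As a sketch of the kind of test referred to above (the trial count, score lists, and test statistic here are illustrative assumptions, not the settings actually used), an approximate randomization test repeatedly swaps the two sites' scores on each document and asks how often the shuffled difference is at least as large as the observed one:

import random

def approximate_randomization(scores_a, scores_b, trials=9999, seed=0):
    """Return an approximate p-value for the difference in mean document scores."""
    rng = random.Random(seed)
    observed = abs(sum(scores_a) - sum(scores_b)) / len(scores_a)
    at_least_as_extreme = 0
    for _ in range(trials):
        shuffled_a, shuffled_b = [], []
        # Randomly swap the two sites' scores on each document.
        for a, b in zip(scores_a, scores_b):
            if rng.random() < 0.5:
                a, b = b, a
            shuffled_a.append(a)
            shuffled_b.append(b)
        diff = abs(sum(shuffled_a) - sum(shuffled_b)) / len(scores_a)
        if diff >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (trials + 1)

site_a = [0.61, 0.58, 0.64, 0.70, 0.55]   # hypothetical per-document F-measures
site_b = [0.52, 0.60, 0.49, 0.66, 0.50]
print(approximate_randomization(site_a, site_b))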