SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
TIPSTER Panel -- DR LINK's Linguistic-Conceptual Approach to Document Detection
chapter
E. Liddy
S. Myaeng
National Institute of Standards and Technology
Donna K. Harman
Venezuela and its creditor banks agreed yesterday to restructure $3 billion of
its foreign debt
the verb `agree' triggers a set of case frames in the knowledge base. With the additional
information provided by the preposition `to' followed by another verb, the correct case frame
(i.e. RRF for case relations) will be chosen:
((A X S) ((AT ? to+V) (CT ? that)))
where the first element of each list indicates the relation (e.g. A for AGENT, AT for
ACTIVITY, and CT for CONTENT), the second specifies the LDOCE semantic restriction (e.g. X for
`abstract or human'), and the last shows the syntactic or lexical clue that must be satisfied in
order for the case frame to be instantiated. In this example, S is for a subject, to+V for an
infinitive, and `that' for a relative clause. The two sub-lists, (AT ? to+V) (CT ? that), indicate
that only one of them is triggered at a given time. Eventually the following set of triples is
generated:
[agree] -> (A) -> [country: Venezuela]
[agree] -> (A) -> [creditor_bank]
[agree] -> (AT) -> [restructure]
where the concept node [country: Venezuela] is formed by applying rules from the Proper Noun
knowledge base and processor.
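
The frame selection described above can be illustrated with a minimal sketch. The list encoding of the frame, the function name instantiate, and the clues dictionary are assumptions made for illustration, not the DR-LINK implementation; the LDOCE semantic restriction is carried along but not checked here.

```python
# Hypothetical encoding of the case frame for `agree' given in the text.
# Each slot is (relation, semantic_restriction, clue); a nested list holds
# alternatives of which only one may be triggered at a given time.
AGREE_FRAME = [
    ("A", "X", "S"),
    [("AT", "?", "to+V"), ("CT", "?", "that")],
]


def instantiate(verb, frame, clues):
    """Return (verb, relation, filler) triples for slots whose clue is present.

    `clues` maps a syntactic or lexical clue to the concepts found for it.
    Semantic restrictions are ignored in this sketch (an assumption).
    """
    triples = []
    for slot in frame:
        alternatives = slot if isinstance(slot, list) else [slot]
        for relation, _restriction, clue in alternatives:
            fillers = clues.get(clue, [])
            if fillers:
                triples.extend((verb, relation, filler) for filler in fillers)
                break  # only one alternative in a sub-list is triggered
    return triples


# Clues assumed to have been extracted from the example sentence.
clues = {
    "S": ["country: Venezuela", "creditor_bank"],
    "to+V": ["restructure"],
}
for head, relation, tail in instantiate("agree", AGREE_FRAME, clues):
    print(f"[{head}] -> ({relation}) -> [{tail}]")
```

Because the sentence supplies an infinitive rather than a `that' clause, only the AT alternative fires, and the printed triples match the three listed above.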
2.e. Conceptual Graph Generator
Given the triples generated in the RCD module by applying various types of RRF, the next step is
to merge them to form a complete CG for a sentence, paragraph, or an SRF component. The
resulting CGs consist of not only concepts and relations but also instantiations or referents of
concepts, which are usually derived from proper nouns like company names but can be another
CG derived from a nested clause, thereby allowing for nested CGs. Since the different types of
RRF in the knowledge base are developed and applied in the RCD modules in a more or less
independent manner, a form of conflict resolution is necessary in this component.
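
As a rough illustration of the merging step, the sketch below joins triple sets on shared concept labels and keeps exact duplicates only once. The function name merge_triples and the adjacency-list representation are assumptions; the paper does not describe its conflict resolution procedure, so none is modelled here beyond dropping duplicates. The triples are those given in the text for `agree', `restructure', and `debt'.

```python
from collections import defaultdict


def merge_triples(*triple_sets):
    """Merge triple sets into one graph: concept -> [(relation, concept), ...].

    Nodes with the same label are treated as the same concept (an assumption),
    and an exact duplicate triple is stored only once.
    """
    graph = defaultdict(list)
    for triples in triple_sets:
        for head, relation, tail in triples:
            if (relation, tail) not in graph[head]:
                graph[head].append((relation, tail))
    return dict(graph)


agree = [
    ("agree", "A", "country: Venezuela"),
    ("agree", "A", "creditor_bank"),
    ("agree", "AT", "restructure"),
]
restructure = [
    ("restructure", "A", "country: Venezuela"),
    ("restructure", "A", "creditor_bank"),
    ("restructure", "P", "debt"),
]
debt = [
    ("debt", "ME", "money"),
    ("debt", "CH", "foreign"),
]

graph = merge_triples(agree, restructure, debt)
for head, edges in graph.items():
    for relation, tail in edges:
        print(f"[{head}] -> ({relation}) -> [{tail}]")
```

Because [restructure] appears both as the tail of an (AT) relation and as the head of its own triples, the three sets end up connected in a single structure rather than remaining as separate fragments.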
Given the set of concept-relation-concept triples as shown above and another set derived from
the verb `restructure' with a frame ((A X S) (P W O)):
[restructure] -> (A) -> [country: Venezuela]
[restructure] -> (A) -> [creditor_bank]
[restructure] -> (P) -> [debt]
where A and P are for `agent' and `patient' relations, and
[debt] -> (ME) -> [money]
[debt] -> (CH) -> [foreign]
are produced by a special handler within the RCD, where ME and CH stand for `measure' and
`characteristics', the CG generator produces the following CG for the sentence: