SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
DR-LINK: A System Update for TREC-2
chapter
E. Liddy
S. Myaeng
National Institute of Standards and Technology
D. K. Harman
knowledge base containing 13,786 case frames we have constructed, each of which prescribes a pattern involving a
verb and the corresponding concept-relation-concept triples.
Given a set of case frames for different senses of 'decline', for example,
(decline 1 ((PATIENT subject ? obligatory)))
(decline 2 ((AGENT subject human obligatory)
            (PATIENT object ? optional)))
(decline 3 ((AGENT subject human obligatory)
            (ACTIVITY infinitive ? obligatory)
            (link infinitive subject AGENT)))
AGENT, PATIENT, and ACTIVITY are the relations that connect the verb to other constituents. The second
components (e.g. subject) prescribe the syntactic categories of the constituents, and the third components (e.g.
human) give the semantic restrictions that the subject and object should satisfy. The last components (e.g. obligatory)
indicate whether the constituent must exist in a sentence in order for the particular case frame to be instantiated. The
last line of the third case frame instructs the CF Handler to link the subject to the infinitive verb with the AGENT
relation. Linking instructions of this kind allow the CF Handler to produce triples containing non-verbal
constituents.
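The frame notation above can be held in a small data structure. The following is a minimal sketch only, not the DR-LINK implementation: the class names Slot, Link, and CaseFrame, the field names, and the use of None for the "?" placeholder are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Slot:
    """One (relation category restriction obligatory) entry of a case frame."""
    relation: str               # e.g. "AGENT", "PATIENT", "ACTIVITY"
    category: str               # syntactic category, e.g. "subject"
    restriction: Optional[str]  # semantic restriction; None stands for "?"
    obligatory: bool            # True for "obligatory", False for "optional"


@dataclass(frozen=True)
class Link:
    """A (link source target relation) instruction between two constituents."""
    source: str
    target: str
    relation: str


@dataclass(frozen=True)
class CaseFrame:
    verb: str
    sense: int
    slots: Tuple[Slot, ...]
    links: Tuple[Link, ...] = ()


# The three 'decline' frames from the text, in this hypothetical encoding.
DECLINE_FRAMES = [
    CaseFrame("decline", 1, (Slot("PATIENT", "subject", None, True),)),
    CaseFrame("decline", 2, (Slot("AGENT", "subject", "human", True),
                             Slot("PATIENT", "object", None, False))),
    CaseFrame("decline", 3, (Slot("AGENT", "subject", "human", True),
                             Slot("ACTIVITY", "infinitive", None, True)),
              (Link("infinitive", "subject", "AGENT"),)),
]
```

Keeping link instructions separate from slots reflects the description above: slots relate the verb to a constituent, whereas a link relates two constituents to each other.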
The CF Handler selects the best case frame by attempting to instantiate each case frame and determining which one is
satisfied most fully by the sentence at hand. This can be seen as a sense disambiguation process using both syntactic and
semantic information. The semantic restriction information contained in the case frames was obtained from
LDOCE. When a sentence is processed, the CF Handler also consults LDOCE to get semantic restriction
information for the individual constituents surrounding the verb, and compares it with the restrictions in
the case frames of the verb to determine which case frame is likely to be the correct one.
With the following sentence fragment,
the chairman declined to elaborate on the disclosure...
the CF Handler chooses the third case frame and produces
[decline] -> (AGENT) -> [chairman]
[decline] -> (ACTIVITY) -> [elaborate]
[elaborate] -> (AGENT) -> [chairman]
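The selection and triple generation for this fragment can be sketched as follows. This is an illustration under stated assumptions, not the actual CF Handler: the tuple encoding of frames, the function names instantiate and choose_frame, the constituent dictionary, and the scoring rule (count the satisfied slots, reject a frame if an obligatory slot fails) are all hypothetical, since the text says only that the frame "satisfied most" is chosen.

```python
# Hypothetical encoding: (relation, category, restriction, obligatory).
# "?" means no semantic restriction. Links are (source, target, relation).
FRAMES = {
    1: {"slots": [("PATIENT", "subject", "?", True)], "links": []},
    2: {"slots": [("AGENT", "subject", "human", True),
                  ("PATIENT", "object", "?", False)], "links": []},
    3: {"slots": [("AGENT", "subject", "human", True),
                  ("ACTIVITY", "infinitive", "?", True)],
        "links": [("infinitive", "subject", "AGENT")]},
}


def instantiate(verb, frame, constituents):
    """Return (score, triples), or None if an obligatory slot is unsatisfied."""
    triples = []
    for relation, category, restriction, obligatory in frame["slots"]:
        filler = constituents.get(category)  # (head word, semantic class)
        satisfied = filler is not None and restriction in ("?", filler[1])
        if satisfied:
            triples.append((verb, relation, filler[0]))
        elif obligatory:
            return None
    score = len(triples)  # assumed measure of how well the frame is satisfied
    # Link instructions yield triples between two non-verbal slot fillers.
    for source, target, relation in frame["links"]:
        if source in constituents and target in constituents:
            triples.append((constituents[source][0], relation,
                            constituents[target][0]))
    return score, triples


def choose_frame(verb, frames, constituents):
    """Pick the instantiable frame with the highest score."""
    best = None
    for sense, frame in frames.items():
        result = instantiate(verb, frame, constituents)
        if result is not None and (best is None or result[0] > best[1]):
            best = (sense, result[0], result[1])
    return best


# Constituents assumed for "the chairman declined to elaborate ...".
constituents = {"subject": ("chairman", "human"),
                "infinitive": ("elaborate", None)}
sense, score, triples = choose_frame("decline", FRAMES, constituents)
for head, relation, dependent in triples:
    print(f"[{head}] -> ({relation}) -> [{dependent}]")
```

Under these assumptions frames 1 and 2 each satisfy one slot and frame 3 satisfies two, so the third frame is selected and the three triples shown above are printed.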
In the current implementation, the input text to the CF Handler is first tagged with part-of-speech information and
bracketed for constituent boundaries. BBN's POST tagger (Meteer et al., 1991) has been used to attach a
part-of-speech tag to individual words. The constituent boundary bracketer we developed then marks boundaries of
grammatical constituents such as infinitives, noun phrases, prepositional phrases, clauses, etc.
At the time of writing, the case frame knowledge base contains 13,786 case frames, of which 13,444 are for all the
verb entries (5,206) in LDOCE, and the rest are for 342 verbs that appear in the Wall Street Journal collection but
are not in LDOCE as headwords. While we have constructed case frames for most of the phrasal verbs in
LDOCE, the capability of processing phrasal verbs has not
been implemented in the current CF Handler.
2.1.2. Nominalized Verb (NV) Handler
The nominalized verb handler has been implemented for the DR-LINK system we ran for the TIPSTER 24th month
evaluation. Its main function is to consult the NV case frames to identify an NV in a sentence and create