TIPSTER Architecture Change Request

Title: Pattern Specification Language
Page 1 of ?
Date Prepared: 16 March 1998
CR No. 10
Priority: Routine
Date Logged:
Document Affected: Design Document
Version: 2.3
Paragraphs Affected: Section 8.3 (new) and Appendices B & C
References: None

Change Required: New Section under Extraction

Specific Recommendations: Modify Architecture Design document pages by adding/replacing affected sections and appendices with new material as provided.

Reason for the Proposed Change: Under the current Architecture, Extraction has had minimal discussion and specification. One of the problems that had been identified was the lack of a method to allow developers to express and exchange rules which identify patterns necessary for extraction. An initial solution to this problem has been the design of a Pattern Specification Language. This Language will become part of the Architecture.

----------------------------------------------------------------------------------------------------

Pending final material from the TWG for the Pattern Specification Language, a prototype YACC grammar developed by Mr. Cowie and Mr. Appelt has been placed herein as a placeholder.
%{
/* TIPSTER Common Pattern Specification Language Grammar
   Syntax by Jim Cowie, CRL, NMSU
   Semantics and minor mods by Doug Appelt, SRI International */
%}

%union {
    Int32 intval ;
    char* string ;
    void* pointer ;
    short* iarray ;
    struct MatchSpec* mspec ;
    struct KleeneOpGroup* kspec ;
    struct RuleRecord* rspec ;
}

%token SYMBOL
%token RIGHT_ARROW
%token DOUBLE_ARROW
%token LEFT_ANGLE_BRA
%token RIGHT_ANGLE_BRA
%token DOUBLE_LEFT_BRA
%token DOUBLE_RIGHT_BRA
%token LEFT_SET_BRA
%token RIGHT_SET_BRA
%token LEFT_BRA
%token RIGHT_BRA
%token LEFT_SQUARE_BRA
%token RIGHT_SQUARE_BRA
%token AMPERSAND
%token BAR
%token NUMBER
%token MINUS
%token QUOTED_STRING
%token ANY
%token TEMP
%token NORM
%token IF
%token THEN
%token ELSE
%token STAR
%token PLUS
%token QUESTION
%token COLON
%token PLUS_COLON
%token SEMICOLON
%token DOUBLE_SEMICOLON
%token ASSIGN
%token ADD_ASSIGN
%token COMMA
%token STOP
%token EQUAL
%token NOT_EQUAL
%token LESS_OR_EQUAL
%token GT_OR_EQUAL
%token CARAT
%token RULE
%token PHASE
%token PRIORITY
%token TK_INPUT
%token TK_TRUE
%token TK_FALSE
%token ATSIGN

%type a_c_expression
%type a_constraint
%type action
%type actions
%type action_exp
%type anno
%type anno_type
%type arbitrary
%type arglist
%type assignment
%type attr_name
%type basic_pattern_element
%type binding
%type boolean_op
%type c_expression
%type constraint
%type constraint_group
%type constraints
%type decl
%type declarations
%type field
%type function_call
%type function_name
%type index
%type index_expression
%type index_op
%type input_token
%type kleene_op
%type macro
%type macros
%type macro_header
%type more_actions
%type namedecl
%type param_list
%type pattern_element
%type pattern_elements
%type phase
%type post_condition
%type pre_condition
%type prioritydecl
%type rule
%type rules
%type something_or_other
%type test_op
%type value

%%

phase:
      macros declarations rules

macros:
    | macros macro

macro:
      macro_header DOUBLE_ARROW arbitrary RIGHT_ARROW arbitrary DOUBLE_SEMICOLON

macro_header:
      SYMBOL DOUBLE_LEFT_BRA param_list DOUBLE_RIGHT_BRA

arbitrary:
    | arbitrary something_or_other

something_or_other:
      SYMBOL
    | LEFT_ANGLE_BRA
    | RIGHT_ANGLE_BRA
    | DOUBLE_LEFT_BRA
    | DOUBLE_RIGHT_BRA
    | LEFT_SET_BRA
    | RIGHT_SET_BRA
    | LEFT_BRA
    | RIGHT_BRA
    | LEFT_SQUARE_BRA
    | RIGHT_SQUARE_BRA
    | AMPERSAND
    | BAR
    | NUMBER
    | MINUS
    | QUOTED_STRING
    | ANY
    | TEMP
    | NORM
    | IF
    | THEN
    | ELSE
    | STAR
    | PLUS
    | QUESTION
    | COLON
    | SEMICOLON
    | PLUS_COLON
    | ASSIGN
    | ADD_ASSIGN
    | COMMA
    | STOP
    | EQUAL
    | NOT_EQUAL
    | LESS_OR_EQUAL
    | GT_OR_EQUAL
    | CARAT
    | TK_TRUE
    | TK_FALSE
    | ATSIGN

declarations:
    | declarations decl

decl:
      PHASE COLON SYMBOL
    | input_token COLON symbol_list

input_token:
      TK_INPUT

symbol_list:
      SYMBOL
    | symbol_list COMMA SYMBOL

param_list:
      SYMBOL
    | param_list SEMICOLON SYMBOL

rules:
    | rules rule
    | rules error rule

rule:
      namedecl constraints RIGHT_ARROW actions
    | namedecl prioritydecl constraints RIGHT_ARROW actions

namedecl:
      RULE COLON SYMBOL

prioritydecl:
      PRIORITY COLON NUMBER

constraints:
      pre_condition constraint_group post_condition

pre_condition:
    | LEFT_ANGLE_BRA constraint_group RIGHT_ANGLE_BRA

post_condition:
    | LEFT_ANGLE_BRA constraint_group RIGHT_ANGLE_BRA

constraint_group:
      pattern_elements BAR constraint_group
    | pattern_elements

pattern_elements:
      pattern_element
    | pattern_element pattern_elements

pattern_element:
      basic_pattern_element
    | LEFT_BRA constraint_group RIGHT_BRA kleene_op binding
    | LEFT_BRA constraint_group RIGHT_BRA
    | LEFT_BRA constraint_group RIGHT_BRA kleene_op
    | LEFT_BRA constraint_group RIGHT_BRA binding

kleene_op:
      STAR
    | PLUS
    | QUESTION

binding:
      index_op index

basic_pattern_element:
      LEFT_SET_BRA c_expression RIGHT_SET_BRA
    | QUOTED_STRING
    | SYMBOL
    | function_name LEFT_SQUARE_BRA RIGHT_SQUARE_BRA

c_expression:
      constraint
    | constraint COMMA c_expression

index_op:
      COLON
    | PLUS_COLON

index:
      NUMBER
    | SYMBOL

constraint:
      anno test_op value
    | anno_type

anno:
      anno_type STOP attr_name

anno_type:
      SYMBOL
    | ANY

attr_name:
      SYMBOL

test_op:
      EQUAL
    | NOT_EQUAL
    | LEFT_ANGLE_BRA
    | RIGHT_ANGLE_BRA
    | LESS_OR_EQUAL
    | GT_OR_EQUAL

value:
      NUMBER
    | QUOTED_STRING
    | SYMBOL
    | TK_TRUE
    | TK_FALSE

actions:
      action_exp more_actions
    | action_exp

more_actions:
      COMMA action_exp more_actions
    | COMMA action_exp

action_exp:
      LEFT_BRA IF a_c_expression THEN actions RIGHT_BRA
    | LEFT_BRA IF a_c_expression THEN actions ELSE actions RIGHT_BRA
    | action {$$ = $1 ;}

a_c_expression:
      a_constraint
    | a_c_expression boolean_op a_constraint

boolean_op:
      AMPERSAND
    | BAR

a_constraint:
      index_expression test_op value

action:
      assignment
    | function_call

assignment:
      index_expression ASSIGN ATSIGN
    | index_expression ASSIGN value
    | index_expression ASSIGN index_expression
    | index_expression ASSIGN function_call
    | index_expression ADD_ASSIGN value
    | index_expression ADD_ASSIGN index_expression
    | index_expression ADD_ASSIGN function_call

index_expression:
      COLON index field

field:
      STOP anno_type STOP attr_name
    | STOP anno_type

function_call:
      function_name LEFT_SQUARE_BRA arglist RIGHT_SQUARE_BRA

function_name:
      SYMBOL

arglist:
      index_expression
    | CARAT index_expression
    | value
    | value COMMA arglist
    | index_expression COMMA arglist
    | CARAT index_expression COMMA arglist

%%
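
The nesting of the constraint_group rules above may be easier to follow with a small recognizer. The following is a minimal sketch in Python, not part of the TIPSTER grammar or its implementation: it checks only whether a sequence of token names can be derived from constraint_group, and it builds no MatchSpec or KleeneOpGroup structures. Because the grammar does not define the lexer, the input is assumed to be a list of token names exactly as declared by %token; the names Recognizer and recognizes are hypothetical.

```python
# Hypothetical recursive-descent recognizer for the constraint_group
# sub-grammar. Input is a list of %token names; the lexer is assumed.

KLEENE_OPS = {"STAR", "PLUS", "QUESTION"}
INDEX_OPS = {"COLON", "PLUS_COLON"}
TEST_OPS = {"EQUAL", "NOT_EQUAL", "LEFT_ANGLE_BRA", "RIGHT_ANGLE_BRA",
            "LESS_OR_EQUAL", "GT_OR_EQUAL"}
VALUES = {"NUMBER", "QUOTED_STRING", "SYMBOL", "TK_TRUE", "TK_FALSE"}
# Tokens that can begin a pattern_element.
ELEMENT_START = {"LEFT_SET_BRA", "QUOTED_STRING", "SYMBOL", "LEFT_BRA"}


class Recognizer:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        # Next token name, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, kinds):
        # Consume one token that must be in `kinds`, else fail.
        tok = self.peek()
        if tok not in kinds:
            raise SyntaxError(f"expected one of {sorted(kinds)}, got {tok}")
        self.pos += 1
        return tok

    def constraint_group(self):
        # constraint_group: pattern_elements BAR constraint_group | pattern_elements
        self.pattern_elements()
        while self.peek() == "BAR":
            self.pos += 1
            self.pattern_elements()

    def pattern_elements(self):
        # pattern_elements: pattern_element | pattern_element pattern_elements
        self.pattern_element()
        while self.peek() in ELEMENT_START:
            self.pattern_element()

    def pattern_element(self):
        # A parenthesized group may carry an optional kleene_op,
        # then an optional binding (index_op index).
        if self.peek() == "LEFT_BRA":
            self.pos += 1
            self.constraint_group()
            self.expect({"RIGHT_BRA"})
            if self.peek() in KLEENE_OPS:
                self.pos += 1
            if self.peek() in INDEX_OPS:
                self.pos += 1
                self.expect({"NUMBER", "SYMBOL"})
        else:
            self.basic_pattern_element()

    def basic_pattern_element(self):
        tok = self.expect({"LEFT_SET_BRA", "QUOTED_STRING", "SYMBOL"})
        if tok == "LEFT_SET_BRA":
            # c_expression: constraint | constraint COMMA c_expression
            self.constraint()
            while self.peek() == "COMMA":
                self.pos += 1
                self.constraint()
            self.expect({"RIGHT_SET_BRA"})
        elif tok == "SYMBOL" and self.peek() == "LEFT_SQUARE_BRA":
            # function_name LEFT_SQUARE_BRA RIGHT_SQUARE_BRA
            self.pos += 1
            self.expect({"RIGHT_SQUARE_BRA"})

    def constraint(self):
        # constraint: anno test_op value | anno_type
        self.expect({"SYMBOL", "ANY"})
        if self.peek() == "STOP":
            self.pos += 1
            self.expect({"SYMBOL"})   # attr_name
            self.expect(TEST_OPS)
            self.expect(VALUES)


def recognizes(tokens):
    """Return True if the whole token list derives from constraint_group."""
    parser = Recognizer(tokens)
    try:
        parser.constraint_group()
    except SyntaxError:
        return False
    return parser.pos == len(parser.tokens)
```

For example, recognizes(["SYMBOL", "BAR", "QUOTED_STRING"]) accepts an alternation of two basic elements, while recognizes(["SYMBOL", "COLON", "NUMBER"]) is rejected, since the grammar attaches a binding only to a parenthesized group. The sketch handles constraint_group in isolation; inside a pre_condition or post_condition, the angle-bracket tokens double as test_op values, which a full parser would need to disambiguate.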