gov.nist.nlpir.irfapps.hci
Class BibToken

java.lang.Object
  |
  +--gov.nist.nlpir.irfapps.hci.BibToken

public class BibToken
extends java.lang.Object

A class to represent a token in a TROFF Bibliographic (refer) Document.

Version:
$Revision: 1.1 $
Author:
This software was produced by NIST, an agency of the U.S. government, and by statute is not subject to copyright in the United States. Recipients of this software assume all responsibilities associated with its operation, modification and maintenance.

Field Summary
(package private) static int EOF
          Token type value
(package private) static int SEP
          Token type value
(package private) static int TAG
          Token type value
 char tagname
          field abbreviation
 int type
          type of token
(package private) static int UNDEF
          Token type value
 java.lang.String value
          value of token
 
Constructor Summary
BibToken(int t, char name, java.lang.String v)
          Creates new biblio field token using the bibliography field abbreviation character.
 
Method Summary
(package private) static boolean isBlankLine(java.lang.String line)
          Determines whether the supplied string contains only whitespace characters.
static BibToken readBibToken(java.io.PushbackReader f)
          Scans a java PushbackReader of TROFF Bibliographic Documents until a token is recognized, then returns the token.
 void setToken(int t, char name, java.lang.String v)
          Sets the token's type, tag name, and value.
 java.lang.String toString()
          Returns a string representation of the BibToken object with the token type, tag name, and token value.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, wait, wait, wait
 

Field Detail

TAG

static final int TAG
Token type value

UNDEF

static final int UNDEF
Token type value

SEP

static final int SEP
Token type value

EOF

static final int EOF
Token type value

type

public int type
type of token

tagname

public char tagname
field abbreviation

value

public java.lang.String value
value of token

Constructor Detail

BibToken

public BibToken(int t,
                char name,
                java.lang.String v)
Creates new biblio field token using the bibliography field abbreviation character.
Parameters:
t - token type id
name - bibliography field abbreviation character
v - value of token

Method Detail

setToken

public void setToken(int t,
                     char name,
                     java.lang.String v)
Sets the token's type, tag name, and value.
Parameters:
t - token type id
name - bibliography field abbreviation character
v - value of token

toString

public java.lang.String toString()
Returns a string representation of the BibToken object with the token type, tag name, and token value.
Returns:
string representation of BibToken Object
Overrides:
toString in class java.lang.Object

isBlankLine

static boolean isBlankLine(java.lang.String line)
Determines whether the supplied string contains only whitespace characters.
Parameters:
line - string containing one line of text ending in a newline
Returns:
true if line is blank.
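
A whitespace-only check of this kind can be sketched as follows. This is a minimal stand-in, not the NIST source: the class name BlankLineSketch is hypothetical, and treating the empty string as blank is an assumption the documentation above does not confirm.

```java
// Hypothetical stand-in for the documented isBlankLine; not the NIST implementation.
class BlankLineSketch {

    // Returns true when every character of the line is whitespace.
    // Assumption: an empty string counts as blank; null is not handled.
    static boolean isBlankLine(String line) {
        for (int i = 0; i < line.length(); i++) {
            if (!Character.isWhitespace(line.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isBlankLine(" \t\n"));       // prints true
        System.out.println(isBlankLine("%T Title\n"));  // prints false
    }
}
```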

readBibToken

public static BibToken readBibToken(java.io.PushbackReader f)
                             throws java.io.IOException
Scans a java PushbackReader of TROFF Bibliographic Documents until a token is recognized, then returns the token.
Parameters:
f - PushbackReader containing BIB Documents
Returns:
newly created BibToken.
Throws:
java.io.IOException - if an I/O error occurs

The function is based on the grammar shown below.

 lambda_ws   -> lambda | WHITESPACE_NO_NL ;
 blankline   -> NEWLINE lambda_ws NEWLINE ;
 eod         -> blankline | eof ;
 HCIdocument -> recordlist eod ;
 recordlist  -> lambda | record recordlist ;
 record      -> tag content ;
 record      -> title | section | author | bookname | date |
                pages | copyright | abstract ;
 title       -> "%T" content ;
 section     -> "%S" content ;
 author      -> "%A" content ;
 bookname    -> "%B" content ;
 date        -> "%D" content ;
 pages       -> "%P" content ;
 copyright   -> "%C" content ;
 abstract    -> "%X" content ;
 content     -> STRING | STRING SPACE content ;

Terminals are in UPPERCASE.
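
The eight record productions pair a single tag letter with a field name. One way to hold that mapping is a small lookup table, sketched below. The class BibTagsSketch and the method fieldName are hypothetical names used only for illustration; whether BibToken stores the mapping this way is not stated in this documentation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table built from the grammar's record productions.
class BibTagsSketch {

    // Tag letter (the character after '%') to field name, in grammar order.
    static final Map<Character, String> FIELDS = new LinkedHashMap<>();
    static {
        FIELDS.put('T', "title");
        FIELDS.put('S', "section");
        FIELDS.put('A', "author");
        FIELDS.put('B', "bookname");
        FIELDS.put('D', "date");
        FIELDS.put('P', "pages");
        FIELDS.put('C', "copyright");
        FIELDS.put('X', "abstract");
    }

    // Returns the field name for a tag letter, or null if the grammar
    // has no production for it.
    static String fieldName(char tag) {
        return FIELDS.get(tag);
    }

    public static void main(String[] args) {
        System.out.println(fieldName('X')); // prints abstract
        System.out.println(fieldName('Z')); // prints null
    }
}
```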

Scanner finite state machine

 state  input                output       next state
 -------------------------------------------------------
   0     %                                    1
   0     eof                 separator        end                 
 -------------------------------------------------------
   1     A-Z                 setrectype       2
 -------------------------------------------------------
   2     space                                3
 -------------------------------------------------------
   3     A-Za-z{punct, sp}                    3
   3     newline                              4
   3     eof                 newdoc,pushback  end
 -------------------------------------------------------
   4     A-Za-z{punct, sp}                    3
   4     space                                4
   4     newline             newdoc,pushback  end
   4     eof                 newdoc,pushback  end
   4     %                   newdoc,pushback  end
 -------------------------------------------------------
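
The transition table above can be sketched as a small table-driven scanner. This is an illustration under stated assumptions, not the NIST implementation: the class BibScannerSketch, the methods next and scanField, the ERROR state for inputs the table does not list, the "tag=content" return format, and the joining of continuation lines with a single space are all hypothetical. The sketch also accepts any character other than newline or eof as content in state 3, which is broader than the table's A-Za-z{punct, sp}. Where state 4 lists both a content row and specific rows for space and %, the specific rows take priority.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

// Hypothetical sketch of the scanner state table; not the NIST implementation.
class BibScannerSketch {
    static final int END = -1;    // the table's "end" state
    static final int ERROR = -2;  // assumption: input the table does not list

    // One transition of the table. c is a character, or -1 for eof.
    static int next(int state, int c) {
        switch (state) {
            case 0:
                if (c == '%') return 1;
                return (c == -1) ? END : ERROR;
            case 1:
                return (c >= 'A' && c <= 'Z') ? 2 : ERROR;
            case 2:
                return (c == ' ') ? 3 : ERROR;
            case 3:
                if (c == -1) return END;
                return (c == '\n') ? 4 : 3;
            case 4:
                if (c == -1 || c == '\n' || c == '%') return END;
                return (c == ' ') ? 4 : 3;
            default:
                return ERROR;
        }
    }

    // Scans one field and returns "tag=content", or null at eof or on
    // unlisted input. The character that ends the field is pushed back.
    static String scanField(PushbackReader in) throws IOException {
        int state = 0;
        char tag = 0;
        StringBuilder content = new StringBuilder();
        while (true) {
            int c = in.read();
            int ns = next(state, c);
            if (ns == ERROR) return null;
            if (state == 1) tag = (char) c;                        // "setrectype"
            if (state == 3 && ns == 3) content.append((char) c);   // field text
            if (state == 4 && ns == 3) {
                content.append(' ').append((char) c);              // continuation line
            }
            if (ns == END) {
                if (state == 0) return null;                       // eof before any field
                if (c != -1) in.unread(c);                         // "pushback"
                return tag + "=" + content;
            }
            state = ns;
        }
    }

    public static void main(String[] args) throws IOException {
        PushbackReader in =
            new PushbackReader(new StringReader("%T A Title\n%A Smith, J.\n\n"));
        System.out.println(scanField(in)); // prints T=A Title
        System.out.println(scanField(in)); // prints A=Smith, J.
    }
}
```

After the second call the pushed-back newline is the next character in the reader. The table does not say what state 0 does with it, so this sketch reports it as unlisted input and returns null.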