gov.nist.nlpir.irfapps
Class SampleApp

java.lang.Object
  |
  +--gov.nist.nlpir.irfapps.SampleApp

public class SampleApp
extends java.lang.Object

A sample application of text-based retrieval which enables the indexing and searching of a document collection through a simple, tty-style menu-driven interface.

Version:
$Revision: 1.14 $
Author:
This software was produced by NIST, an agency of the U.S. government, and by statute is not subject to copyright in the United States. Recipients of this software assume all responsibilities associated with its operation, modification and maintenance.

Field Summary
private static java.lang.String converterName
          The name of the class to be used in converting raw document data to document objects
private static DocCollection currentDocCollection
          The document collection currently in use
(package private) static ProxyDocument dummy
           
private static java.lang.String fileName
          The name of the file containing the document collection to be indexed in its "raw" form
private static int finalStage
          Number of the processing stage at which to stop
private static int fromDoc
          In batch mode, the document number from which to begin indexing
private static java.lang.String indexDir
          The name of the directory from which the index files (DB*) are to be read or to which they should be written
private static java.lang.String irmDir
          The name of the directory from which the Irf Manager is to be read or to which it is to be saved
private static boolean populateByIndexThenDoc
          Flag indicating whether to create indexes by scanning each document once per index, or by scanning each document only once and updating every index for that document
private static java.lang.String queryName
          The name of a file containing a query to be used to search the collection
private static boolean retrieveOnly
          Flag indicating whether this invocation should perform only retrieve operations
private static java.lang.String slash
          The file/directory name separator (e.g. "/" for Unix, "\" for DOS)
private static IrfConverter theConverter
          The converter class to be used in converting documents from the "raw" to an indexable form
private static int toDoc
          In batch mode, the document number at which to end indexing
 
Constructor Summary
SampleApp()
           
 
Method Summary
(package private) static java.lang.String addHighlight(java.lang.String s, int wantedWord, int wantedParagraph)
           
(package private) static void ApplicationMain(java.lang.String pathToSerializedIrfManagerFiles, java.lang.String pathToIndex)
          Controls the execution of the application through presentation of menus and processing of input.
(package private) static ProxyDocument buildDummyDoc(IrfConverter aConverter)
          Build a dummy (empty) proxy Document
static boolean confirmYN()
          Prompt user for yes or no response.
(package private) static DocCollection getCurrentDocCollection(java.lang.String pathToIndex)
          Get the current document collection.
static void main(java.lang.String[] argv)
          The main method of the application; handles command-line options.
(package private) static void populateColl(DocCollection docColl)
          Create documents, add them to a collection and index them.
(package private) static Document prepareDocPresent(ResultForDocMatchingQuery rle)
          Takes as argument a ResultForDocMatchingQuery to be presented on the screen.
(package private) static void printMemUseStats(java.lang.String message)
          Print out data on memory usage
(package private) static double[] requestNormalizedWeights(java.util.Vector theModalities)
          Request an integer weight to assign to each indexing modality for use in calculating the contribution of the indexing modality to a document's final score/RSV.
(package private) static void retrieve(RetrievalModalities rm, IndexingModalities im)
          Retrieve documents from a collection using the query supplied by the user (invoked from main menu).
(package private) static Combinator setCombinator(IndexingModalities im)
          Initialize the Combinator.
(package private) static void setIndexingModalities(DocCollection docColl)
          Set the indexing modalities for the current document collection.
private static void STRUCCOMBRET(java.util.Vector C, java.lang.String F, java.lang.Object C1, double W1, int C2, double W2)
           
private static java.util.Vector STRUCLEAF(java.lang.String F, int I1, double W1, int I2, double W2)
          Print a specified number of spaces.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

slash

private static java.lang.String slash
The file/directory name separator (e.g. "/" for Unix, "\" for DOS)
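
A platform-specific separator of this kind is available in Java as java.io.File.separator. The sketch below shows how such a value could be used to build a path under an index directory. Whether SampleApp initializes slash this way is an assumption, and the class SeparatorSketch, the method join, and the file name DB_example are hypothetical names used only for illustration.

```java
import java.io.File;

// Hypothetical sketch; not taken from the SampleApp source.
class SeparatorSketch {

    // Join a directory name and a file name using the platform separator
    // ("/" on Unix, "\" on DOS/Windows).
    static String join(String dir, String name) {
        return dir + File.separator + name;
    }

    public static void main(String[] args) {
        // "index" and "DB_example" are made-up names for the demonstration.
        System.out.println(join("index", "DB_example"));
    }
}
```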

populateByIndexThenDoc

private static boolean populateByIndexThenDoc
Flag indicating whether to create indexes by scanning each document once per index, or by scanning each document only once and updating every index for that document

finalStage

private static int finalStage
Number of the processing stage at which to stop

retrieveOnly

private static boolean retrieveOnly
Flag indicating whether this invocation should perform only retrieve operations

fromDoc

private static int fromDoc
In batch mode, the document number from which to begin indexing

toDoc

private static int toDoc
In batch mode, the document number at which to end indexing

fileName

private static java.lang.String fileName
The name of the file containing the document collection to be indexed in its "raw" form

queryName

private static java.lang.String queryName
The name of a file containing a query to be used to search the collection

converterName

private static java.lang.String converterName
The name of the class to be used in converting raw document data to document objects

irmDir

private static java.lang.String irmDir
The name of the directory from which the Irf Manager is to be read or to which it is to be saved

indexDir

private static java.lang.String indexDir
The name of the directory from which the index files (DB*) are to be read or to which they should be written

theConverter

private static IrfConverter theConverter
The converter class to be used in converting documents from the "raw" to an indexable form

currentDocCollection

private static DocCollection currentDocCollection
The document collection currently in use

dummy

static ProxyDocument dummy
Constructor Detail

SampleApp

public SampleApp()
Method Detail

main

public static void main(java.lang.String[] argv)
The main method of the application; handles command-line options.
Parameters:
argv - argument vector

ApplicationMain

static void ApplicationMain(java.lang.String pathToSerializedIrfManagerFiles,
                            java.lang.String pathToIndex)
Controls the execution of the application through presentation of menus and processing of input. In batch (no-input) mode, user input is disabled and the application uses a preprogrammed execution sequence.
Parameters:
pathToSerializedIrfManagerFiles - The directory in which the serialized Irf Manager was stored.
pathToIndex - The directory in which the index was stored.

getCurrentDocCollection

static DocCollection getCurrentDocCollection(java.lang.String pathToIndex)
Get the current document collection.
Parameters:
pathToIndex - The directory in which the index was saved.
Returns:
A vector representing an existing or new document collection.

setIndexingModalities

static void setIndexingModalities(DocCollection docColl)
Set the indexing modalities for the current document collection.
Parameters:
docColl - The document collection for which to set the indexing modalities.

buildDummyDoc

static ProxyDocument buildDummyDoc(IrfConverter aConverter)
Build a dummy (empty) proxy Document
Parameters:
aConverter - an IrfConverter for documents of the type to be created.
Returns:
a dummy ProxyDocument

populateColl

static void populateColl(DocCollection docColl)
Create documents, add them to a collection and index them.
Parameters:
docColl - the document collection to populate

prepareDocPresent

static Document prepareDocPresent(ResultForDocMatchingQuery rle)
                           throws java.lang.ArrayIndexOutOfBoundsException
Takes as argument a ResultForDocMatchingQuery to be presented on the screen. The function duplicates the ProxyDocument that the argument refers to, keeping the original fields that do not need to be changed. DeHtmls and DeStrings, however, may need to be changed to mark query terms that appear in the document. For each of those DEs, a new DeHtml/String object is created, based on the old one, but with <em> and </em> added around each word matching the query.
Parameters:
rle - ResultForDocMatchingQuery
Returns:
Highlighted version of the Document.
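
The highlighting step described above can be illustrated on a plain string. This is a minimal sketch, not the SampleApp implementation: it assumes query terms are matched case-insensitively against whole words, and the class HighlightSketch and method highlight are hypothetical. It does not use the DeHtml, DeString, or ProxyDocument classes and does not reproduce the addHighlight signature.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not taken from the SampleApp source.
class HighlightSketch {

    // A "word" here is a run of letters, digits or underscores (an assumption).
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Return a copy of text with <em>...</em> around each word that
    // appears (lower-cased) in queryTerms. The input is left unchanged.
    static String highlight(String text, Set<String> queryTerms) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the non-word characters between the previous word and this one.
            out.append(text, last, m.start());
            String word = m.group();
            if (queryTerms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append("<em>").append(word).append("</em>");
            } else {
                out.append(word);
            }
            last = m.end();
        }
        // Copy whatever follows the final word.
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        Set<String> terms = new HashSet<>(Arrays.asList("text", "retrieval"));
        System.out.println(highlight("Text-based retrieval of documents.", terms));
        // prints: <em>Text</em>-based <em>retrieval</em> of documents.
    }
}
```

Building a new string rather than modifying the input mirrors the description above, in which a new object is created from the old one and the original is kept intact.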

addHighlight

static java.lang.String addHighlight(java.lang.String s,
                                     int wantedWord,
                                     int wantedParagraph)

STRUCLEAF

private static final java.util.Vector STRUCLEAF(java.lang.String F,
                                                int I1,
                                                double W1,
                                                int I2,
                                                double W2)
Print a specified number of spaces.
Parameters:
i - The number of spaces to print.

STRUCCOMBRET

private static final void STRUCCOMBRET(java.util.Vector C,
                                       java.lang.String F,
                                       java.lang.Object C1,
                                       double W1,
                                       int C2,
                                       double W2)

requestNormalizedWeights

static double[] requestNormalizedWeights(java.util.Vector theModalities)
Request an integer weight to assign to each indexing modality for use in calculating the contribution of the indexing modality to a document's final score/RSV.
Parameters:
theModalities - The set of indexing modalities for this collection.
Returns:
An array holding a normalized weight factor for each indexing modality. The normalized factor is calculated by dividing the integer weight assigned to a given indexing modality by the sum of the weights assigned to all indexing modalities.
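
The normalization described above (each integer weight divided by the sum of all weights) can be sketched as follows. This is an illustration under stated assumptions, not the SampleApp code: the class WeightNormalizationSketch and method normalize are hypothetical, the weights are passed in directly instead of being requested from the user, and rejecting negative weights or an all-zero sum is a choice made for this sketch.

```java
import java.util.Arrays;

// Hypothetical sketch; not taken from the SampleApp source.
class WeightNormalizationSketch {

    // Divide each integer weight by the sum of all weights, so that the
    // returned factors add up to 1.0.
    static double[] normalize(int[] weights) {
        long sum = 0;
        for (int w : weights) {
            if (w < 0) {
                throw new IllegalArgumentException("weights must be non-negative");
            }
            sum += w;
        }
        if (sum == 0) {
            throw new IllegalArgumentException("at least one weight must be positive");
        }
        double[] normalized = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            normalized[i] = (double) weights[i] / sum;
        }
        return normalized;
    }

    public static void main(String[] args) {
        // Three modalities weighted 1, 1 and 2: the sum is 4.
        System.out.println(Arrays.toString(normalize(new int[] {1, 1, 2})));
        // prints: [0.25, 0.25, 0.5]
    }
}
```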

setCombinator

static Combinator setCombinator(IndexingModalities im)
Initialize the Combinator.
Parameters:
im - The current indexing modalities.
Returns:
A Combinator object.

retrieve

static void retrieve(RetrievalModalities rm,
                     IndexingModalities im)
Retrieve documents from a collection using the query supplied by the user (invoked from main menu).

confirmYN

public static boolean confirmYN()
Prompt user for yes or no response.
Returns:
true if the response from the user is 'y', false otherwise. Small helper function: could be moved to utils.[Ch].
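
A yes/no prompt with this contract might look like the sketch below. It is not the SampleApp implementation: the class ConfirmSketch is hypothetical, the reader is passed in as a parameter (the documented method takes no arguments) so the example can run without a terminal, and the prompt text and the trimming of surrounding whitespace are assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

// Hypothetical sketch; not taken from the SampleApp source.
class ConfirmSketch {

    // Prompt for a yes/no answer and read one line from the given reader.
    // Returns true only if the trimmed line is exactly "y"; any other
    // input, including end of input, counts as "no".
    static boolean confirmYN(BufferedReader in) throws IOException {
        System.out.print("Confirm (y/n)? ");
        System.out.flush();
        String line = in.readLine();
        return line != null && line.trim().equals("y");
    }

    public static void main(String[] args) throws IOException {
        // Simulated user input; a real program would wrap System.in instead.
        System.out.println(confirmYN(new BufferedReader(new StringReader("y\n"))));
        System.out.println(confirmYN(new BufferedReader(new StringReader("n\n"))));
    }
}
```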

printMemUseStats

static void printMemUseStats(java.lang.String message)
Print out data on memory usage
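
Memory usage figures of this kind can be read from java.lang.Runtime. The sketch below shows one possible shape for such a helper. The exact statistics and output format used by SampleApp are not documented here, so the class MemStatsSketch, the method formatMemUse, and the message layout are assumptions.

```java
// Hypothetical sketch; not taken from the SampleApp source.
class MemStatsSketch {

    // Build a one-line report of the JVM heap usage, prefixed by the caller's message.
    static String formatMemUse(String message) {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of that reserved heap
        long used = total - free;
        return message + ": used=" + used + " bytes, free=" + free
                + " bytes, total=" + total + " bytes";
    }

    // Print the report to standard output.
    static void printMemUseStats(String message) {
        System.out.println(formatMemUse(message));
    }

    public static void main(String[] args) {
        printMemUseStats("before allocation");
        int[] block = new int[1_000_000]; // allocate roughly 4 MB to change the figures
        printMemUseStats("after allocating " + block.length + " ints");
    }
}
```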