The NIST PRISE indexer creates an index from text files marked up in SGML format, for use by the PRISE search engine. PRISE indexing consists of 8 individual programs, which together produce a set of data to be used with the Z39.50 interface. Indexing is a two-step process that does not need an explicit sorting step. The first step (rel.build.tmm) produces the basic inverted file, and the second step (rebuild.tmm) adds the term weights to the inverted file and reorganizes it for maximum efficiency. The creation of the basic inverted file avoids an explicit sort by using a right-threaded binary tree, whose in-order traversal yields the terms already in sorted order. Below is a description of the 8 programs used in creating a PRISE index.
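To illustrate why a right-threaded binary tree removes the need for a separate sort, here is a minimal sketch in Python. It is not taken from the PRISE source: the names `Node`, `ThreadedTree`, and `build_postings`, the whitespace tokenization, and the in-memory postings lists are all assumptions made for illustration. Each node's empty right link is reused as a "thread" to its in-order successor, so the terms can be read out in sorted order with a simple loop and no stack or recursion.

```python
class Node:
    """One term in the tree, with the ids of the documents containing it."""
    __slots__ = ("term", "postings", "left", "right", "is_thread")

    def __init__(self, term, doc_id):
        self.term = term
        self.postings = [doc_id]
        self.left = None
        self.right = None        # real right child, or thread to the successor
        self.is_thread = False   # True when `right` is a thread, not a child


class ThreadedTree:
    """Right-threaded binary search tree (illustrative, not the PRISE code)."""

    def __init__(self):
        self.root = None

    def insert(self, term, doc_id):
        if self.root is None:
            self.root = Node(term, doc_id)
            return
        cur = self.root
        while True:
            if term == cur.term:
                # Assumes documents arrive in increasing doc_id order.
                if cur.postings[-1] != doc_id:
                    cur.postings.append(doc_id)
                return
            if term < cur.term:
                if cur.left is None:
                    new = Node(term, doc_id)
                    new.right, new.is_thread = cur, True   # successor is parent
                    cur.left = new
                    return
                cur = cur.left
            else:
                if cur.right is None or cur.is_thread:
                    new = Node(term, doc_id)
                    # The new node inherits the parent's thread (if any).
                    new.right, new.is_thread = cur.right, cur.is_thread
                    cur.right, cur.is_thread = new, False
                    return
                cur = cur.right

    def in_order(self):
        """Yield (term, postings) in sorted term order by following threads."""
        cur = self.root
        while cur is not None and cur.left is not None:
            cur = cur.left
        while cur is not None:
            yield cur.term, cur.postings
            if cur.is_thread:
                cur = cur.right              # jump straight to the successor
            else:
                cur = cur.right              # descend, then go leftmost
                while cur is not None and cur.left is not None:
                    cur = cur.left


def build_postings(docs):
    """Build a sorted (term, postings) list from {doc_id: text}."""
    tree = ThreadedTree()
    for doc_id, text in sorted(docs.items()):
        for term in text.split():
            tree.insert(term, doc_id)
    return list(tree.in_order())
```

For example, `build_postings({1: "b a c", 2: "a d"})` walks the threads and returns the terms in alphabetical order with their document lists, without ever calling a sort on the terms themselves.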
-i | initialize parser for new collection |
-w work_dir | directory area where index will be created |
-d data_dir | directory area where input text resides (default: work_dir) |
-p pattern | file wildcard pattern for input text files |