ISR11
Scientific Report No. ISR-11 Information Storage and Retrieval
Operating Instructions for the SMART Text Processing and Document Retrieval System
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.
3.5. Vector Formation
The final vector is formed by combining all the concepts introduced
from all sources with appropriate weights. The weighting parameters are
numbers which are zero or greater (and may be greater than 1), indicating
the weight associated with each unit occurrence of the given concept in
the specified class. The weighting parameters are:
STEMWT x    x is the weight associated with word stem concepts. It
            should be a FORTRAN floating point number (e.g. 1.0, not 1);
STATWT x    x is the weight for statistical phrases. STATPR must have
            been specified when lookup was done; otherwise this
            command is ignored. If STATPR was specified, phrases may
            be ignored by giving STATWT 0.0;
SYNWT x     x is the weight associated with syntactic phrases;
COCOWT x    x is the weight associated with concept-concept expansion
            concepts.
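The weighting scheme described above can be illustrated with a short Python sketch. It is not taken from the SMART program itself. It assumes that each class of concepts (word stems, statistical phrases, syntactic phrases, concept-concept expansion) supplies a count of occurrences per concept, and that contributions from different classes to the same concept are summed. The function name, the class labels, and the concept numbers are all hypothetical.

```python
from collections import defaultdict


def form_vector(occurrences, weights):
    """Combine concept occurrences from several classes into one vector.

    occurrences: {class_label: {concept_number: occurrence_count}}
    weights:     {class_label: weight per unit occurrence (float >= 0)}
    Summing contributions across classes is an assumption of this sketch.
    """
    vector = defaultdict(float)
    for class_label, counts in occurrences.items():
        weight = weights.get(class_label, 0.0)
        if weight == 0.0:
            # A zero weight means the class contributes nothing.
            continue
        for concept, count in counts.items():
            vector[concept] += weight * count
    return dict(vector)


# Hypothetical weights and concept numbers, for illustration only.
weights = {
    "stem": 1.0,
    "statistical_phrase": 2.0,
    "syntactic_phrase": 0.0,
    "concept_concept": 0.5,
}
occurrences = {
    "stem": {101: 2, 205: 1},
    "statistical_phrase": {900: 1},
    "syntactic_phrase": {950: 3},
    "concept_concept": {205: 2},
}
vector = form_vector(occurrences, weights)
```

In this example the syntactic phrase concept 950 is dropped because its class weight is 0.0, while concept 205 receives weight from both the stem class and the concept-concept class.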
The hierarchical weighting specifications are used to select the
method of hierarchical expansion. Any number of expansions with arbitrary
weights may be performed simultaneously. There are four weighting para-
meters, one for each possible mode of expansion. If the weight for a
possible expansion mode is zero, that expansion is not performed. If the
weight for a possible expansion is not zero, the expansion is performed
and the concepts derived from it are given the indicated weight.
ROOTWT x specifies the weight for expansion by parents;
BRANWT x specifies the weight for expansion by brothers;
LEAFWT x specifies the weight associated with expansion by sons;
CROSWT x specifies the weight associated with expansion by cross-
references.
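
The four expansion modes can be sketched in Python as follows. This is an illustration under stated assumptions, not the SMART implementation: the hierarchy is represented as a child-to-parent table plus a cross-reference table, each derived concept receives the indicated mode weight, and weights are summed when a concept is derived more than once. The function name, the mode labels, and the concept numbers are hypothetical.

```python
def expand(vector, parent, cross_refs, mode_weights):
    """Expand a concept vector through a hierarchy.

    vector:       {concept: weight} before expansion
    parent:       {child_concept: parent_concept}
    cross_refs:   {concept: [cross-referenced concepts]}
    mode_weights: weights for "parents", "brothers", "sons",
                  "cross_references"; a zero weight skips that mode.
    Adding weights for repeated derivations is an assumption of this sketch.
    """
    # Invert the child-to-parent table to find sons and brothers.
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)

    def related(concept, mode):
        if mode == "parents":
            return [parent[concept]] if concept in parent else []
        if mode == "brothers":
            siblings = children.get(parent.get(concept), [])
            return [c for c in siblings if c != concept]
        if mode == "sons":
            return children.get(concept, [])
        if mode == "cross_references":
            return cross_refs.get(concept, [])
        raise ValueError("unknown expansion mode: " + mode)

    expanded = dict(vector)
    for mode, weight in mode_weights.items():
        if weight == 0.0:
            continue  # expansion not performed for this mode
        for concept in vector:
            for other in related(concept, mode):
                expanded[other] = expanded.get(other, 0.0) + weight
    return expanded


# Hypothetical hierarchy: concept 1 has sons 2 and 3; concept 2 has son 4.
parent = {2: 1, 3: 1, 4: 2}
cross_refs = {2: [7]}
mode_weights = {
    "parents": 0.5,
    "brothers": 0.25,
    "sons": 0.0,
    "cross_references": 1.0,
}
expanded = expand({2: 1.0}, parent, cross_refs, mode_weights)
```

Here three expansions are performed at once. The weight for sons is zero, so concept 4 is not added, while the parent (1), the brother (3), and the cross-reference (7) enter the vector with their respective weights.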