MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Other Potentially Related Research
chapter
Mary Elizabeth Stevens
National Bureau of Standards
In terms of our present concern, however, we shall select only a few examples.
"By automatic content analysis is meant the use of computer programs to detect or select
content themes in a sentence-by-sentence scanning of text or verbal protocols". 1/ The
interest of psychologists in machine techniques to assist in the analysis of linguistically-
given materials, as in propaganda analysis, probably precedes at least in sophistication
if not by date, that of documentalists or of machine specialists interested in library and
information problems.
The "General Inquirer" program developed by Stone et al 3/ is an example of
question-answering techniques based upon selective extractions from natural language
text. It involves the use of a master vocabulary consisting of words previously selected
by an investigator as being likely to be content-indicative in a body of material to be
processed, together with his pre-established indications of the categories he expects
their occurrence should predict. It is to be noted that this is a custom-tailored set of
categories and of clue-word lists associated with each, manually pre-established. Text
is now processed in such a way that each word is looked up and, if it appears in the master
vocabulary, it is tagged with identifiers of the categories for which it is presumably
predictive. A subsequent "Tag Tally" routine then counts the tag frequencies to deter-
mine for which categories the input material has high or low scores, and these in turn
can be compared with expected norms.
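
The look-up, tagging, and "Tag Tally" steps described above can be illustrated with a minimal sketch in Python. It is not the General Inquirer itself. The vocabulary, the category names, the norm values, and the function names (tag_words, tag_tally, compare_with_norms) are all hypothetical, and the use of a rate per 100 words as the basis for comparison with norms is an assumption made only for this illustration.

```python
from collections import Counter
import re

# Hypothetical master vocabulary: each clue word maps to the categories
# an investigator expects its occurrence to predict.
MASTER_VOCABULARY = {
    "mother": ["FAMILY"],
    "father": ["FAMILY"],
    "afraid": ["DISTRESS"],
    "alone": ["DISTRESS"],
    "work": ["JOB"],
    "boss": ["JOB", "AUTHORITY"],
}

# Hypothetical expected norms, expressed as tags per 100 words of text.
EXPECTED_NORMS = {"FAMILY": 5.0, "DISTRESS": 5.0, "JOB": 5.0, "AUTHORITY": 5.0}


def tag_words(text):
    """Look up every word; return all words plus (word, tags) pairs for hits."""
    words = re.findall(r"[a-z']+", text.lower())
    tagged = [(w, MASTER_VOCABULARY[w]) for w in words if w in MASTER_VOCABULARY]
    return words, tagged


def tag_tally(tagged):
    """Count how often each category tag was assigned."""
    counts = Counter()
    for _, tags in tagged:
        counts.update(tags)
    return counts


def compare_with_norms(counts, n_words, norms):
    """Mark each category 'high' or 'low' relative to its assumed norm."""
    result = {}
    for category, norm in norms.items():
        rate = 100.0 * counts.get(category, 0) / n_words if n_words else 0.0
        result[category] = "high" if rate > norm else "low"
    return result


sample = "My boss is never pleased with my work. I am afraid and alone."
words, tagged = tag_words(sample)
counts = tag_tally(tagged)
scores = compare_with_norms(counts, len(words), EXPECTED_NORMS)
print(counts)
print(scores)
```

In this sketch a word such as "boss" carries two tags, so a single occurrence adds to the tally of more than one category, in keeping with the description of words tagged for each category they are presumed to predict.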
This type of program has been applied to such varied materials as suicide notes,
folk tales from different cultures, reports of field workers, recordings of group dis-
cussions as in supervisory-leadership training sessions, and protocols for various
psychological tests. 4/ Interesting variations developed by Jaffe and others 5/ involve
the use of non-verbal as well as verbal clues as content-indicators, specifically, time-
sequence patterns recorded along with the words spoken in client-therapist sessions. At
the meeting of the Association for Machine Translation and Computational Linguistics
held in Denver, August, 1963, Jaffe reported findings indicative of positive correlation
between the structure of temporal and lexical patterns in dialogue and suggested applica-
tions to automatic abstracting or indexing by the use of the time-sequence patterns as
clues to high information-value areas.
1/ Ford, Jr., 1963 [498], p. 3.
2/ See, for example, Jaffe 1952 [297], Hart and Bach, 1959 [256], Pool, 1959 [475],
the latter covering the proceedings of a conference held in 1955.
3/ Stone and Hunt, 1963 [576]; Stone et al, 1962 [575].
4/ See Ford, 1963 [498], p. 8.
5/ See, for example, Cassotta et al, 1964 [illegible]; Jaffe [illegible].