MONO91
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Automatic Assignment Indexing Techniques (chapter)
Mary Elizabeth Stevens
National Bureau of Standards

Table 2 (cont.)

Investigator: Stevens and Urban

Principles and Methods: Teaching sample for machine compilation of co-occurrence data for words in titles and abstracts with descriptors assigned to these items. Words in titles and cited titles of new items then run against master list of previous word-descriptor association to derive descriptor-selection scores, highest scoring descriptors (e.g., up to 12) assigned. Associations derived for 1,600 words co-occurring with any of 70 descriptors previously assigned.

Materials Used: Two teaching samples, approximately 100 items each with 70% overlap, drawn from items indexed by ASTIA. For new items, titles and up to 10 cited titles.

Tests: For 59 test items, assignments of descriptors that had occurred for at least 3% of the sample items agreed with ASTIA assignments 58.1%. However, for all descriptors assigned by ASTIA, many not available to machine, overall machine accuracy = 40.1%. For 20 items, independently evaluated by several typical users, the chances that one or more people would agree with the machine assignments ranged from 47.1% when 12 descriptors were assigned to 75.0% average agreement with the machine's first choice.

Remarks: All test items co[...] processed and [...]ur different descriptors assigned to each, some descriptors [...] in manual indexing [...] these items are [...] available to the machine.

Investigator: Williams

Principles and Methods: Discriminant analysis. Sample items previously indexed to a 2-level classification system were subjected to word frequency counts and the theoretical frequencies of the most significant words in each category were compiled. For new items, observed word frequencies compared with theoretical frequencies for each category, highest scoring assigned.

Materials Used: Items from "Computer Abstracts on Cards" indexed to 15 major categories each divided into 10 minor categories. 300 abstracts selected to provide equal distribution to 20 sub-categories, 5 each in 4 major categories. Additional items for test similarly selected.

Tests: For 63 new items assigned by machine to 1 major and 1 minor category, 78% correct at major level, 64% correct at minor level. For 20 items classified to 1 major and 2 minor categories, 95% correct at major level, 60% and 75% correct at the minor level.
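
The Stevens and Urban entry above describes compiling word-descriptor co-occurrence data from a teaching sample and then scoring the words of a new item against that master list. The following is a minimal sketch of that general idea, not the investigators' actual procedure. The table does not state how descriptor-selection scores were computed, so summing raw co-occurrence counts is an assumption made here for illustration, and the function names build_associations and assign_descriptors are hypothetical.

```python
from collections import Counter, defaultdict


def build_associations(teaching_items):
    """Compile word-descriptor co-occurrence counts from a teaching sample.

    teaching_items: iterable of (words, descriptors) pairs, where words come
    from an item's title/abstract and descriptors were assigned manually.
    """
    assoc = defaultdict(Counter)
    for words, descriptors in teaching_items:
        for word in set(words):  # count each word once per item
            for descriptor in descriptors:
                assoc[word][descriptor] += 1
    return assoc


def assign_descriptors(assoc, new_words, max_descriptors=12):
    """Score descriptors for a new item and return the highest scoring ones.

    ASSUMPTION: the score is the plain sum of co-occurrence counts over the
    new item's words; the source does not give the actual scoring formula.
    """
    scores = Counter()
    for word in new_words:
        scores.update(assoc.get(word, {}))  # unseen words contribute nothing
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [descriptor for descriptor, _ in ranked[:max_descriptors]]


# Toy teaching sample (invented purely for illustration).
teaching = [
    (["radar", "antenna"], ["RADAR", "ANTENNAS"]),
    (["radar", "signal"], ["RADAR"]),
]
assoc = build_associations(teaching)
top = assign_descriptors(assoc, ["radar", "noise"], max_descriptors=2)
```

The max_descriptors cutoff mirrors the "up to 12" limit mentioned in the table. A descriptor that never appears in the teaching sample can never be assigned under this scheme, which matches the table's observation that many manually assigned descriptors were not available to the machine.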