|
NIST Monograph 91: Automatic Indexing: A State-of-the-Art Report
Table of Contents |
||||
| Page | |||||
| Abstract | 1 | ||||
| 1. Introduction | 1 | ||||
| 1.1 Definitions and background | 2 | ||||
| 1.2 Scope of this study | 10 | ||||
| 1.3 Derivative vs. assignment indexing | 13 | ||||
| 2. Indexes compiled by machine | 14 | ||||
| 2.1 Concordances and complete text processing | 15 | ||||
| 2.2 Card catalogs, book catalogs, bibliographies and subject index listings prepared by machine | 19 | ||||
| 2.3 Tabledex and other special purpose indexes | 25 | ||||
| 2.4 Citation indexes | 27 | ||||
| 2.5 Machine conversion from one index set to another | 38 | ||||
| 3. Indexes generated by machine - automatic derivative indexing | 40 | ||||
| 3.1 KWIC indexes | 40 | ||||
| 3.1.1 Applications of KWIC indexing techniques | 41 | ||||
| 3.1.2 Advantages, disadvantages and operational problems of KWIC indexing | 55 | ||||
| 3.2 Modified derivative indexing | 68 | ||||
| 3.2.1 Title augmentation | 68 | ||||
| 3.2.2 Book indexing by computer | 71 | ||||
| 3.2.3 Modified derivative indexing - Baxendale's experiments | 73 | ||||
| 3.3 Derivative indexing from automatic abstracting techniques | 75 | ||||
| 3.3.1 Auto-condensation and auto-encoding techniques of H. P. Luhn | 75 | ||||
| 3.3.2 Frequencies of word n-tuples - Oswald and others | 79 | ||||
| 3.3.3 Relative frequency techniques - Edmundson and Wyllys, and others | 81 | ||||
| 3.3.4 Significant word distances | 83 | ||||
| 3.3.5 Uses of special clues for selection | 84 | ||||
| 3.3.6 Recent examples of mixed systems experimentation | 86 | ||||
| 3.4 Quality of modified derivative indexing by machine | 89 | ||||
| 4. Automatic assignment indexing techniques | 91 | ||||
| 4.1 Swanson and later work at Thompson Ramo Wooldridge | 91 | ||||
| 4.2 Maron's automatic indexing experiments | 93 | ||||
| 4.3 Automatic indexing investigations of Borko and Bernick | 94 | ||||
| 4.4 Williams' discriminant analysis method | 97 | ||||
| 4.5 SADSACT | 98 | ||||
| 4.6 Assignment indexing from citation data | 99 | ||||
| 4.7 Similarities and distinctions among assignment indexing experiments | 100 | ||||
| 4.8 Other assignment indexing proposals | 105 | ||||
| 5. Automatic classification and categorization | 106 | ||||
| 5.1 Factor analysis | 108 | ||||
| 5.2 The theory of clumps | 110 | ||||
| 5.3 Latent class analysis | 113 | ||||
| 5.4 Examples of other proposed classificatory techniques | 113 | ||||
| 6. Other potentially related research | 114 | ||||
| 6.1 Thesaurus construction, use and up-dating | 114 | ||||
| 6.2 Statistical association techniques | 118 | ||||
| 6.2.1 Devices to display associations: EDIAC | 119 | ||||
| 6.2.2 Statistical association factors - Stiles | 119 | ||||
| 6.2.3 The association map - Doyle and related work at SDC | 122 | ||||
| 6.2.4 Work of Giuliano and associates, the ACORN devices | 124 | ||||
| 6.2.5 Spiegel and others at Mitre Corporation | 126 | ||||
| 6.3 Clues to index-term selection from automatic syntactic analysis | 127 | ||||
| 6.4 Probabilistic indexing and natural language text searching | 132 | ||||
| 6.4.1 Probabilistic indexing - Maron, Kuhns and Ray | 133 | ||||
| 6.4.2 Natural language text searching - Swanson | 134 | ||||
| 6.4.3 Full text searching - legal literature | 135 | ||||
| 6.5 Other examples of related research in linguistic data processing | 136 | ||||
| 6.6 Machine assistance in translations of subject content indications to special search and retrieval language | 140 | ||||
| 6.7 Example of a proposed indexing system utilizing related research techniques | 142 | ||||
| 7. Problems of evaluation | 143 | ||||
| 7.1 Core problems | 145 | ||||
| 7.2 Bases and criteria for evaluation of automatic indexing procedures | 149 | ||||
| 7.2.1 The Cranfield project | 150 | ||||
| 7.2.2 O'Connor investigations | 151 | ||||
| 7.2.3 Questions of comparative costs | 153 | ||||
| 7.2.4 Summary: potential advantages as bases for evaluation | 156 | ||||
| 7.3 Findings with respect to inter-indexer and intra-indexer consistency | 157 | ||||
| 7.4 Special factors and other suggested bases for evaluation | 160 | ||||
| 8. Operational considerations | 164 | ||||
| 8.1 Questions of input | 164 | ||||
| 8.2 Examples of processing considerations | 168 | ||||
| 8.3 Output considerations | 171 | ||||
| 9. Conclusion: Appraisal of the state of the art in automatic indexing | 173 | ||||
| Acknowledgements | 182 | ||||
| Appendix A: List of references cited and selected bibliography | 183 | ||||
| Appendix B: Progress and prospects in mechanized indexing | 223 | ||||
| Appendix C: Selective bibliography of additional references | 237 | ||||
|
Date created: Monday, 18-Sept-00 |