SP500215
NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
Appendix B: System Features Appendix
National Institute of Standards and Technology
D. K. Harman

IC. CONSTRUCTION OF INDICES, KNOWLEDGE BASES, AND OTHER DATA STRUCTURES -- DATA BUILT FROM OTHER SOURCES:

Around 1000 semantic categories are used. The original 1911 Roget major categories are obtained by removing the suffix from our semantic codes. For example, the semantic category 121nv.3 is shortened by ignoring nv.3.

Since the 1911 edition of Roget's Thesaurus recently entered the public domain, we spent approximately 16 hours creating the software to process the Thesaurus. Approximately 6 hours of processing time were required to automatically extract 20,000 lexicon entries.
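
The suffix removal described above for the semantic codes can be sketched as follows. This is only an illustration, not the system's actual software: it assumes each semantic code is a run of leading digits (the Roget major category number) followed by a suffix such as nv.3. The function name major_category and the sample codes are hypothetical.

```python
import re

# Assumption: a semantic code starts with the digits of the Roget
# major category, followed by an arbitrary suffix (e.g. "nv.3").
_LEADING_DIGITS = re.compile(r"\d+")


def major_category(semantic_code: str) -> str:
    """Return the leading numeric part of a semantic code.

    Hypothetical helper: the exact code format used by the system
    is not specified in the text, so this is a best-guess sketch.
    """
    match = _LEADING_DIGITS.match(semantic_code.strip())
    if match is None:
        raise ValueError(f"no leading category number in {semantic_code!r}")
    return match.group(0)


# Hypothetical sample codes; several codes can share one major category.
sample_codes = ["121nv.3", "121nv.1", "45a.2"]
major_categories = sorted({major_category(code) for code in sample_codes})
```

Collapsing the codes into a set, as in the last line, shows how the finer-grained semantic codes map back onto a smaller number of major categories.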