Scientific Report No. ISR-11
Information Storage and Retrieval
Operating Instructions for the SMART Text Processing and Document Retrieval System
M. E. Lesk
Harvard University
Gerard Salton
Use, reproduction, or publication, in whole or in part, is permitted for any purpose of the United States Government.

... experience on a batch-processing 7094 I. Since SMART is an experimental system, no great effort was spent on optimizing the object code for speed, and considerable improvements could be made in many of the programs.

Starting: Mounting two tapes, signing on, reading the specifications, etc., requires about two minutes, most of which is tape-mounting time.

Lookup: Looking up p words in a dictionary of q stems takes roughly $pq \times 10^{-7}$ minutes. Statistical phrase searching of p words in a list of q phrases takes about $pq \times 10^{-5}$ minutes. Syntactic timing is exceedingly irregular with the old syntactic programs, which are so slow that nothing useful can be accomplished in a reasonable amount of time. It is hoped that some syntactic analysis runs can be performed with a newly revised analyzer to be distributed shortly.

Correlations: Correlating p requests against q documents, then sorting and evaluating the correlations, takes on the order of $pq \times 10^{-3}$ minutes. Present experimental data exist only for p between 10 and 50 and q between 50 and [illegible]00; these estimates should not be trusted far outside this range.

Concept-concept correlations: If p concepts are involved (i.e. p = CONMAX - 1 2 - [illegible] CONMIN), the first correlation takes about [illegible]p 10 minutes. Further iterations should be fast, assuming reasonable cutoffs.

Hierarchy: Most of the time is spent in tape shuffling, which requires about five minutes for collections of about 50,000 words.

9. Acknowledgments

The programs described here were written by Mark Cane, Tom Evslin, Guy Hochgesang, Alan Lemmon, Michael Razar, George Shapiro, and the author.
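
As a rough illustration of the running-time rules of thumb given in the timing discussion above, the following Python sketch turns them into small helper functions. It assumes the garbled exponents in the source are negative powers of ten (10^-7 for dictionary lookup, 10^-5 for statistical phrase searching, 10^-3 for correlations). All function names are hypothetical and are not part of SMART. The concept-concept estimate is left out because its formula is not legible in the source, and only the p range is checked because the upper bound on q is likewise illegible.

```python
# Rough running-time estimates, in minutes, based on the rules of thumb
# quoted in the text. The exponents are an assumed reading of a garbled
# source; the function names are hypothetical, not SMART routines.

STARTUP_MINUTES = 2.0  # tape mounting, sign-on, reading the specifications


def lookup_minutes(p_words: int, q_stems: int) -> float:
    """Dictionary lookup of p words against q stems: about p*q*10^-7."""
    return p_words * q_stems * 1e-7


def phrase_search_minutes(p_words: int, q_phrases: int) -> float:
    """Statistical phrase search of p words in q phrases: about p*q*10^-5."""
    return p_words * q_phrases * 1e-5


def correlation_minutes(p_requests: int, q_documents: int) -> float:
    """Correlate, sort, and evaluate p requests against q documents:
    on the order of p*q*10^-3."""
    return p_requests * q_documents * 1e-3


def p_in_measured_range(p_requests: int) -> bool:
    """True if p lies in the range (10 to 50) covered by the reported data."""
    return 10 <= p_requests <= 50


if __name__ == "__main__":
    # Example run: 20,000 text words, a 5,000-stem dictionary,
    # and 20 requests correlated against 200 documents.
    total = (
        STARTUP_MINUTES
        + lookup_minutes(20_000, 5_000)
        + correlation_minutes(20, 200)
    )
    print(f"estimated total: {total:.1f} minutes")
    if not p_in_measured_range(20):
        print("warning: p is outside the range of the experimental data")
```

Because the correlation estimate rests on data for a limited range of p and q, figures computed far outside that range should be treated as guesses.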