NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Okapi at TREC
S. Robertson, S. Walker, M. Hancock-Beaulieu, A. Gull, M. Lau
National Institute of Standards and Technology
Donna K. Harman

Bibliographic record access is also fast because there is no indirection: the postings records directly address the bibliographic records, so again there is only one disk access per record.

File inversion is relatively slow and CPU-bound because of the multi-pass linguistic processing during index term extraction. As a rough guide, inversion runs at about one minute per megabyte of indexable text on a lightly loaded Sun 4/330.

Limits

Maximum bibliographic file size: 32 gigabytes but maximum index size 4 gigabytes
Number of records per database: no practical limit
Postings per index term: no practical limit
Maximum amount of data which can be treated as a "record" for retrieval purposes: this is a system parameter usually set to 16 kilobytes. Up to 64 K or more is acceptable.
Maximum field length: same as record size
Maximum number of fields per record: 31
Maximum index term length: 127 characters
Maximum number of terms in single query: 32 (interactive Okapi only)
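
The limits and the rough inversion rate quoted above can be collected into a small sketch for checking a record or query before loading it. This is an illustration only, not Okapi's own code: the names `OkapiLimits`, `check_record`, `check_query` and `estimated_inversion_minutes` are hypothetical, binary units (1 K = 1024 bytes) are assumed because the text does not say, and record size is assumed to be the sum of the field lengths in bytes.

```python
from dataclasses import dataclass

KB = 1024
MB = 1024 ** 2
GB = 1024 ** 3  # assumption: binary units; the text does not specify


@dataclass(frozen=True)
class OkapiLimits:
    """Hypothetical container for the limits listed in the text."""
    max_biblio_file_bytes: int = 32 * GB
    max_index_bytes: int = 4 * GB
    max_record_bytes: int = 16 * KB   # system parameter, usual setting
    max_fields_per_record: int = 31
    max_index_term_chars: int = 127
    max_query_terms: int = 32         # interactive Okapi only


def check_record(fields, limits=OkapiLimits()):
    """Return a list of limit violations for one record (a list of field strings)."""
    problems = []
    if len(fields) > limits.max_fields_per_record:
        problems.append("too many fields")
    # Maximum field length equals the record size, so checking the total
    # also catches any single oversized field.
    total_bytes = sum(len(f.encode("utf-8")) for f in fields)
    if total_bytes > limits.max_record_bytes:
        problems.append("record too large")
    return problems


def check_query(terms, limits=OkapiLimits()):
    """Return a list of limit violations for one interactive query."""
    problems = []
    if len(terms) > limits.max_query_terms:
        problems.append("too many query terms")
    if any(len(t) > limits.max_index_term_chars for t in terms):
        problems.append("term too long")
    return problems


def estimated_inversion_minutes(indexable_bytes, minutes_per_mb=1.0):
    """Rough inversion time, using the quoted rate of about one minute per megabyte."""
    return indexable_bytes / MB * minutes_per_mb
```

For example, `estimated_inversion_minutes(500 * MB)` gives 500 minutes at the quoted rate, so a collection of that size would take roughly eight hours to invert on comparable hardware. The rate is the authors' rough guide for a lightly loaded Sun 4/330 and should not be read as a measured constant.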