SP500207 NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1) Probabilistic Retrieval in the TIPSTER Collections: An Application of Staged Logistic Regression chapter W. Cooper F. Grey A. Chen National Institute of Standards and Technology Donna K. Harman

Since the SLR methodology is hospitable to the introduction of additional clue-types, and indeed might be expected to wring a maximum amount of leverage out of them, the prospect for future, less primitive, SLR systems that combine many types of evidence seems promising.

Future Possibilities

Though the prototype system described here used only a few simple statistical clues, the SLR approach is general and in principle flexible enough to accommodate most of the clue-types that researchers have been interested in as predictors of relevance. Broadly speaking, retrieval evidence having to do with particular index terms lends itself to exploitation in the form of variables in the first-stage regression equation, while other kinds -- properties of the entire query, entire document, or their relationship -- can be accommodated as variables in the second stage.

As an example of possible new evidence at the first stage, suppose by virtue of parsing, suffix analysis, or dictionary lookup some information is available about the parts of speech of the match stems in the query and document. Then an additional categorical variable might be introduced into the first-level regression analysis to represent the match stem's part of speech in the document, on the hunch that some parts of speech (e.g. nouns) should be more heavily weighted than others. The general two-level form of the analysis would remain the same. A further possibility would be to introduce a variable to represent the event that the part of speech of the stem as it occurs in the query is the same as its part of speech in the context in which it occurs in the document.

Further clues could be introduced at the second stage.
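The first-stage part-of-speech idea described in this section can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the prototype system: the three-tag set in `POS_TAGS`, the single frequency clue, the invented training triples in `TOY`, and the helper names `encode`, `fit_logistic`, and `log_odds`. The fitting routine is plain gradient ascent, standing in for whatever regression package would actually be used.

```python
import math

# Hypothetical tag set; the last tag serves as the baseline category.
POS_TAGS = ["noun", "verb", "other"]


def encode(freq_clue, pos):
    """Build one first-stage feature row: intercept, one frequency clue,
    and an indicator variable for each non-baseline part of speech."""
    dummies = [1.0 if pos == tag else 0.0 for tag in POS_TAGS[:-1]]
    return [1.0, freq_clue] + dummies


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(coefs, row):
    """Linear predictor, i.e. the estimated log odds of relevance."""
    return sum(c * v for c, v in zip(coefs, row))


def fit_logistic(rows, labels, steps=2000, rate=0.1):
    """Fit logistic-regression coefficients by gradient ascent
    on the mean log-likelihood, starting from all zeros."""
    coefs = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(coefs)
        for x, y in zip(rows, labels):
            err = y - sigmoid(log_odds(coefs, x))
            for j, v in enumerate(x):
                grad[j] += err * v
        coefs = [c + rate * g / len(rows) for c, g in zip(coefs, grad)]
    return coefs


# Invented (frequency clue, part of speech, relevant?) triples.
TOY = [
    (0.2, "noun", 1), (0.8, "noun", 1), (0.5, "noun", 1), (0.3, "noun", 0),
    (0.2, "verb", 0), (0.8, "verb", 1), (0.5, "verb", 0), (0.3, "verb", 0),
    (0.2, "other", 0), (0.8, "other", 0), (0.5, "other", 0), (0.9, "other", 1),
]

rows = [encode(f, p) for f, p, _ in TOY]
labels = [y for _, _, y in TOY]
coefs = fit_logistic(rows, labels)  # [intercept, frequency, noun, verb]
```

In this toy data the noun stems are relevant more often than the others, so the fitted noun indicator receives the larger coefficient; with real training judgments the regression itself would decide whether the hunch about nouns holds. The rest of the two-level analysis is untouched by the extra variable.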
In the present experiment the only retrieval evidence introduced at the second stage that was not already present in the first was the document length L, which was intended more as an antidote to a bias in Z than as an independent predictor of relevance in its own right. But nothing prevents any helpful relationship between the query and document from being brought to bear. As an example, suppose a measure of the mutual closeness of the query's match stems in the document is to be introduced on the hypothesis that the closer together the query stems tend to occur in the document, the likelier it is (other things being equal) that the document is relevant (Keen 1992). Such a measure of proximity could be added as a new variable in the second-level equation, with no other change being needed in the underlying statistical framework.

Conclusions

1. The TREC results indicate that the SLR methodology is capable of achieving a respectable degree of retrieval effectiveness even when the retrieval evidence is confined to a few simple frequency clues. ('Respectable' in this context means competitive with the median performance of other systems, most of which use more elaborate evidence.) Since nothing prevents the incorporation of additional clue types into future SLR systems, and the regression procedure should help to combine them with existing clues in an optimal way, the outlook for the retrieval effectiveness of the SLR approach seems promising.

2. The prototype SLR system demonstrates that a probabilistic initial ranking can be achieved with a run-time efficiency approximately equivalent to that of a vector