NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)

Chapter: Report of Progress for TREC-II
W. Kelleher
National Institute of Standards and Technology
D. K. Harman

[...] is the probability that a document is relevant if it contains a particular element. Each element is treated as being independent of every other element.

3.3 Results Window

The results window contains the list of all documents with a probability of relevance to the current query/concept. Documents are ranked by probability of relevance. For each document, the elements found in the document are listed, as well as any previous relevance information. The document D selected from the results window can then be examined in the document window.

4.0 Basic Assumptions

The approach being taken by FORMS is based on a number of critical assumptions:

* The axioms of probability can be used as a foundation for determining the relevance of documents to a query.
* Elements besides words, such as phrases, can be found to make a significant contribution to retrieval accuracy.
* A number of different approaches to identifying relevant elements are worth pursuing. These include proper noun identification, part-of-speech tagging, noun phrase tagging, and as yet undetermined relations that can be extracted from natural language.

5.0 Progress to Date

Frankly, there has been little progress to date beyond basic system development. In both TREC I and II, we have performed routing queries with very poor results. One obvious reason is that we have had to perform the analysis by breaking the texts into small sections, doing even class B in sections.

6.0 Future Work

FORMS is designed to provide information on the effectiveness of different approaches.

* To examine different approaches to incorporating relevance feedback into evaluating relevance of other documents.
* To examine whether queries can be analyzed and classified so that the system can determine which approach is most likely to be successful for a particular query or concept.
* To examine whether certain kinds of elements (where an element is a word, a phrase, a proper noun, a verb, a co-occurrence, etc.) can be predetermined to be helpful in certain types of queries.

7.0 Results & Conclusion

At this time no results are supported by work performed with FORMS. However, it is worth emphasizing that the results in TREC seem to support the view that there is no specific approach that is going to revolutionize information retrieval. Rather, it seems that improvements are going to come from attention to details and finding the right element to use at the right time.
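
As an illustration of the ranking described for the results window, the following is a minimal sketch of how per-element probabilities might be combined under the independence assumption. The text does not say how FORMS combines them, so the noisy-OR rule used here is an assumption, and the function names, element probabilities, and documents are all hypothetical.

```python
from math import prod


def relevance_probability(doc_elements, element_probs):
    """Return (probability, matched elements) for one document.

    ASSUMPTION: elements are combined with a noisy-OR rule,
    P(relevant) = 1 - product(1 - p_e), which treats each element as
    independent evidence. The source does not specify the rule FORMS uses.
    """
    found = [e for e in doc_elements if e in element_probs]
    p = 1.0 - prod(1.0 - element_probs[e] for e in found)
    return p, found


def rank_documents(docs, element_probs):
    """List documents with non-zero probability, highest probability first."""
    rows = []
    for doc_id, elements in docs.items():
        p, found = relevance_probability(elements, element_probs)
        if p > 0:
            # Keep the matched elements so they can be shown next to the document.
            rows.append((doc_id, p, sorted(found)))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical per-element probabilities: P(relevant | document contains element).
element_probs = {"retrieval": 0.5, "relevance feedback": 0.6, "NIST": 0.2}

# Hypothetical documents, each reduced to the set of elements found in it.
docs = {
    "d1": {"retrieval", "NIST"},
    "d2": {"retrieval", "relevance feedback"},
    "d3": {"weather"},
}

ranking = rank_documents(docs, element_probs)
for doc_id, p, found in ranking:
    print(f"{doc_id}  {p:.2f}  {found}")
```

In this sketch, a document that contains none of the query elements receives a probability of zero and is left out of the list, which mirrors the statement that the results window holds documents with a probability of relevance to the current query.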