SP500215 NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2) Report of Progress for TREC-II chapter W. Kelleher National Institute of Standards and Technology D. K. Harman

[Figure: system diagram. Labels legible in the source: User Interface, SGML Documents, Document Index, Document Profiles, Document Display, Process Query, Query Results, Query Profile, Element Index, Concept Library.]

2.2 Document Profiles

Elements are the things found in text documents that capture the meaning embedded in the documents. The basic elements of text are, obviously, its words, but other elements can be very helpful. Identifying certain kinds of proper nouns can be crucial to determining the relevance of text to an information need. Important proper nouns might be persons, places, and some kinds of things, e.g. companies.

There are two distinct types of elements used in FORMS, simple and complex. A simple element consists of a character string, e.g. a word, and an element type, e.g. a noun or a noun phrase. Complex elements are various combinations of simple elements. For example, a boolean `and' of several simple elements occurring in the same sentence, paragraph or document would be a complex element.

In general, simple elements are only used to create document profile elements. A document profile element IS-A simple element that includes the document ID and the number of occurrences of the element in the document.

2.3 Indexes

After the document profiles are created, those elements that occur in more than one document are stored in the element index. A second index, from the document ID to the full text of the document, is also created so that the user can retrieve a document for inspection.

3.0 User Interface

The user interface objects (grey area in the diagram) provide objects to display documents and to retrieve documents using a natural language query as well as stored concepts, and a window that displays results.

3.1 Document Display

The document display window permits a user to enter a document ID; the system then retrieves the document from one of the text files and displays it on the screen. When a document is displayed, the user has the option to indicate that the document is or is not relevant to the current query (or concept). When an indication of relevance is given, the system modifies the concept/query profile and provides a new ranked list of retrieved documents.

3.2 Query Window

The query window actually consists of several parts: a natural language part for entering a query, and a query profile part that displays the elements of the query/concept along with various statistics about each element. Most important
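As an illustration of Sections 2.2 and 2.3, the following Python sketch shows one way the document profile elements and the two indexes could be represented. It is not the FORMS implementation: the class and function names, the whitespace tokenizer, the single "word" element type, and the sample documents are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SimpleElement:
    text: str          # character string, e.g. a word
    element_type: str  # e.g. "word", "noun", "noun phrase"


@dataclass(frozen=True)
class DocumentProfileElement(SimpleElement):
    # IS-A simple element, extended with the document ID and occurrence count.
    doc_id: str
    count: int


def build_profile(doc_id, text):
    """Return the profile elements of one document.

    Assumption: elements are lowercased, whitespace-separated words only.
    """
    counts = Counter(text.lower().split())
    return [DocumentProfileElement(word, "word", doc_id, n)
            for word, n in counts.items()]


def build_indexes(documents):
    """Build (element_index, document_index) from {doc_id: full_text}."""
    # Second index: document ID -> full text, for display to the user.
    document_index = dict(documents)

    postings = defaultdict(list)
    for doc_id, text in documents.items():
        for elem in build_profile(doc_id, text):
            postings[(elem.text, elem.element_type)].append(elem)

    # Keep only elements that occur in more than one document.
    element_index = {key: elems for key, elems in postings.items()
                     if len(elems) > 1}
    return element_index, document_index


# Hypothetical sample collection.
docs = {
    "d1": "bank merger announced",
    "d2": "merger talks stall",
    "d3": "weather report",
}
element_index, document_index = build_indexes(docs)
```

With these sample documents, only the word "merger" appears in more than one document, so it is the only entry in the element index, while the document index still holds the full text of all three documents.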