SP500207 NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1) The QA System chapter J. Driscoll J. Lautenschlager M. Zhao National Institute of Standards and Technology Donna K. Harman

increase when comparing the 11-pt average here to the 11-pt average of the originally retrieved text (Figure 0). The documents were re-ranked using keywords in order to establish a baseline for the semantic experiments reported in the next subsection. The increase in retrieval performance due to the re-ranking was a surprise. According to Sparck Jones' criteria, the increases of 25.8% and 18.8% would each be categorized as "significant" (greater than 10.0%) [4].

5.2 Semantic Experiments

Essentially, what we are trying to show in this section is that semantics can be useful if documents do not get larger than a paragraph. Figure 9 displays results when semantic similarity is combined with keyword similarity for the case in which the originally retrieved TREC documents are used. These results can be compared to those in Figure 7 (the same documents with a strictly keywording approach). Comparing the 11-pt average for the two methods shows a 0.7% decrease when going from the strictly keywording results to the combined semantic and keywording results. According to Sparck Jones' criteria [4], this is not a noticeable decrease. But, certainly, semantics did not help.
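The percentage comparisons in this section can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the thresholds are the ones quoted in the text from Sparck Jones' criteria [4] (greater than 5.0% is "noticeable", greater than 10.0% is "significant"), and applying them to the magnitude of a decrease as well as an increase is an assumption based on how the 0.7% decrease is described.

```python
def percent_change(baseline: float, new: float) -> float:
    """Relative change from baseline to new, expressed in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline * 100.0


def classify_change(pct: float) -> str:
    """Label a percent change using the thresholds quoted in the text.

    Assumption: the thresholds apply to the magnitude of the change,
    so decreases are classified the same way as increases.
    """
    magnitude = abs(pct)
    if magnitude > 10.0:
        return "significant"
    if magnitude > 5.0:
        return "noticeable"
    return "not noticeable"


# Percent changes reported in the text (negative = decrease).
for pct in (25.8, 18.8, -0.7, 7.9):
    print(f"{pct:+.1f}% -> {classify_change(pct)}")
```

Here `percent_change` would take two 11-pt averages, for example the strictly keywording value as `baseline` and the combined semantic and keywording value as `new`.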
The explanation for this lack of improvement is that when a document is large, a greater number of semantic categories are triggered in it. Also, the probability that a given category is present in a document is often very close to 100%. Consequently, almost every semantic category is present in every document, which causes the semantic category weights to become very low and useless.

To remedy this problem, we used the database that was constructed with paragraph divisions and computed relevancy lists using the combined semantic and keywording approach explained in Section 3.3. The results are shown in Figure 10. The statistics there can be compared to those in Figure 8 (the same documents with a strictly keywording approach). When going from a keywording approach to a combined semantic and keywording approach, the 11-pt average increased by 7.9%. According to Sparck Jones' criteria, this change would be classified as "noticeable" (greater than 5.0%) [4].

The two semantic experiments reported here demonstrate the main thing that we have learned: our semantic approach to text retrieval is only useful when documents are no larger than a paragraph.

Top ranked evaluation

Run number: 1
Num_queries: 6

Total number of documents over all queries
    Retrieved:  1200
    Relevant:    863
    Rel_ret:     155

Recall - Precision Averages:
    at 0.00    0.4370
    at 0.10    0.2250
    at 0.20    0.0789
    at 0.30    0.0329
    at 0.40    0.0000
    at 0.50    0.0000
    at 0.60    0.0000
    at 0.70    0.0000
    at 0.80    0.0000
    at 0.90    0.0000
    at 1.00    0.0000

Average precision for all points
    11-pt Avg:  0.0703
Average precision for 3 intermediate points (0.20, 0.50, 0.80)
    3-pt Avg:   0.0263

Recall:
    Exact:         0.2133
    at 5 docs:     0.0072
    at 10 docs:    0.0128
    at 30 docs:    0.0508
    at 100 docs:   0.1378
    at 200 docs:   0.2133

Precision:
    Exact:         0.1292
    at 5 docs:     0.2000
    at 10 docs:    0.1833
    at 30 docs:    0.2000
    at 100 docs:   0.1700
    at 200 docs:   0.1292

Figure 9. Results of Semantics with Document Divisions.

Top ranked evaluation

Run number: 1
Num_queries: 6

Total number of documents over all queries
    Retrieved:  1176
    Relevant:    863
    Rel_ret:     158

Recall - Precision Averages:
    at 0.00    0.4556
    at 0.10    0.2231
    at 0.20    0.0845
    at 0.30    0.0294
    at 0.40    0.0000
    at 0.50    0.0000
    at 0.60    0.0000
    at 0.70    0.0000
    at 0.80    0.0000
    at 0.90    0.0000
    at 1.00    0.0000

Average precision for all points
    11-pt Avg:  0.0720
Average precision for 3 intermediate points (0.20, 0.50, 0.80)
    3-pt Avg:   0.0282

Recall:
    Exact:         0.2159
    at 5 docs:     0.00%
    at 10 docs:    0.0171
    at 30 docs:    0.0513
    at 100 docs:   0.1456
    at 200 docs:   0.2159

Precision:
    Exact:         0.1332
    at 5 docs:     0.2333
    at 10 docs:    0.2000
    at 30 docs:    0.2167
    at 100 docs:   0.1767
    at 200 docs:   0.1317

Figure 10. Results of Semantics with Paragraph Divisions.
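The summary averages in these listings can be recomputed from the recall-precision values. The following is a minimal sketch, not the evaluation program used for the paper: the function names are hypothetical, and the choice of recall levels 0.20, 0.50, and 0.80 for the 3-pt average is an assumption based on the usual convention for this style of evaluation output. The input values are the eleven precision figures listed for Figure 10.

```python
from typing import Sequence


def eleven_point_average(precisions: Sequence[float]) -> float:
    """Mean precision over the 11 recall levels 0.0, 0.1, ..., 1.0."""
    if len(precisions) != 11:
        raise ValueError("expected 11 precision values")
    return sum(precisions) / 11


def three_point_average(precisions: Sequence[float]) -> float:
    """Mean precision at recall 0.20, 0.50 and 0.80 (indices 2, 5, 8).

    Assumption: these are the three intermediate points meant by the listing.
    """
    if len(precisions) != 11:
        raise ValueError("expected 11 precision values")
    return (precisions[2] + precisions[5] + precisions[8]) / 3


# Precision at recall 0.00 .. 1.00 as listed for Figure 10.
figure_10 = [0.4556, 0.2231, 0.0845, 0.0294, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(f"11-pt average: {eleven_point_average(figure_10):.5f}")
print(f"3-pt average:  {three_point_average(figure_10):.5f}")
```

Because the listed precision values are themselves rounded to four places, a recomputed average can differ from the reported one in the last digit.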