SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
The QA System
chapter
J. Driscoll
J. Lautenschlager
M. Zhao
National Institute of Standards and Technology
Donna K. Harman
increase when comparing the 11-pt average here to the 11-pt
average of the originally retrieved text (Figure 0).
The re-ranking of documents using keywords was done in
order to establish a baseline for the semantic experiments
reported in the next subsection. The increase in retrieval
performance because of the re-ranking was a surprise.
According to Sparck Jones' criteria, the increases of 25.8%
and 18.8% would each be categorized as "significant" (greater
than 10.0%) [4].
5.2 Semantic Experiments
Essentially, what we are trying to show in this section is
that semantics can be useful if documents do not get larger
than a paragraph.
Figure 9 displays results when semantic similarity is
combined with keyword similarity for the case that originally
retrieved TREC documents are used. These results can be
compared to those in Figure 7 (the same documents with a
strictly keywording approach). Comparing the 11-pt average
for the two methods shows a 0.7% decrease when going from
the strictly keywording results to the combined semantic with
keywording results. According to Sparck Jones' criteria [4],
this is not a noticeable decrease. But, certainly, semantics
did not help.
Top ranked evaluation
Run number: 1
Num_queries: 6
Total number of documents over all queries
Retrieved: 1200
Relevant: 863
Rel_ret: 155
Recall - Precision Averages:
at 0.00 0.4370
at 0.10 0.2250
at 0.20 0.0789
at 0.30 0.0329
at 0.40 0.0000
at 0.50 0.0000
at 0.60 0.0000
at 0.70 0.0000
at 0.80 0.0000
at 0.90 0.0000
at 1.00 0.0000
Average precision for all points
11-pt Avg: 0.0703
Average precision for 3 intermediate points (0.20, 0.50, 0.80)
3-pt Avg: 0.0263
The explanation for this is that when the size of a document
is large, a greater number of semantic categories are triggered
in the document. Also, the probability that each category is
present in a document is often very close to 100%. Consequently,
almost every semantic category becomes present in every
document, causing the semantic category weights to become
very low and useless.
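The effect described above can be illustrated with a minimal sketch. It assumes an IDF-style category weight, log2(N / df), which is a stand-in chosen for illustration and not necessarily the weighting defined in Section 3.3; the function name and the collection sizes are hypothetical.

```python
import math


def category_weight(num_units: int, units_with_category: int) -> float:
    """IDF-style weight log2(N / df).

    Assumed form for illustration only; the paper's actual category
    weighting may differ.
    """
    return math.log2(num_units / units_with_category)


# Hypothetical collection of 1000 retrieval units.
N = 1000

# Large documents: nearly every document triggers the category.
w_large = category_weight(N, 990)

# Paragraph-sized units: only a small fraction trigger the category.
w_paragraph = category_weight(N, 50)

print(f"weight with whole documents: {w_large:.4f}")
print(f"weight with paragraphs:      {w_paragraph:.4f}")
```

Under this assumed weighting, a category present in almost every unit receives a weight near zero and so contributes almost nothing to the ranking, while the same category keeps a usable weight when the units are paragraphs.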
To remedy this problem, we used the database that was
constructed with paragraph divisions, and computed relevancy
lists using the combined semantic and keywording
approach explained in Section 3.3. The results are shown in
Figure 10. The statistics there can be compared to those in
Figure 8 (the same documents with a strictly keywording
approach). When going from a keywording approach to a
combined semantic and keywording approach, the 11-pt
average increased by 7.9%. According to Sparck Jones'
criteria, this change would be classified as "noticeable"
(greater than 5.0%) [4].
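The thresholds quoted in this section (greater than 5.0% is "noticeable", greater than 10.0% is "significant") can be written as a small helper. This is a sketch under two assumptions: the function names are hypothetical, and the thresholds are applied to the magnitude of the change so that decreases are classified the same way as increases.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


def classify_change(change_percent: float) -> str:
    """Classify a change in 11-pt average using the thresholds in the text.

    Assumption: the thresholds apply to the magnitude of the change.
    """
    magnitude = abs(change_percent)
    if magnitude > 10.0:
        return "significant"
    if magnitude > 5.0:
        return "noticeable"
    return "not noticeable"


# Changes reported in this section, in percent.
for change in (25.8, 18.8, -0.7, 7.9):
    print(f"{change:+.1f}% -> {classify_change(change)}")
```

Applied to the four changes reported here, the helper labels the keyword re-ranking gains as significant, the 0.7% decrease as not noticeable, and the 7.9% increase as noticeable, matching the classifications given in the text.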
The two semantic experiments reported here demonstrate
the main lesson we have learned: our semantic approach
to text retrieval is useful only when documents are no larger
than a paragraph.
Top ranked evaluation
Run number: 1
Num_queries: 6
Total number of documents over all queries
Retrieved: 1176
Relevant: 863
Rel_ret: 158
Recall - Precision Averages:
at 0.00 0.4556
at 0.10 0.2231
at 0.20 0.0845
at 0.30 0.0294
at 0.40 0.0000
at 0.50 0.0000
at 0.60 0.0000
at 0.70 0.0000
at 0.80 0.0000
at 0.90 0.0000
at 1.00 0.0000
Average precision for all points
11-pt Avg: 0.0720
Average precision for 3 intermediate points (0.20, 0.50, 0.80)
3-pt Avg: 0.0282
                  Figure 9    Figure 10
Recall:
   Exact:          0.2133      0.2159
   at   5 docs:    0.0072      0.0096
   at  10 docs:    0.0128      0.0171
   at  30 docs:    0.0508      0.0513
   at 100 docs:    0.1378      0.1456
   at 200 docs:    0.2133      0.2159
Precision:
   Exact:          0.1292      0.1332
   at   5 docs:    0.2000      0.2333
   at  10 docs:    0.1833      0.2000
   at  30 docs:    0.2000      0.2167
   at 100 docs:    0.1700      0.1767
   at 200 docs:    0.1292      0.1317
Figure 9. Results of Semantics with Document Divisions.
Figure 10. Results of Semantics with Paragraph Divisions.
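As a cross-check on the summary statistics in Figures 9 and 10, the following sketch recomputes the 11-pt and 3-pt averages from the listed recall-level precisions. It assumes the 11-pt average is the plain mean over recall levels 0.00 through 1.00 and the 3-pt average is the mean at recall 0.20, 0.50, and 0.80; the function and variable names are hypothetical.

```python
def eleven_point_average(precisions: list[float]) -> float:
    """Mean precision over the 11 recall levels 0.0, 0.1, ..., 1.0."""
    if len(precisions) != 11:
        raise ValueError("expected precision at 11 recall levels")
    return sum(precisions) / 11


def three_point_average(precisions: list[float]) -> float:
    """Mean precision at recall 0.2, 0.5, and 0.8 (indices 2, 5, 8)."""
    if len(precisions) != 11:
        raise ValueError("expected precision at 11 recall levels")
    return (precisions[2] + precisions[5] + precisions[8]) / 3


# Precision at recall 0.0 .. 1.0 as listed in the two figures.
figure_9 = [0.4370, 0.2250, 0.0789, 0.0329] + [0.0] * 7
figure_10 = [0.4556, 0.2231, 0.0845, 0.0294] + [0.0] * 7

for name, values in (("Figure 9", figure_9), ("Figure 10", figure_10)):
    print(name,
          f"11-pt: {eleven_point_average(values):.4f}",
          f"3-pt: {three_point_average(values):.4f}")
```

The recomputed values agree with the reported 11-pt and 3-pt averages to within rounding of the four-decimal inputs, which also supports reading the damaged entry at recall 0.00 in Figure 9 as 0.4370.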