NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
The QA System
J. Driscoll
J. Lautenschlager
M. Zhao
National Institute of Standards and Technology
Donna K. Harman
It should be noted that the semantic experiments reported
here were crude. Our lexicon did not contain enough TREC
words, and the blend of keyword and semantic weights we
used was not optimal. We therefore expect that better semantic
results for paragraphs can be achieved (refer to Section 6).
6. Failure Analysis
In general our participation in the TREC experiments was
impeded by the following:
PC DOS Platform. This platform has a serious
memory-addressing restriction that causes memory page
swapping, which seriously slows processing, especially
during the creation of inverted files and index structures.
We can solve this problem by moving to an OS/2 or UNIX
platform.
Extra Semantic Processing Time. Our semantic
probabilistic and statistical calculations more than double
the processing time for indexing and for statistical ranking
of retrieved documents. Again, this problem can be solved
by moving to an OS/2 or UNIX platform.
Time to Build Semantic Lexicon. We were able to
incorporate only 1000 frequently occurring words from the
training text into our semantic lexicon, and we did not have
enough time to process the test text for the ad-hoc queries.
This problem can be solved by distributing archival data
earlier. We suspect that with more TREC words in our
semantic lexicon, better results could have been achieved in
Section 5.2, where paragraphs are used as the basis for retrieval.
Unknown Blend for Semantic and Keyword Weights.
There are three main aspects to our blend of semantic and
keyword weights within the vector processing model:
(i) The Proper Probabilities to Use for the Semantics
Triggered by a Word. For example, we let the word
"vapor" trigger State with 18% probability, Tem-
perature with 9% probability, and Motion with
Reference to Direction with 9% probability. We
have several techniques for determining probabili-
ties such as these.
(ii) The Scaling of Keyword Weights and Semantic
Weights. For example, in a Question/Answer
environment where queries are the length of a
sentence and documents are either a sentence or at
most a paragraph, we have been successful by
forcing semantic similarity to be approximately 1/3
of keyword similarity when the two are combined
in processing small document collections (less than
1000 documents). There was no scaling for the
experiments reported in Section 5.2; we suspect
better results could have been achieved.
(iii) Independent Semantic Weights and Keyword
Weights. A valid criticism of our research has been
that the semantic contribution from a word in a
document should be kept independent of the word's
own similarity contribution if the word is a keyword
in common with the query.
The overall problem of proper blend can be solved by
spending more time using TREC test documents, test topics,
and good test relevance judgments to run many retrieval
experiments to establish the correct blend.
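As a concrete illustration of items (i) and (ii), the sketch below (our own simplification, not the QA System's actual code) blends a keyword cosine similarity with a semantic-category similarity scaled to roughly 1/3, using the trigger probabilities the text gives for "vapor"; the lexicon contents and function names are hypothetical.

```python
# Illustrative blend of keyword and semantic similarity (hypothetical
# sketch; probabilities follow the "vapor" example, and the 1/3 scaling
# follows item (ii) above).
from math import sqrt

# Semantic lexicon: word -> {semantic category: trigger probability}
LEXICON = {
    "vapor": {"State": 0.18, "Temperature": 0.09,
              "Motion with Reference to Direction": 0.09},
}

def semantic_vector(words):
    """Sum the category probabilities triggered by each word."""
    vec = {}
    for w in words:
        for cat, p in LEXICON.get(w, {}).items():
            vec[cat] = vec.get(cat, 0.0) + p
    return vec

def keyword_vector(words):
    """Simple term-frequency vector over the words themselves."""
    vec = {}
    for w in words:
        vec[w] = vec.get(w, 0.0) + 1.0
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def blended_similarity(query, doc, semantic_scale=1.0 / 3.0):
    """Keyword similarity plus semantic similarity scaled per item (ii)."""
    kw = cosine(keyword_vector(query), keyword_vector(doc))
    sem = cosine(semantic_vector(query), semantic_vector(doc))
    return kw + semantic_scale * sem
```

Note that this sketch does not address item (iii): the semantic contribution of a word still overlaps with its keyword contribution when the word appears in both query and document.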
Number of Semantic Categories. Another way to solve
the problem of long documents causing semantic weights to
be of little value is to have more semantic categories. A large
number of "semantic" categories could be obtained (for
example) by using all the categories and/or subcategories
found in Roget's Thesaurus, instead of the 36 semantic
categories we use. This would be a deviation from database
semantic modeling but it probably should be examined.
Block-Split Tree-Structured Files. The QA System used
B+ tree structures to implement inverted files, which
actually slowed the system in our DOS environment. The
QA System also incurred severe storage overhead from
storing character strings in the B+ trees. We have solved
both problems by implementing a separate system that uses
a hashing function to establish codes for strings.
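A minimal sketch of that string-coding idea (the mechanism is assumed; the separate system's actual implementation is not given in the paper): intern each index term as a small integer so that inverted-file postings are keyed by fixed-size codes rather than variable-length character strings.

```python
# Sketch (assumed mechanism): a hash table maps each term to a compact
# integer code, so index structures store small ints instead of strings.

class StringCoder:
    def __init__(self):
        self._code = {}   # string -> integer code
        self._word = []   # integer code -> string

    def encode(self, term):
        """Return the code for a term, assigning a new one if unseen."""
        if term not in self._code:
            self._code[term] = len(self._word)
            self._word.append(term)
        return self._code[term]

    def decode(self, code):
        """Recover the original string from its code."""
        return self._word[code]

# Inverted file keyed by integer codes rather than character strings.
coder = StringCoder()
inverted = {}
for doc_id, text in enumerate(["vapor rises", "vapor condenses"]):
    for term in text.split():
        inverted.setdefault(coder.encode(term), []).append(doc_id)
```

Beyond saving space, fixed-size integer keys also keep B+ tree nodes uniform, which is the storage problem the text describes.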
TREC Document Length. Semantic experiments like
those reported in Section 5.2 have shown that documents
larger than a paragraph cause our semantic approach to be of
little value. This problem can be corrected by considering
paragraphs as a basis for document retrieval.
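The paragraph-as-retrieval-unit idea can be sketched as follows (an assumed, simplified mechanism, with a toy overlap score standing in for the full similarity measure): score each paragraph of a long document separately and rank the document by its best-matching paragraph, so that semantic weights are not diluted by unrelated text elsewhere in the document.

```python
# Sketch (assumed mechanics) of paragraph-based document retrieval:
# a long document's score is the score of its best-matching paragraph.

def score(query_terms, para_terms):
    """Toy term-overlap score standing in for the full similarity."""
    return len(set(query_terms) & set(para_terms))

def best_paragraph_score(query, document_text):
    """Split on blank lines and take the maximum paragraph score."""
    paragraphs = [p.split() for p in document_text.split("\n\n") if p.strip()]
    return max((score(query.split(), p) for p in paragraphs), default=0)
```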
Finally, we spent too much time on work that was never
incorporated in our experiments. We originally designed an
efficient method of inverting data files, but it could not be
used for routing queries. Also, trying to do semantic
part-of-speech tagging experiments using SQUDS slowed us
down.
References
[1] C. Buckley, SMART Evaluation Program (for TREC),
Cornell SMART Group, Cornell University.
[2] C. Date, An Introduction to Database Systems, Vol. 1,
Addison-Wesley, 1990.
[3] Hello Software, P.O. Box 494, Goldenrod, FL 32733.
[4] K. Sparck Jones and R. Bates, "Research on Automatic
Indexing 1974-1976," Technical Report, Computer
Laboratory, University of Cambridge, 1977.
[5] J. B. Lovins, "Development of a Stemming Algorithm,"
Mechanical Translation and Computational Linguistics,
Vol. 11, No. 1-2, pp. 11-31, March and June, 1968.
[6] Roget's International Thesaurus, Harper & Row, New
York, Fourth Edition, 1977.
[7] G. Salton, Automatic Text Processing, Addison-Wesley,
1989.
[8] D. Voss and J. Driscoll, "Text Retrieval Using a Com-
prehensive Semantic Lexicon," Proceedings of ISMM
First International Conference on Information and
Knowledge Management, Baltimore, Maryland,
November 1992.
[9] E. Wendlandt and J. Driscoll, "Incorporating a Semantic
Analysis into a Document Retrieval Strategy," Pro-
ceedings of the Fourteenth Annual International
ACM/SIGIR Conference on Research and Development
in Information Retrieval, Chicago, Illinois, pp. 270-279,
October 1991.