SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Site Report for the Text REtrieval Conference
chapter
P. Nelson
National Institute of Standards and Technology
Donna K. Harman
[Figure 4 diagram not recoverable from the source. Legible labels: ConQuest Dictionary, Query Enhancement, Remove Stop Words, Expand Meanings, Search & Rank Documents, Semantic Networks, Indexes.]
Figure 4 The Query Process
The modules used for query processing are described below:
* Tokenize, Morphology, Find Idioms: These modules are the same as for indexing.
* Query Enhancement: The user is given the opportunity to enhance the query to
further improve precision and recall. Many options are available here,
but the two most important are choosing meanings and weighting query terms.
Choosing a meaning of a word will restrict the expansion of words to only related terms
which are relevant to the chosen meanings. This reduces noise in the query. When
running in automatic mode, ConQuest expands all meanings of all words.
Weighting query terms identifies the importance of the various words in the query.
These weights are used by the search engine when ranking documents and computing
document relevance factors.
* Remove Stop Words: Small function words (such as determiners, conjunctions,
auxiliary verbs, and small adverbs) are removed from the query, just as they were
during indexing. Removing these terms makes queries faster and also reduces
ambiguous noise in the query.
* Expand Meanings: Words in the query are expanded to include related terms.
* Search and Rank: ConQuest uses an integrated search and rank algorithm which
considers the relevance rankings of documents throughout the search process. Since
ranking and search are integrated, the search engine automatically produces the most
relevant documents right away. This differs from past approaches, which typically
retrieve all matching documents and then rank and sort them as a separate step.
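
The query modules described above can be illustrated with a minimal sketch. It is not ConQuest's implementation: the stop-word list, the toy semantic network, the inverted index, the default weight of 0.5 for related terms, and all function names are assumptions made for illustration. In particular, the ranking step is a plain accumulate-and-select-top-k stand-in and does not reproduce the integrated search and rank algorithm, whose details are not given here.

```python
import heapq
from collections import defaultdict

# All data below is hypothetical and exists only to make the sketch runnable.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}

# Toy semantic network: word -> meaning label -> related terms.
SEMANTIC_NETWORK = {
    "bank": {
        "finance": ["lender", "loan"],
        "river": ["shore", "levee"],
    },
}

# Toy inverted index: term -> documents containing it.
INDEX = {
    "bank": ["d1", "d2"],
    "loan": ["d1"],
    "shore": ["d2"],
    "river": ["d2"],
    "levee": ["d3"],
}


def remove_stop_words(tokens):
    """Drop small function words from the query."""
    return [t for t in tokens if t not in STOP_WORDS]


def expand_meanings(tokens, chosen=None, user_weights=None, related_weight=0.5):
    """Return {term: weight} for the query terms plus their related terms.

    chosen maps a word to the single meaning the user picked; words without
    an entry have all meanings expanded (the automatic mode in the text).
    user_weights maps a word to its importance (default 1.0).
    """
    chosen = chosen or {}
    user_weights = user_weights or {}
    weights = defaultdict(float)
    for tok in tokens:
        base = user_weights.get(tok, 1.0)
        weights[tok] = max(weights[tok], base)
        for meaning, related in SEMANTIC_NETWORK.get(tok, {}).items():
            if tok in chosen and chosen[tok] != meaning:
                continue  # skip meanings the user did not choose
            for term in related:
                weights[term] = max(weights[term], base * related_weight)
    return dict(weights)


def search_and_rank(term_weights, index, k=2):
    """Sum term weights per document and return the k best (doc, score) pairs."""
    scores = defaultdict(float)
    for term, weight in term_weights.items():
        for doc_id in index.get(term, ()):
            scores[doc_id] += weight
    return heapq.nlargest(k, scores.items(), key=lambda kv: (kv[1], kv[0]))


tokens = remove_stop_words("the bank of the river".split())
automatic = expand_meanings(tokens)                            # all meanings
restricted = expand_meanings(tokens, chosen={"bank": "river"})  # one meaning
results = search_and_rank(restricted, INDEX, k=2)
```

In this toy setup, choosing the "river" meaning of "bank" keeps financial terms such as "loan" out of the expanded query, which is the noise reduction the Query Enhancement step aims for.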