NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2)
On Expanding Query Vectors with Lexically Related Words
E. Voorhees
National Institute of Standards and Technology
D. K. Harman
3 Experiments
The training data for the routing queries was used both
to refine the synsets that were included in the topic text
and to select the type of relations used to expand the
query vectors. In some cases a synset that appears
to be a logical choice for a query is nonetheless detri-
mental. For example, adding the synset for death to
topic 59 (weather fatalities) causes the query to re-
trieve far too many articles reporting on deaths that
have no relation to the weather. I produced five differ-
ent versions of synset-annotated topic texts, although
the differences between versions are not very large. The
version used in the official routing run added an average
of 2.9 synsets to a topic statement, with a minimum of
0 synsets added and a maximum of 6 synsets added.
Of course, the utility of a synset depends in part on
how that synset is expanded and the relative weights
given to the different link types (the α's in the similarity function above). Table 1 lists the various combinations that were evaluated using the training data. Four
different expansion strategies were tried: expansion by synonyms only, expansion by synonyms plus all descendants in the is-a hierarchy, expansion by synonyms plus parents and all descendants in the is-a hierarchy, and
expansion by synonyms plus any synset directly related
to the given synset (i.e., a chain of length 1 for all link
types). Different α values were also investigated. Assuming original query terms are more important than added terms, the α for the original terms subvector was set to one and the α for other subvectors varied between zero and one as shown in Table 1.
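The role of the α weights can be sketched as follows. This is an illustrative reconstruction of an α-weighted subvector similarity, not the system's actual implementation; all function names, term weights, and subvector labels are invented for the example.

```python
# Sketch of an alpha-weighted subvector similarity. Each query is split into
# subvectors (the original terms plus one subvector per expansion type), and a
# document is a plain term-weight dictionary. All names/weights are illustrative.

def subvector_similarity(query_subvectors, alphas, doc_vector):
    """Sum of per-subvector inner products, each scaled by its alpha weight."""
    total = 0.0
    for name, subvec in query_subvectors.items():
        alpha = alphas.get(name, 0.0)  # alpha = 0 effectively disables a subvector
        inner = sum(w * doc_vector.get(term, 0.0) for term, w in subvec.items())
        total += alpha * inner
    return total

query = {
    "original": {"weather": 1.0, "fatality": 1.0},
    "synonyms": {"storm": 0.7, "death": 0.7},
}
alphas = {"original": 1.0, "synonyms": 0.5}  # original terms weighted highest
doc = {"weather": 0.5, "storm": 0.4, "death": 0.3}
score = subvector_similarity(query, alphas, doc)
```

Setting an expansion subvector's α to zero recovers the unexpanded baseline, which is how the base-case runs below can be obtained from the same expanded queries.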
The most effective run was the one that expanded a
query synset by any synset directly related to it and had
α = .5 for all added subvectors. Therefore, this strategy
was used to produce the official routing queries from the
final version of the annotated text. The scheme added
an average of 24.7 words to a query vector (minimum 0, maximum 70); an average of 20.2 (0, 66) of these words were not part of the original text.
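As a concrete sketch of this chain-length-1 expansion, the fragment below expands each query synset by every synset directly linked to it, regardless of link type. The toy relation map and all names are illustrative stand-ins for the lexical database, not part of the experimental system.

```python
# Sketch of chain-length-1 expansion: each query synset contributes its own
# words (synonyms) plus the words of every directly related synset.
# Synsets are modeled as tuples of words; `relations` maps a synset to a list
# of (link_type, neighbor_synset) pairs. All data below is illustrative.

def expand_synsets(query_synsets, relations):
    """Return the words of each query synset plus all directly related synsets."""
    expanded = set()
    for synset in query_synsets:
        expanded.update(synset)                    # the synonyms themselves
        for _link_type, neighbor in relations.get(synset, []):
            expanded.update(neighbor)              # chain of length 1, any link
    return expanded

death = ("death", "decease")
dying = ("dying", "demise")
relations = {death: [("hyponym", dying)]}
words = expand_synsets([death], relations)
```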
The average number of relevant documents retrieved
at rank 100 for this run is 40.7 and at rank 1000 is
133.3; the mean "average precision" is .2984. In gen-
eral, the individual query results are at or slightly above
the median, with a few queries significantly below the
median. Of more interest to this study is how the ex-
panded queries compare to unexpanded queries. A plot
of average recall versus average precision for these two
runs is given in Figure 3. As can be seen, the effective-
ness of the two query sets is very similar.
Since there was no way to evaluate the relative effectiveness of different expansion schemes for the ad hoc queries, the same expansion scheme as was used for the official routing run (chains of length one for any relation type and all α's = .5) was used for the ad hoc run. Furthermore, there could be no refining of which synsets to add, so only one version of
synset-annotated text was produced. An average of 2.7 (minimum 0, maximum 6) synsets was added to an
ad hoc topic text. The expansion process added an average of 17.2 (0, 66) terms; an average of 12.8 (0, 55) of these terms were not part of the original text.
Siemens actually submitted two ad hoc runs. The
first was the expanded queries with α's set to 0, a run that is equivalent to no expansion and is used as a base case. The second Siemens ad hoc run used the .5 α values. A plot of the effectiveness of the two ad hoc runs
is given in Figure 4. The difference in effectiveness between unexpanded and expanded queries is even smaller for the ad hoc queries than it is for the routing queries.
The average number of relevant documents retrieved at
rank 100 is 46.9 for both the unexpanded and expanded
queries. The average number of relevant documents re-
trieved at rank 1000 is 161.4 for the unexpanded queries
and 161.3 for the expanded queries. The mean "average
precision" is .3408 and .3397 respectively.
A possible explanation for the small difference made by expanding the queries is that the expansion parameters used were too conservative. To test this hypothesis, additional runs were made using the same set of
synsets but allowing longer chains of links and/or using
greater relative link weights (the α's). Table 2 lists the
additional combinations tested using both the ad hoc
queries versus the documents on disks 1 and 2, and the routing queries versus the documents on disk
3. As was the case for the routing training runs, the
strategy used for the official TREC-2 runs (all links of
length one, α's = .5) was the most effective expansion
strategy. The more aggressive expansion strategies did
make larger differences in retrieval effectiveness com-
pared to the unexpanded queries, but across the set of
queries the aggregate difference was negative. Hence it
is unlikely that the conservative expansion strategy is
the reason for the lack of improvement.
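The longer-chain variants in these additional runs can be viewed as a bounded breadth-first traversal of the relation graph. The sketch below is an illustrative reconstruction (with a toy relation map standing in for the lexical database); chain_length = 1 reduces to the directly-related scheme used for the official runs.

```python
from collections import deque

# Sketch of chain-length-k expansion as a bounded breadth-first traversal of
# the relation graph. `relations` maps a synset to (link_type, neighbor)
# pairs; synsets are modeled as plain strings here for brevity.

def expand_chain(synset, relations, chain_length):
    """Return the start synset plus all synsets within `chain_length` links."""
    seen = {synset}
    frontier = deque([(synset, 0)])
    while frontier:
        current, depth = frontier.popleft()
        if depth == chain_length:
            continue                      # do not follow links past the bound
        for _link_type, neighbor in relations.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

toy_relations = {"a": [("is-a", "b")], "b": [("is-a", "c")]}
```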
4 Conclusion
The experimental evidence clearly shows this query ex-
pansion technique provides little benefit in the TREC
environment. The most likely reason is the completeness of the TREC topic descriptions. Query expansion is a recall-enhancing technique, and TREC topic descriptions are already large compared to queries found in traditional IR collections.
Although most of the expanded queries did have some
new terms added to them, the most important terms
frequently appeared in both the original term set and
the set of expanded terms. This had an effect on the
relative weight of those terms in the overall similarity