NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Classification Trees for Document Routing, A Report on the TREC Experiment
R. Tong, A. Winkler, P. Gage
National Institute of Standards and Technology
Donna K. Harman

Table 3: Performance with Additional Training Data

              Rel-Ret @ 200
              ________________________________
Topic#  #Rel  adsba1  adsba2  Max  Median  Min
     3   304       3      15  130      48    3
     4    20       1       1   18       7    1
     5    55       4      29   45      29    4
     6    68       4      10   46      20    4
     7    92       1      22   46      30    1
     8   133       4       2   37      16    2
     9   157      13      25   41      29   13
    10   149      94      69  109      88   46
    11    61      15       9   55      26    9
    12    82      14       3   56      15    3
    13    93       7      25   93      26    7
    14   156      38      23   73      52   23
    15   515      29      49   74      49   23
    16    58       2      17   44      17    2
    17    69      23      53   53      23    9
    18    95      38      14   49      38   14
    19   664      74      56  147      99   56
    20   274     111      63  179     121   56
    21    16      12       0   16      14    0
    22   106      28       8   79      40    8
    23    30       2       2   27       7    2
    24   253      37      29   96      41   29
    25    13       1       1   12       9    1

As for adsba1, the results are mixed. For 10 topics performance improved; for 13 topics performance got worse; and for 2 topics performance was unchanged. We also do not see any strong correlation between change in performance and change in the number of positive instances in the training set, or change in the optimal tree size. There is some indication that, while the extra positive instances tended to produce significant changes in optimal tree size, these new trees also tend to have poorer performance, suggesting, as we should expect, that the tree construction process is highly sensitive to local changes in these small training sets. Nevertheless, there are some interesting results.