<DOC> 
<DOCNO> SP500215 </DOCNO>         
<TITLE> NIST Special Publication 500-215: The Second Text REtrieval Conference (TREC-2) </TITLE>         
<SUBTITLE> Appendix B: System Features </SUBTITLE>         
<TYPE> Appendix </TYPE>         
<PAGE CHAPTER="B" NUMBER="19">                   
<AUTHOR1>   </AUTHOR1>  
<PUBLISHER> National Institute of Standards and Technology </PUBLISHER> 
<EDITOR1> D. K. Harman </EDITOR1> 
<COPYRIGHT MTH="March" DAY="" YEAR="1994" BY="National Institute of Standards and Technology">   
 
</COPYRIGHT> 
<BODY> 
  B. CONSTRUCTION OF INDICES, KNOWLEDGE BASES, AND OTHER DATA STRUCTURES -- STATISTICS ON DATA STRUCTURES (C


  NOTES:

  ]    Occurrence statistics for the 1000 most frequently occurring terms (in the learning-set relevant documents) for each routing query.

  ]    For the adhoc runs, the `query regression' method was used. The query regression coefficients were computed from the query.nnn and doc.lsp-file (wh
       created by polynomial regression). Afterwards, the q3-query-file was reweighted. 4 query.nnn -&gt; query.lsp.

  ]    Because we used the UMASS INQUERY system and its indexing, all of the answers to the questions in this section for our systems are identical to tt
       the UMASS system.

  ]    Document vector files and term dictionary produced by SMART: Each individual collection was indexed separately, so sizes/times are average per col
       with the range of values specified. The collection statistics are based on the summation of individual collection values so are perhaps less accura
       collection size of the term dictionary cannot be effectively estimated with this approach. Term positions are not stored within the document vecto

                                                       Average     Range         Collection

             Document Vector Files (MB)                120         31-124        1100
             Term Dictionary (MB)                      16          15-17         Unknown

             Time to create both above files (Hours)   10          6-14          120

  5]   Standard process as implemented by SMART, following parameters as in Part I, Section A.

</BODY>                  
</PAGE>                  
</DOC>