SP500207
NIST Special Publication 500-207: The First Text REtrieval Conference (TREC-1)
Appendix C: System Features
appendix
National Institute of Standards and Technology
Donna K. Harman
II. Query construction
(please fill out a section for each query construction method used)
A large number of techniques were tried.
A. Automatically built queries (ad hoc)
1. topic fields used
Boolean queries were constructed from a variety of the topic fields. The queries
were then ranked, possibly using different fields.
2. total computer time to build query (cpu seconds) -10
3. which of the following were used?
a. term weighting with weights based on terms in topics
b. phrase extraction from topics
i. automatic addition of Boolean connectors or proximity operators
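The form does not record how the Boolean queries were assembled, so the following is only a minimal sketch of one way to add Boolean connectors automatically from topic fields. The function names, the stopword list, the field names, and the sample topic are all hypothetical, and the rule used here (OR within a field, AND across fields) is an assumption for illustration, not the respondents' method.

```python
import re

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "for", "or"}


def extract_terms(text):
    """Lowercase the text, split on non-letters, drop stopwords, keep first-seen order."""
    terms = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token not in STOPWORDS and token not in terms:
            terms.append(token)
    return terms


def build_boolean_query(topic, fields):
    """Join terms within a field with OR and join the per-field clauses with AND.

    This connector rule is an assumption, not something stated in the source.
    """
    clauses = []
    for name in fields:
        terms = extract_terms(topic.get(name, ""))
        if terms:
            clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)


# Invented sample topic; not taken from the TREC topic set.
topic = {"title": "Solar Power", "description": "Cost of solar power in homes"}
query = build_boolean_query(topic, ["title", "description"])
print(query)
```

Passing a different list of field names changes which parts of the topic contribute clauses, which is one way to experiment with "a variety of the topic fields".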
III. Searching
A. Total computer time to search (cpu seconds)
1. retrieval time (total cpu seconds between when a query enters the system until a list of
document numbers are obtained)
2. ranking time (total cpu seconds to sort document list)
These operations occurred together. It took 6 hrs to obtain a ranked list of 1,000 documents
for each of the 50 queries.
B. Which methods best describe your machine searching methods?
1. vector space model
C. What factors are included in your ranking?
1. term frequency
2. inverse document frequency
7. proximity of terms
9. document length
15. other (specify)
The location of the term in the query. A variety of models were tried.
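The answers above list the ranking factors but give no formula, and several models were tried. As a rough illustration only, the sketch below combines the five listed factors (term frequency, inverse document frequency, proximity, document length, and position of the term in the query) into one score. The function name, the specific weighting choices (1/(1 + position) for query location, division by relative length, a 1 + 1/gap proximity boost), and the sample data are all assumptions, not the system's actual ranking function.

```python
import math


def score(query_terms, doc_tokens, doc_freq, num_docs, avg_len):
    """Score one document against a query using the five listed factors.

    All weighting formulas here are illustrative assumptions.
    """
    if not doc_tokens:
        return 0.0
    query_terms = list(dict.fromkeys(query_terms))  # dedupe, keep order

    # Record where each token occurs in the document.
    positions = {}
    for i, token in enumerate(doc_tokens):
        positions.setdefault(token, []).append(i)

    total = 0.0
    for rank, term in enumerate(query_terms):
        tf = len(positions.get(term, []))            # term frequency
        if tf == 0:
            continue
        df = max(1, doc_freq.get(term, 1))
        idf = math.log(num_docs / df)                # inverse document frequency
        query_weight = 1.0 / (1 + rank)              # earlier query terms count more
        total += query_weight * tf * idf

    total /= len(doc_tokens) / avg_len               # document length normalization

    # Proximity: boost when two different query terms occur close together.
    matched = [positions[t] for t in query_terms if t in positions]
    if len(matched) >= 2:
        min_gap = min(
            abs(a - b)
            for i, first in enumerate(matched)
            for second in matched[i + 1:]
            for a in first
            for b in second
        )
        total *= 1 + 1.0 / min_gap
    return total


# Invented sample data for illustration.
doc_freq = {"solar": 2, "power": 2}
near = ["solar", "power", "x", "y", "z", "w"]
far = ["solar", "x", "y", "z", "w", "power"]
near_score = score(["solar", "power"], near, doc_freq, num_docs=10, avg_len=6)
far_score = score(["solar", "power"], far, doc_freq, num_docs=10, avg_len=6)
```

With these two equal-length documents, the only difference in score comes from the proximity boost, so the document with adjacent query terms ranks higher.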
IV. What machine did you conduct the TREC experiment on? Sun SPARC 2
How much RAM did it have? 128 MB
What was the clock rate of the CPU? 25 MIP
V. Some systems are research prototypes and others are commercial.
To help compare these systems:
1. How much "software engineering" went into the development of your system?
The system is a robust research tool. Limited effort has been put into speed.
2. Given appropriate resources, could your system be made to run faster? By how much
(estimate)?
Considerably faster, but we estimate it would be twice as fast if we changed the
architecture (we use UNIX pipes to communicate). It is unclear what other
speed-ups can occur.
3. What features is your system missing that it would benefit by if it had them?
All sorts of things would be nice! A good form of transaction management would
be the most useful to transform the system into a commercial product.