IRE Information Retrieval Experiment The Smart environment for retrieval system evaluation - advantages and problem areas chapter Gerard Salton Butterworth & Company Karen Sparck Jones
All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

effective content identifiers characterizing natural language texts. Among the linguistic techniques of interest, the following were considered to be of greatest importance:

(a) The use of hierarchical term arrangements, relating the content terms in a given subject area. With such preconstructed term hierarchies, the standard content descriptions can be `expanded' by adding hierarchically superior (more general) terms as well as hierarchically inferior (more specific) terms to a given content description.
(b) The use of synonym dictionaries, or thesauri, in which each term is included in a class of synonymous or related terms. Using a thesaurus, each originally available term can be replaced by a complete class of related terms, thereby broadening the original content description.
(c) The utilization of syntactic analysis systems capable of specifying syntactic roles for each term and of forming complex content descriptions consisting of term phrases and larger syntactic units. A syntactic analysis scheme makes it possible to supply specific content identifications and avoids confusion between composite terms such as `blind Venetian' and `Venetian blind'.
(d) The use of semantic analysis systems in which the syntactic units are supplemented by semantic roles attached to the entities making up a given content description.
Semantic analysis systems utilize various kinds of knowledge extraneous to the documents, often specified by preconstructed `semantic graphs' and other related constructs.

The design of the original Smart system was then based on the premise that effective automatic indexing procedures could be built by incorporating into a content analysis system one or more of the foregoing language processing methods. Most of the required constructs, such as the hierarchical term arrangements and the syntactically analysed text excerpts, could be represented by abstract trees, and other constructs, such as semantic graphs and thesauri, are easily represented by graph structures. Well known automatic procedures were also available for traversing and manipulating tree and graph structures [5]. The original Smart system was then designed to process natural language texts using these complex data structures.

To validate the linguistic analysis procedures it was necessary to compare the search results obtained by using term hierarchies and thesauri with other, simpler systems based on the use of single, frequency-weighted terms extracted from the document texts. From the beginning, the Smart system thus contained an evaluation package based on the use of sample document and query collections and on the availability of full relevance assessments specifying the presumed relevance of each document with respect to each user query. This made it possible to compute, for each processed query, the recall and precision values, measuring respectively the proportion of relevant items retrieved and the proportion of retrieved items that are relevant.

The early tests in turn led to additional experiments and to the development of a full evaluation system for a large variety of search and retrieval procedures. These developments are described in more detail in the remainder of this study.
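
The kind of comparison described above, a single-term search set against a thesaurus-expanded search and scored by recall and precision, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy documents, the thesaurus class, the relevance assessments, and the function names `expand`, `retrieve` and `recall_precision` are hypothetical, and the simple term-overlap matching stands in for whatever matching function the Smart system actually used.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one query.

    recall    = proportion of relevant items that were retrieved
    precision = proportion of retrieved items that are relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def expand(terms, thesaurus):
    """Replace each term by its complete thesaurus class, as in item (b).

    Terms without a thesaurus entry are kept unchanged.
    """
    expanded = set()
    for term in terms:
        expanded |= thesaurus.get(term, {term})
    return expanded


def retrieve(query_terms, docs):
    """Retrieve every document sharing at least one term with the query.

    Assumption: plain term overlap, chosen only to keep the sketch short.
    """
    return {doc_id for doc_id, terms in docs.items() if terms & query_terms}


# Hypothetical sample collection: document id -> set of content terms.
docs = {
    "d1": {"retrieval", "evaluation"},
    "d2": {"search", "precision"},
    "d3": {"syntax", "parsing"},
    "d4": {"retrieval", "parsing"},
}

# Hypothetical thesaurus: both terms map to the same class of related terms.
related = {"retrieval", "search"}
thesaurus = {"retrieval": related, "search": related}

query = {"retrieval"}
relevant = {"d1", "d2"}  # assumed relevance assessments for this query

baseline_run = retrieve(query, docs)
expanded_run = retrieve(expand(query, thesaurus), docs)

baseline_scores = recall_precision(baseline_run, relevant)
expanded_scores = recall_precision(expanded_run, relevant)

print("single terms:", baseline_scores)
print("thesaurus   :", expanded_scores)
```

On this toy data the single-term run retrieves d1 and d4 (recall 0.5, precision 0.5), while the expanded run also retrieves d2 (recall 1.0, precision 2/3). The numbers say nothing about real collections; they only show how the two measures are computed from a retrieved set and a set of relevance assessments.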