IRE Information Retrieval Experiment An experiment: search strategy variations in SDI profiles chapter Lynn Evans Butterworth & Company Karen Sparck Jones All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, including photocopying and recording, without the written permission of the copyright holder, application for which should be addressed to the Publishers. Such written permission must also be obtained before any part of this publication is stored in a retrieval system of any nature.

(4) Co-ordinate matching of groups of terms without weights (CG) The profile search terms are divided into groups representing the various concepts in the query. The output is ranked, first in order of group co-ordination level (i.e. the number of matching terms where each is from a different profile group), and then in order of total number of matching terms.

(5) Group-weight cumulation (GWC) The profile term-groups of (4) are weighted according to the relative importance of the groups (concepts) to the query. The weights of all matching groups are summed to produce a document score. The output is ranked in order of document scores.

(6) Group-/term-weight cumulation (GTWC) The term-groups of (4) are weighted according to their importance to the query and the individual terms are weighted according to their importance within their group. The output is ranked, first by sum of matching-group weights, second by sum of highest-weighted matching terms from each group, and third by sum of all matching-term weights.

(7) Co-ordinate matching of groups of terms with weights (CGW) The profile term-groups of (4) and the individual terms are weighted. The output is ranked, first in order of group co-ordination level, second in order of sum of matching-group weights, and third in order of sum of matching-term weights.
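The ranking rules of procedures (4) to (7) can be read as multi-level sort keys. The following is a minimal sketch of that reading, not the software used in the experiment: the names `Group`, `match_stats`, `sort_key` and `rank`, the sample profile, its weights and the document term sets are all invented for illustration, and a group is assumed to "match" when at least one of its terms occurs in the document.

```python
from dataclasses import dataclass


@dataclass
class Group:
    """One concept of the query (hypothetical structure, not from the source)."""
    weight: int            # importance of the group (concept) to the query
    terms: dict[str, int]  # term -> importance of the term within its group


def match_stats(profile, doc_terms):
    """Return the five quantities that procedures (4) to (7) rank on."""
    level = group_w = best_w = all_w = n_terms = 0
    for group in profile:
        hits = [w for term, w in group.terms.items() if term in doc_terms]
        if not hits:
            continue              # this group does not match the document
        level += 1                # group co-ordination level
        group_w += group.weight   # sum of matching-group weights
        best_w += max(hits)       # highest-weighted matching term per group
        all_w += sum(hits)        # sum of all matching-term weights
        n_terms += len(hits)      # total number of matching terms
    return level, group_w, best_w, all_w, n_terms


def sort_key(strategy, stats):
    """Build the ranking key for one strategy, most significant part first."""
    level, group_w, best_w, all_w, n_terms = stats
    return {
        "CG": (level, n_terms),
        "GWC": (group_w,),
        "GTWC": (group_w, best_w, all_w),
        "CGW": (level, group_w, all_w),
    }[strategy]


def rank(strategy, profile, docs):
    """Return ids of matching documents, best first."""
    scored = []
    for doc_id, doc_terms in docs.items():
        stats = match_stats(profile, doc_terms)
        if stats[4] > 0:  # keep only documents with at least one matching term
            scored.append((sort_key(strategy, stats), doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Invented example data: three concept groups and five documents.
profile = [
    Group(weight=4, terms={"retrieval": 2, "search": 1}),
    Group(weight=2, terms={"profile": 2, "sdi": 1}),
    Group(weight=1, terms={"weight": 1}),
]
docs = {
    "d1": {"retrieval", "search", "profile"},
    "d2": {"search", "sdi", "weight"},
    "d3": {"retrieval", "search"},
    "d4": {"profile", "weight"},
    "d5": {"unrelated"},
}

for strategy in ("CG", "GWC", "GTWC", "CGW"):
    print(strategy, rank(strategy, profile, docs))
```

With this sample data the two co-ordination strategies place d4 (two matching groups) above d3 (one matching group), while the two cumulation strategies reverse that order because d3 matches the most heavily weighted group.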
(8) Boolean logic (B) The profile term-groups of (4) are governed by boolean statements which must be satisfied before any output is obtained. The output is in document number order, i.e. unranked.

(9) Boolean logic with weights (BW) The profile term-groups of (4) are governed by boolean statements which must be satisfied before any output is obtained. After the boolean equations are satisfied, the ranking of output may be based on group- and/or term-weight cumulation procedures. In our experiment only term weights were used.

Basically, procedures (1), (2) and (3) involve search profiles comprising a single list of terms (weighted or unweighted), while procedures (4) to (9) inclusive involve profiles comprising groups of terms (in which groups and/or terms may be weighted or unweighted).

In the weighted profile versions two types of weights were used. In procedures (2), (3), (5) and (9) above, the weights were subjectively assigned by the compiler, while in procedures (6) and (7) 'powers of 2' weighting was used⁹. In 'powers of 2' weighting the weights are assigned routinely once the order of importance of individual terms and/or term groups in the search profile has been intellectually decided. This ordering was again decided by the compiler. Profiles incorporating automatically assigned weights were not included in the study, mainly because the necessary statistics (term frequencies, etc.) were not immediately available. They were to become available subsequently from another INSPEC research project.

As the project proceeded, it was decided to include a further two types in addition to the nine search strategies listed above: