IRE
Information Retrieval Experiment
Simulation, and simulation experiments
chapter
Michael D. Heine
Butterworth & Company
Karen Sparck Jones
Introduction
`simulation experiment'. All are processes that we recognize to be mixed
cognitive/behavioural. They all help us `get on better' with the World in
which we find ourselves by acquiring information for us, i.e. they alter the
mimetic structures that govern our individual and social responses to the
constraints and opportunities offered by other systems. However, investigation
and experiment (treating these as similar though distinct processes) have
two features that simulation does not: an experimental apparatus is needed
for them to be implemented (even though, for investigation of document/user
interaction, say, the apparatus may just be a file of records), and no
suppositions are made as to how the information acquired is generated (one
can, for example, experimentally measure the acceleration `due to gravity'
without knowing how the system of primitive entities determining the
acceleration is affecting the apparatus and so determining the data).
Simulation, on the other hand, does not require an apparatus, and does
concern itself with how information (data) is generated by a system.
Suppositions are made about the entities making up the system (though the
entities are regarded as `constructions' rather than `descriptions', so that
suppositions is not quite the right term), about the relationships between
entities, and about the effects of system input upon entities and (possibly)
relationships. This definitional structure is then used to predict the output or
outputs of the system, which can be described as information or data. Thus,
unlike in experiment and investigation, the determining entities in simulation
work are not treated as primitive ones but as objects of study. That is
essentially the strength and weakness of simulation work, as it is of all
`theoretical' study: the objects of interest and experimental study are made
explicit but they remain constructs. This may seem a trivial or fine point,
since in practice when simulation is applied at the human level (queues for
tickets, say) or in an area of technology where `laws' are well established, all
entities of relevance seem clearly evident, and some of them are under our
control (e.g. the number of serving booths) or can at least be influenced by us.
But in view of the unclear foundations of information science, it seems
essential to emphasize that the outcomes of simulation work, since they are
based on a human construct, can never `surprise' us (and so inform us) as
much as experimental results can. We can be a bit surprised by the results of
a simulation study (e.g. in regard to a pattern of symmetry or an instance of
invariance that we missed in an experiment) but never very surprised, since
the simulation explores a structure that we ourselves created: the results are,
in that sense, tautological or necessary. Just as mathematics as an edifice of
thought is inviolate and `safe' (that's what it's there for), so is a simulation
study. Both lack the open (i.e. receptive to amendment) syntax of science, a
syntax which encourages information feedback that modifies and invigorates
its structure when that information is obtained from experimentation.
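The ticket-queue case mentioned above can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than part of the text: the function names `simulate_ticket_queue` and `booth_sweep`, the exponential inter-arrival and service times, the first-come-first-served discipline, and the particular rates. The entities (customers, booths), their relationships (each customer takes the booth that becomes free earliest) and the system input (the arrival stream) are all stipulated by the modeller, and the output (mean waiting time) follows from them.

```python
import heapq
import random


def simulate_ticket_queue(num_booths, arrival_rate, service_rate,
                          num_customers, seed=0):
    """Return the mean waiting time in a first-come-first-served queue.

    Assumed for illustration only: exponential inter-arrival times
    (rate `arrival_rate`), exponential service times (rate
    `service_rate`), and `num_booths` identical serving booths.
    """
    rng = random.Random(seed)
    # Each heap entry is the time at which one booth next becomes free.
    booth_free_at = [0.0] * num_booths
    heapq.heapify(booth_free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(num_customers):
        clock += rng.expovariate(arrival_rate)         # next arrival time
        earliest_free = heapq.heappop(booth_free_at)   # first booth to free up
        start = max(clock, earliest_free)              # service cannot start early
        total_wait += start - clock
        heapq.heappush(booth_free_at,
                       start + rng.expovariate(service_rate))
    return total_wait / num_customers


def booth_sweep(booth_counts, arrival_rate=0.9, service_rate=1.0,
                num_customers=5000, seed=1):
    """Run the model once per booth count and collect the mean waits."""
    return {
        booths: simulate_ticket_queue(booths, arrival_rate, service_rate,
                                      num_customers, seed)
        for booths in booth_counts
    }


if __name__ == "__main__":
    for booths, mean_wait in booth_sweep((1, 2, 3)).items():
        print(booths, round(mean_wait, 3))
```

Calling the function twice with the same seed and parameters returns exactly the same figure, which is one modest way of seeing the sense in which the results are necessary consequences of the construct. The number of booths is the quantity `under our control', and `booth_sweep` simply varies it while holding everything else fixed.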
Returning to terminology now, we define a simulation study as a `simulation
experiment' when the system's components (e.g. the parameters of probability
distributions) are given certain values, or are explored in a certain order, and
the consequences are noted. In regard to Cranfield-type experimentation
in information retrieval, it is a moot point whether some of such
work should be described as `experiment' or `simulation', simply because it
is not natural phenomena but man-created phenomena that are being
explored. If an information retrieval experiment examines the effect of