Summarising Factors - Karen Sparck Jones
20 May 2004

This list of input, purpose and output factors is lifted from ms ksjnewwrite (6/1996) with modifications as per KSJ's paper in Mani and Maybury 2000 and some further improvements. The various factors are informally presented, but are nevertheless real. Note that the source properties under input form, and summary properties under output expression, are conjunctive: all the other factors are contrastive.

INPUT FACTORS

a) the FORM of the source
     e.g. academic article
          news message
          progress report
   this subsumes
     structure          e.g. header; amplification ... ,
                             introduction; background; objective; ...
     scale (or length)  e.g. book, article; 10K words, 1K words
     medium             e.g. English, Shipspeak
     genre              e.g. description (as in an encyclopedia), narrative

b) the SUBJECT TYPE of the source
     ordinary                      e.g. daily news
     specialised, alias technical  e.g. drug chemistry
     restricted, alias local       e.g. organisation plans

c) the UNITS taken as source
     single    e.g. scientific paper
     multiple  e.g. news stories
   (a single source may have several subunits e.g. book chapters; the distinction is whether the set of units to be summarised was intended to go together or consists of independent members)

PURPOSE FACTORS

a) the SITUATION, i.e. context, within which the summary is to be used
     tied      e.g. to company X's marketing drive for product P on date D
     floating  e.g. Computing Reviews - for anyone interested for any reason

b) the AUDIENCE for a summary, which can be characterised as
     untargeted  e.g. a mass market women's magazine's readers
     targeted    e.g. UK family court lawyers
   (the former covers a wide variety of skills, experiences and interests, the latter is much narrower)

c) the USE, or function, for which the summary is intended
     retrieval aid   e.g. Web engine results page snippet(s)
     preview device  e.g. report executive summary, submitted paper abstract
     refresher       e.g. old report resume
     alert           e.g. advertising blurb for novel

OUTPUT FACTORS

a) the MATERIAL of the summary, i.e. the information it gives, in relation to that in the source
     covering  i.e. all main concepts of source in summary
               e.g. aims, background, data, tests, results, conclusions of biology paper
     partial   i.e. some concepts (types) only
               e.g. experimental methods in biology paper

b) the FORMAT of the summary, i.e. the way the summary information is expressed
     explicitly structured presentational layout
                               e.g. headers and slots in course synopsis, boxed examples in textbook
     running text (primarily)  e.g. summary review of novel
     non-text                  e.g. graph, cartoon

c) the STYLE of the summary, i.e. the relationship to the content of the source
     informative  i.e. explicitly giving source content
                  e.g. restatement of facts given in an accident report
     indicative   i.e. indicating area of source content
                  e.g. declaring a biography is about some person
     critical     i.e. evaluating the content (perhaps other features) of the source
                  e.g. claiming a novel is well written trash
     aggregative  i.e. setting parts of the source material against one another
                  e.g. laying out arguments for and against a policy in a meeting record

d) the EXPRESSION of the summary, i.e. all the linguistic features of the summary
   this subsumes
     language  e.g. English, French
     register  e.g. technical jargon in legal abstract, popular writing for newspaper
     modality  e.g. narrative for newscast, description for weekly magazine

e) the BREVITY of the summary, i.e. relative or absolute scale (length) of the summary
     e.g. 5 percent of source, half-page, 100 words

(Where both source and summary are linguistic objects there are clearly common attributes e.g. medium or genre; however there is no requirement that values should be correlated e.g. English in, English out. Further, while features of the input may have to be explicitly captured for effective source interpretation, because summaries are new creations, explicit choices for every aspect of the summary have to be made. The output factors are therefore organised rather differently from the input ones, and I have also used different labels for comparable attributes (e.g. medium/language, genre/modality).)
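
As an illustration only, the factor list can be written down as a small data structure. The sketch below is one possible Python encoding and is not part of the original note: every class and field name is an assumption made for the example. Contrastive factors become enumerations holding just the values listed above (with no claim that the lists are exhaustive), and the two conjunctive factors (input form, output expression) become records whose properties are all filled in together.

```python
from dataclasses import dataclass
from enum import Enum

# Contrastive factors: one value is chosen from each (hypothetical names).
SubjectType = Enum("SubjectType", "ORDINARY SPECIALISED RESTRICTED")
Units = Enum("Units", "SINGLE MULTIPLE")
Situation = Enum("Situation", "TIED FLOATING")
Audience = Enum("Audience", "UNTARGETED TARGETED")
Use = Enum("Use", "RETRIEVAL_AID PREVIEW_DEVICE REFRESHER ALERT")
Material = Enum("Material", "COVERING PARTIAL")
Format = Enum("Format", "STRUCTURED_LAYOUT RUNNING_TEXT NON_TEXT")
Style = Enum("Style", "INFORMATIVE INDICATIVE CRITICAL AGGREGATIVE")


# Conjunctive factors: every property has a value at the same time.
@dataclass(frozen=True)
class SourceForm:
    structure: str
    scale: str
    medium: str
    genre: str


@dataclass(frozen=True)
class SummaryExpression:
    language: str
    register: str
    modality: str


@dataclass(frozen=True)
class InputFactors:
    form: SourceForm
    subject_type: SubjectType
    units: Units


@dataclass(frozen=True)
class PurposeFactors:
    situation: Situation
    audience: Audience
    use: Use


@dataclass(frozen=True)
class OutputFactors:
    material: Material
    summary_format: Format
    style: Style
    expression: SummaryExpression
    brevity: str  # relative or absolute, e.g. "5 percent of source"


@dataclass(frozen=True)
class SummarySpec:
    source: InputFactors
    purpose: PurposeFactors
    output: OutputFactors


# Illustrative values only, loosely combining examples from the list above.
spec = SummarySpec(
    source=InputFactors(
        form=SourceForm("header; amplification", "1K words", "English", "narrative"),
        subject_type=SubjectType.ORDINARY,
        units=Units.MULTIPLE,
    ),
    purpose=PurposeFactors(Situation.FLOATING, Audience.UNTARGETED, Use.RETRIEVAL_AID),
    output=OutputFactors(
        material=Material.PARTIAL,
        summary_format=Format.RUNNING_TEXT,
        style=Style.INDICATIVE,
        expression=SummaryExpression("English", "popular writing", "description"),
        brevity="100 words",
    ),
)
```

Keeping input and output in separate records reflects the closing remark above: source medium and summary language are distinct fields, so nothing forces their values to be correlated.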