Evaluation
The data for each topic/system/size (T/S/S) combination end with a set of triples (x, y, z), in which y seems to represent the degree to which the unit for that triple "covers" the model concept to which it is most closely aligned.
For each T/S/S we compute the average <y> over all triples.
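The averaging step can be sketched as follows. This is a minimal illustration, not the code actually used: the dictionary layout, the key format (topic, system, size), the name `average_y`, and all numeric values are assumptions made up for the example.

```python
from statistics import mean

# Hypothetical input: each T/S/S combination maps to its list of (x, y, z) triples.
# Keys and values are invented for illustration only.
triples_by_combo = {
    ("topic1", "systemA", 100): [(1, 0.5, 3), (2, 1.0, 4)],
    ("topic1", "systemB", 100): [(1, 0.25, 2), (2, 0.75, 5), (3, 0.5, 1)],
}

def average_y(triples):
    """Return <y>, the mean of the second component over all triples.

    Raises statistics.StatisticsError if the list of triples is empty.
    """
    return mean(y for _, y, _ in triples)

# One <y> value per T/S/S combination.
averages = {combo: average_y(triples) for combo, triples in triples_by_combo.items()}

for combo, avg in averages.items():
    print(combo, avg)
```

Keying on the full (topic, system, size) tuple keeps one <y> per combination, so the values can later be compared across systems or sizes for a fixed topic.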