Each participating group is responsible for adhering to the letter and spirit of these rules, the intent of which is to make the TRECVID evaluation realistic, fair, and maximally informative about system effectiveness as opposed to other confounding effects on performance. Submissions that, in the judgment of the coordinators and NIST, do not comply will not be accepted.
The test data cannot be used for system development and system developers should have no knowledge of it until after they have submitted their results for evaluation to NIST. Depending on the size of the team and tasks undertaken, this may mean isolating certain team members from certain information or operations, freezing system development early, etc.
Participants may use donated semantic indexing output from the test collection but incorporation of such features should be automatic so that system development is not affected by knowledge of the extracted features. Anyone doing searches must be isolated from knowledge of that output.
The development data is intended for the participants' use in developing their systems. It is up to the participants how the development data is used, e.g., divided into training and validation data, etc.
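As one illustration of such a division, the following minimal Python sketch splits a list of development items into disjoint training and validation sets with a seeded shuffle. The function name, the 20% validation fraction, and the item identifiers are assumptions made for the example and are not prescribed by these guidelines; only development data is involved.

```python
import random


def split_development_data(item_ids, validation_fraction=0.2, seed=0):
    """Split development items into disjoint (training, validation) lists.

    The fraction and seed are illustrative defaults, not required values.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    # Sort first so the split does not depend on the order of the input.
    shuffled = sorted(item_ids)
    # A private, seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]


# Hypothetical identifiers standing in for development-collection videos.
dev_ids = [f"dev_video_{i:03d}" for i in range(10)]
train_ids, validation_ids = split_development_data(dev_ids)
```

Fixing the seed keeps the split stable across runs, so validation results remain comparable while a system is being developed.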
Participants may use other development resources not excluded in these guidelines. Such resources must be reported at the workshop. Note that use of other resources will change the submission's status with respect to system development type, which is described next.
In order to help isolate system development as a factor in system performance, each semantic indexing task submission or donation of extracted features must declare its training type:
As the name "no annotation" indicates, for categories E and F no manual annotation should be done on the automatically collected data; automatic processing is allowed and encouraged, but the data should be processed blindly.
Please note a change to a stricter interpretation of the following categories: all data used for training at any level of any system component should be considered. This means that even the use of something like a face detector that was trained on non-IACC training data would disqualify the run from type "A". This implies that some systems accepted in category A in previous years will be placed in categories B, C, or D under the new, stricter rules.
While the categories will be interpreted more strictly than in previous years, they will be used only as information to clarify what participants have done. They will not be used to present the results in separate tables and figures; there will be a single global ranking and a single global plot covering all systems, which will simply be tagged (as previously) with the category as part of the run name generated at NIST.
We encourage groups to submit at least one pair of runs from their allowable total that helps the community understand how well systems trained on non-IACC data generalize to IACC test data.