The main goal of the TREC Video Retrieval Evaluation (TRECVID) is to promote progress in content-based analysis of and retrieval from digital video via open, metrics-based evaluation. TRECVID is a laboratory-style evaluation that attempts to model real world situations or significant component tasks involved in such situations.
In 2006 TRECVID completed the second two-year cycle devoted to automatic segmentation, indexing, and content-based retrieval of digital video - broadcast news in English, Arabic, and Chinese. It also completed two years of pilot studies on exploitation of unedited video (rushes). Some 70 research groups have been provided with the TRECVID 2005-2006 broadcast news video, and many resources created by NIST and the TRECVID community are available for continued research on this data independent of TRECVID. See the "Past data" section of the TRECVID website for pointers.
In 2007 TRECVID began exploring new data (cultural, news magazine, documentary, and education programming) and an additional, new task - video rushes summarization. In 2008 that work continued with the exception that the shot boundary detection task was retired and copy detection and surveillance event detection were added.
TRECVID 2009 will test systems on the following tasks:
Changes to be noted by 2008 participants:
A number of datasets are available for use in TRECVID 2009. We describe them here and then indicate below which data will be used for development versus test for each task.
For the search and feature tasks:
The 100 hours used as test data for 2008 (tv8.sv.test) will be combined with 180 hours of new test data (tv9.sv.test) to create the 2009 test set for the search and feature tasks. This will allow us to retest for progress in detection of some of the 2008 features against some of the 2008 test data.
For the copy detection task:
C. Petersohn. "Fraunhofer HHI at TRECVID 2004: Shot Boundary Detection System", TREC Video Retrieval Evaluation Online Proceedings, TRECVID, 2004. URL: www-nlpir.nist.gov/projects/tvpubs/tvpapers04/fraunhofer.pdf
Code developed by Peter Wilkins and Kirk Zhang at Dublin City University will be used to format the reference. The method used in 2005/6/8, which will be repeated with the data for 2009, is described here.
Marijn Huijbregts, Roeland Ordelman, and Franciska de Jong, "Annotation of Heterogeneous Multimedia Content Using Automatic Speech Recognition," in Proceedings of SAMT, December 5-7, 2007, Genova, Italy.
In order to be eligible to receive the data, you must have applied for participation in TRECVID. Your application will be acknowledged by NIST with a team ID, an active participant's password, and information about how to obtain the data.
Then you will need to complete the relevant permission forms (from the active participant's area) and email a scanned image (jpg) of each page or a pdf file of the document to the address provided. In your email include the following:
As the subject: "TRECVID data request"
In the body:
your name
your team ID (given to you when you apply to participate)
the kinds of data you are requesting (BBC, S&V, and/or Gatwick)
If you cannot provide a jpg/pdf of the forms, please create a cover sheet (Attention: Lori Buckland) that identifies you, your team ID, your email address, and each kind of data (BBC, S&V, and/or Gatwick) you are requesting. Then fax all pages to the number provided in the US. If we are unable to read your jpg/pdf files, we will request a faxed copy of your forms.
Please ask only for the test data (and optional development data) required for the task(s) you apply to participate in and intend to complete. One permission form will cover 2007, 2008, and 2009 BBC data. One permission form will cover 2007, 2008, and 2009 Sound and Vision data. One permission form will cover the 2008 London Gatwick data to be used for development in 2009. The 2009 London Gatwick test data will be handled separately.
Within a few days after the permission forms have been received, you will be emailed the access codes you need to download the data using the information about data servers in the active participant's area.
The following tasks are proposed for 2009.
Further information about the tasks is available at the following web sites:
Various high-level semantic features (concepts) such as "Indoor/Outdoor", "People", "Speech", etc., occur frequently in video databases. The proposed task will contribute to work on a benchmark for evaluating the effectiveness of detection methods for semantic concepts.
The task is as follows: given the feature test collection, the common shot boundary reference for the feature extraction test collection, and the list of feature definitions, participants will return for each feature a list of at most 2000 shot IDs from the test collection, ranked according to the likelihood that the feature is present. Each feature is assumed to be binary, i.e., it is either present or absent in the given reference shot.
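The following is a minimal Python sketch of assembling such a ranked, capped list; the scoring function and shot IDs are illustrative assumptions, and the official submission format is defined by the DTDs NIST supplies.

    # Minimal sketch: rank test-collection shots by confidence that a feature is
    # present and keep at most 2000. score_shot() is a hypothetical detector score;
    # the official submission format is defined by the DTDs NIST supplies.
    MAX_RESULTS = 2000

    def rank_shots_for_feature(feature_id, shot_ids, score_shot):
        """Return shot IDs ranked by decreasing confidence that the feature is present."""
        scored = [(score_shot(feature_id, sid), sid) for sid in shot_ids]
        scored.sort(reverse=True)                        # highest confidence first
        return [sid for _, sid in scored[:MAX_RESULTS]]  # at most 2000 shots per feature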
All feature detection submissions will be made available to all participants for use in the search task - unless the submitter explicitly asks NIST before submission not to do this.
The descriptions are those used in the common annotation effort. They are meant for humans, e.g., assessors/annotators creating truth data and system developers attempting to automate feature detection. They are not meant to indicate how automatic detection should be achieved.
If the feature is true for some frame (sequence) within the shot, then it is true for the shot; and vice versa. This is a simplification adopted for the benefits it affords in pooling of results and approximating the basis for calculating recall.
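As a small illustration of this rule (the frame-level labels below are hypothetical):

    # Frame-to-shot propagation: the feature is true for the shot if it is true
    # for any frame (sequence) within the shot. Frame labels here are hypothetical.
    frame_labels = [False, False, True, False]  # per-frame presence of the feature
    shot_label = any(frame_labels)              # True, so the feature is true for the shot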
NOTE: In the following, "contains x" is short for "contains x to a degree sufficient for x to be recognizable as x to a human". This means among other things that unless explicitly stated, partial visibility or audibility may suffice.
NOTE: NIST will instruct the assessors during the manual evaluation of the feature task submissions as follows. The fact that a segment contains video of physical objects representing the topic target, such as photos, paintings, models, or toy versions of the topic target, will NOT be grounds for judging the feature to be true for the segment. Containing video of the target within video may be grounds for doing so.
In 2009, participants in the high-level feature task must submit results for 20 features.
For 2009, 10 features from those tested in 2008 will be combined with 10 new features selected during the month of March based on suggestions from participants in the 2009 high-level feature task. A feature must be moderately frequent (occurring more than 100 but fewer than 500 times in the development collection of ~200 hours), have a clear definition to support annotation and results assessment, and conceivably be of use in searching. We also want to avoid overlap with previously used topics/features.
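Purely as an illustration of the frequency criterion (the annotation counts are a hypothetical input), a candidate feature could be screened like this:

    # Illustrative screening of a candidate feature by development-set frequency.
    # The thresholds restate the guideline above: more than 100 but fewer than 500
    # occurrences in the ~200-hour development collection.
    MIN_OCCURRENCES = 100
    MAX_OCCURRENCES = 500

    def moderately_frequent(feature, annotation_counts):
        """True if the feature is neither too rare nor too frequent in the development data."""
        count = annotation_counts.get(feature, 0)
        return MIN_OCCURRENCES < count < MAX_OCCURRENCES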
Search is a high-level task which includes at least query-based retrieval and browsing. The search task models that of an intelligence analyst or analogous worker who is looking for segments of video containing persons, objects, events, locations, etc. of interest. These persons, objects, etc. may be peripheral or accidental to the original subject of the video.
The task is as follows: given the search test collection, a multimedia statement of information need (topic), and the common shot boundary reference for the search test collection, return a ranked list of at most N common reference shots from the test collection, which best satisfy the need. For the standard search task, N = 1000. For the high-precision search task N = 10.
NOTE: In the topics, "contains x" is short for "contains x to a degree sufficient for x to be recognizable as x to a human". This means among other things that unless explicitly stated, partial visibility or audibility may suffice.
NOTE: The fact that a segment contains video of physical objects representing the topic target, such as photos, paintings, models, or toy versions of the topic target, will NOT be grounds for judging the feature to be true for the segment. Containing video of the target within video may be grounds for doing so.
NOTE: When a topic expresses the need for x and y and ..., all of these (x and y and ...) must be perceivable simultaneously in one or more frames of a shot in order for the shot to be considered as meeting the need.
Please note the following restrictions for both forms of the search task:
As used here, a copy is a segment of video derived from another video, usually by means of various transformations such as addition, deletion, modification (of aspect, color, contrast, encoding, ...), camcording, etc. Detecting copies is important for copyright control, business intelligence and advertisement tracking, law enforcement investigations, etc. Content-based copy detection offers an alternative to watermarking. The TRECVID copy detection task will be based on the framework tested in TRECVID 2008, which used the CIVR 2007 Muscle benchmark.
Videos often contain audio. Sometimes the original audio is retained in the copied material; sometimes it is replaced by a new soundtrack. Nevertheless, audio is an important and strong feature for some application scenarios of video copy detection. Since detection of untransformed audio copies is relatively easy and the primary interest of the TRECVID community is in video analysis, it was decided to model the required copy detection tasks with video-only and video+audio queries. However, since audio is of importance for practical applications, there will be an additional optional task using transformed audio-only queries.
For 2009, we plan to require each group to submit at least two runs using video-only queries and two using audio+video queries; submissions using audio-only queries will be optional. For each kind of run, there will be two application profiles to consider: one will aim to reduce the false alarm rate to 0 and then optimize the probability of a miss and the speed; the second will set an equal cost for false alarms and misses. Thus a minimum of 4 runs will be required.
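One way to picture the difference between the two profiles is as different weightings of misses and false alarms when combining them into a single cost. The weights below are illustrative assumptions only; the official measure and its parameters are defined in the evaluation documentation.

    # Illustrative decision-cost comparison for the two application profiles.
    # The cost weights are assumptions for illustration, not the official parameters.
    def weighted_cost(p_miss, r_fa, c_miss, c_fa):
        """Combine a miss probability and a false-alarm rate into a single cost."""
        return c_miss * p_miss + c_fa * r_fa

    # Profile 1: false alarms are penalized very heavily, pushing systems toward
    # operating points with essentially no false alarms.
    profile1 = weighted_cost(p_miss=0.2, r_fa=0.001, c_miss=1, c_fa=1000)

    # Profile 2: misses and false alarms carry equal cost.
    profile2 = weighted_cost(p_miss=0.2, r_fa=0.001, c_miss=1, c_fa=1)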
The required system tasks will be as follows: given a test collection of videos and a set of queries, determine for each query the location, if any, at which some part of the query occurs, possibly transformed, in the test collection. The set of possible transformations will be based, to the extent possible, on actually occurring transformations.
Each query will be constructed using tools developed by IMEDIA to include some randomization at various decision points in the construction of the query set. Some manual procedures (e.g., for the camcording transformation) were used in 2008. The automatic tools developed by IMEDIA for TRECVID 2008 are available for download. For each video-only query, the tools will take a segment from the test collection, optionally transform it, embed it in some video segment which does not occur in the test collection, and then finally apply a video transformation to the entire query segment. Some queries may contain no test segment; others may be composed entirely of the test segment. Video transformations used in 2008 are documented in the general plan for query creation and in the final video transformations document with examples. For 2009 we will use a subset of the 2008 video transformations listed here:
We also noticed that in some 2008 queries where both "insertion of pattern" and "text insertion" were selected, the video became extremely crowded, so in 2009 we will try either to decrease the font size of the inserted text or to choose relatively small patterns.
The audio-only queries will be generated along the same lines as the video-only queries: a set of base audio-only queries is transformed by several techniques intended to be typical of those that would occur in real reuse scenarios: (1) bandwidth limitation, (2) other coding-related distortion (e.g., subband quantization noise), and (3) variable mixing with unrelated audio content. The audio transformations used in 2009 are documented here. The transformed queries will be downloadable from NIST.
The audio+video queries will consist of the aligned versions of transformed audio and video queries, i.e., they will be various combinations of transformed audio and transformed video from a given base audio+video query. In this way sites can study the effectiveness of their systems for individual audio and video transformations and their combinations. These queries will not be downloadable. Rather, NIST will provide a list of how to construct each audio+video test query so that, given the audio-only queries and the video-only queries, sites can use a tool such as ffmpeg to construct the audio+video queries.
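As a rough sketch of how that recombination might be scripted (the file names are hypothetical, and the exact ffmpeg options needed for the distributed formats may differ), the video stream of one query can be combined with the audio stream of another:

    # Rough sketch: combine a transformed video-only query with a transformed
    # audio-only query into one audio+video query using ffmpeg.
    # File names are hypothetical; codec choices are left to ffmpeg's defaults
    # for the output container and may need adjusting for the actual formats.
    import subprocess

    def mux_query(video_query, audio_query, output_file):
        """Take the video stream from one file and the audio stream from another."""
        subprocess.run([
            "ffmpeg",
            "-i", video_query,  # transformed video-only query
            "-i", audio_query,  # transformed audio-only query
            "-map", "0:v",      # keep the video stream from the first input
            "-map", "1:a",      # keep the audio stream from the second input
            output_file,
        ], check=True)

    mux_query("video_query_0042.mpg", "audio_query_0042.wav", "av_query_0042.mpg")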
Please note: Only submissions which are valid when checked against the supplied DTDs will be accepted. You must check your submission before submitting it. NIST reserves the right to reject any submission which does not parse correctly against the provided DTD(s). Various checkers exist, e.g., Xerces-J: java sax.SAXCount -v YourSubmission.xml.
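As an alternative local check (a sketch only; the DTD and submission file names below are hypothetical), the same validation can be done with a library such as lxml in Python:

    # Sketch of a local pre-submission check with lxml, as an alternative to Xerces-J.
    # The DTD and submission file names are hypothetical; use the DTD supplied by NIST.
    from lxml import etree

    dtd = etree.DTD(open("featureExtractionResults.dtd"))  # hypothetical DTD file name
    tree = etree.parse("YourSubmission.xml")

    if dtd.validate(tree):
        print("Submission parses correctly against the DTD.")
    else:
        print(dtd.error_log.filter_from_errors())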
The results of the evaluation will be made available to attendees at the TRECVID workshop and will be published in the final proceedings and/or on the TRECVID website within six months after the workshop. All submissions will likewise be available to interested researchers via the TRECVID website within six months of the workshop.
Information on submissions may be found here.
Information about the evaluation and the measures is available.
We will also evaluate the 10 features from 2008 against the 100 hours of 2008 test data. Those results can be compared to the results for those features in 2008.
Each interactive run will contain one result for each and every topic using the system variant for that run. Each result for a topic can come from only one searcher, but the same searcher does not need to be used for all topics in a run. If a site has more than one searcher's result for a given topic and system variant, it will be up to the site to determine which searcher's result is included in the submitted result. NIST will try to make provision for the evaluation of supplemental results, i.e., ones NOT chosen for the submission described above. Details on this will be available by the time the topics are released.
The measures will be the same for both forms of the search task except that average precision will be calculated using the lesser of the number of known relevant shots and the size of the result set (10), which in the high-precision task is expected, for almost all topics, to be the size of the result set. See Webber et al. for a discussion of using average precision versus precision at 10 (P@10).
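The following Python sketch shows that calculation for the high-precision case; the shot IDs and relevance judgments are made up for illustration.

    # Sketch of average precision where the normalizer is the lesser of the number
    # of known relevant shots and the result-set size (10 for the high-precision task).
    def average_precision(ranked_shots, relevant, max_results=10):
        """Average precision over a result list truncated to max_results."""
        hits = 0
        precision_sum = 0.0
        for rank, shot in enumerate(ranked_shots[:max_results], start=1):
            if shot in relevant:
                hits += 1
                precision_sum += hits / rank
        denom = min(len(relevant), max_results)  # lesser of known relevant and result-set size
        return precision_sum / denom if denom else 0.0

    # Hypothetical example: two of the ten returned shots are among three known relevant shots.
    known_relevant = {"shot1_3", "shot5_12", "shot7_2"}
    run = ["shot1_3", "shot2_9", "shot5_12", "shot4_1", "shot6_6",
           "shot8_8", "shot9_2", "shot3_3", "shot2_2", "shot7_9"]
    print(average_precision(run, known_relevant))  # (1/1 + 2/3) / 3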
The following are the target dates for 2009:
Here is a list of work items that must be completed before the guidelines are considered final.
Once subscribed, you can post to this list by sending your thoughts as email to tv9list@nist.gov, where they will be sent out to EVERYONE subscribed to the list, i.e., all the other active participants.