TRECVID Data Availability

Various types of data have been involved in the TRECVID workshops and the availability of these data sets varies by type and year as listed below.


Evaluated system submissions

  • These are available in the protected Past Results section of the TREC website under "TREC 20XX" (e.g., TREC 2018, TREC 2017). Access to this area can be requested by contacting the TREC Program Manager.

TRECVID 2023

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is publicly available:

  • Internet Archive videos (IACC.3) under Creative Commons licenses (used as development data since 2019). The collection.xml file (loads slowly) contains URLs for the Internet Archive videos for use in TRECVID 2016-2018. The comment at the top of the collection file explains how to construct the URLs for downloading the videos.
  • master shot boundary reference for Ad-hoc IACC.3 test data
  • ASR output on IACC.3 videos

  • The Vimeo Creative Commons (V3C1) dataset under Creative Commons licenses (used as testing data from 2019 to 2021).
    In total, there are 7,475 Vimeo videos (1.3 TB, 1000 h) with a mean duration of 8 min. All videos have some metadata available
    (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,082,657 short video segments
    according to the provided master shot boundary files. In addition, keyframes and thumbnails for each video segment have been extracted and are available.
    To download the data as segmented videos, please submit a signed data agreement to Angela Ellis. To download the raw whole videos from ITEC University servers, please say so in the email submitting the form.
  • The master shot reference for the V3C1 dataset can be downloaded from here with a readme file

  • The Vimeo Creative Commons (V3C2) dataset under Creative Commons licenses (used as testing data starting in 2023).
    In total, there are 9,760 Vimeo videos (1.6 TB, 1300 h) with a mean duration of 8 min. All videos have some metadata available
    (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,425,454 short video segments
    according to the provided master shot boundary files. In addition, keyframes and thumbnails for each video segment have been extracted and are available.
    To download the data as segmented videos, please submit a signed data agreement to Angela Ellis. To download the raw whole videos from ITEC University servers, please say so in the email submitting the form.
  • The master shot reference for the V3C2 dataset can be downloaded from here with a readme file
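
As a rough sanity check on the V3C1 and V3C2 figures quoted above, the following Python sketch derives the mean video length, the number of segments per video, and the mean segment length from the stated totals. The function and variable names are illustrative only, and because the hour totals are rounded the results are approximations rather than official statistics.

```python
# Figures copied from the V3C1 and V3C2 descriptions; the hour totals are rounded.
DATASETS = {
    "V3C1": {"videos": 7475, "hours": 1000, "segments": 1_082_657},
    "V3C2": {"videos": 9760, "hours": 1300, "segments": 1_425_454},
}


def summarize(stats):
    """Return (mean video length in min, segments per video, mean segment length in s)."""
    mean_video_min = stats["hours"] * 60 / stats["videos"]
    segments_per_video = stats["segments"] / stats["videos"]
    mean_segment_s = stats["hours"] * 3600 / stats["segments"]
    return mean_video_min, segments_per_video, mean_segment_s


if __name__ == "__main__":
    for name, stats in DATASETS.items():
        video_min, per_video, segment_s = summarize(stats)
        print(f"{name}: {video_min:.1f} min/video, "
              f"{per_video:.0f} segments/video, {segment_s:.1f} s/segment")
```

For both collections the mean video length comes out at about 8 minutes, which agrees with the stated mean duration, and the average segment is only a few seconds long.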

  • The Kinolorber movies dataset (MCOLL) is composed of 10 licensed movies used in the Deep Video Understanding (DVU) task. The data is available after submitting a data agreement form found HERE.
  • The Deep Video Understanding Movies Training Dataset is a set of 14 Creative Commons (CC) movies (total duration of 17.5 hr) previously used in the 2020 to 2023 ACM Multimedia DVU Grand Challenges and in TRECVID DVU, together with their movie-level and scene-level annotations. The movies were collected from public websites such as Vimeo and the Internet Archive. In total, the 14 movies consist of 621 scenes, 1572 entities, 650 relationships, and 2491 interactions. The development dataset can be accessed from this URL. Please consult the readme files in the included documentation folder, which explain how the dataset is organized.

  • The Video to Text Description (VTT) main task and robustness sub-task testing datasets (V3C3 video shots) are available here

  • Ground truth data created at/for NIST

TRECVID 2022

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is publicly available:

  • Internet Archive videos (IACC.3) under Creative Commons licenses (used as development data since 2019). The collection.xml file (loads slowly) contains URLs for the Internet Archive videos for use in TRECVID 2016-2018. The comment at the top of the collection file explains how to construct the URLs for downloading the videos.
  • master shot boundary reference for Ad-hoc IACC.3 test data
  • ASR output on IACC.3 videos

  • The Vimeo Creative Commons (V3C1) dataset under Creative Commons licenses (used as testing data from 2019 to 2021).
    In total, there are 7,475 Vimeo videos (1.3 TB, 1000 h) with a mean duration of 8 min. All videos have some metadata available
    (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,082,657 short video segments
    according to the provided master shot boundary files. In addition, keyframes and thumbnails for each video segment have been extracted and are available.
    To download the data as segmented videos, please submit a signed data agreement to Angela Ellis. To download the raw whole videos from ITEC University servers, please say so in the email submitting the form.
  • The master shot reference for the V3C1 dataset can be downloaded from here with a readme file

  • The Vimeo Creative Commons (V3C2) dataset under Creative Commons licenses (used as testing data starting in 2022).
    In total, there are 9,760 Vimeo videos (1.6 TB, 1300 h) with a mean duration of 8 min. All videos have some metadata available
    (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,425,454 short video segments
    according to the provided master shot boundary files. In addition, keyframes and thumbnails for each video segment have been extracted and are available.
    To download the data as segmented videos, please submit a signed data agreement to Angela Ellis. To download the raw whole videos from ITEC University servers, please say so in the email submitting the form.
  • The master shot reference for the V3C2 dataset can be downloaded from here with a readme file

  • The Kinolorber movies dataset (MCOLL) is composed of 10 licensed movies and has been used in the Movie Summarization (MSUM) task and the Deep Video Understanding (DVU) task. The data is available after submitting a data agreement form found HERE. The MSUM testing queries can be downloaded from here.
  • The Deep Video Understanding Movies Training Dataset is a set of 14 Creative Commons (CC) movies (total duration of 17.5 hr) previously used in the 2020 and 2021 ACM Multimedia DVU Grand Challenges, together with their movie-level and scene-level annotations. The movies were collected from public websites such as Vimeo and the Internet Archive. In total, the 14 movies consist of 621 scenes, 1572 entities, 650 relationships, and 2491 interactions. The development dataset can be accessed from this URL. Please consult the readme files in the included documentation folder, which explain how the dataset is organized.

  • The Video to Text Description (VTT) testing data (V3C1 video shots) is available here
  • The 6 hr of testing video shots for the Disaster Scene Description and Indexing (DSDI) task are available for download. The shots were collected from various natural disaster events in the USA through FEMA (Federal Emergency Management Agency).

  • Ground truth data created at/for NIST

TRECVID 2021

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is publicly available:

TRECVID 2020

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is publicly available:

TRECVID 2019

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:

TRECVID 2018

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection (through Linguistic Data Consortium (LDC))

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:

TRECVID 2017

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2016

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2015

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2014

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2013

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2007-9 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2012

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2009 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2011

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2009 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2010

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2009 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection
  • HAVIC dataset for multimedia event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2009

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2009 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2009

The following data is only available from NIST for current TRECVID participants:

  • BBC rushes video files (MPEG-1)
  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2008

The following data is available for multimedia research only, under restrictions agreed to with the Netherlands Institute for Sound and Vision. The data can be requested by filling out and returning the forms available here.

  • TRECVID 2008 Sound and Vision video files (MPEG-1)
  • LIG keyframes distributed in 2008

The following data is available from NIST for current TRECVID participants:

  • BBC rushes video files (MPEG-1)
  • Development dataset annotations for surveillance event detection
  • Test dataset annotations for surveillance event detection

The following data is available to non-TRECVID participants as noted below:

  • London Gatwick surveillance video files
  • The i-LIDS Multiple Camera Tracking Scenario Training data set is the original source of the evaluation set for TRECVID SED 2009, 2010, and 2011. It is available for non-TRECVID participants through the standard i-LIDS licensing process

The following data is publicly available:


TRECVID 2004

All or part of the following data may be available for purchase from the Linguistic Data Consortium. Check the LDC catalog. NIST is not able to make the full video available.

  • video files (MPEG-1)
  • master keyframes selected by CLIPS-IMAG
  • ASR output created by LIMSI
  • truth data created at LDC
    • story boundary annotation

The following data is publicly available:


TRECVID 2003

All or part of the following data may be available for purchase from the Linguistic Data Consortium. Check the LDC catalog. NIST is not able to make the full video available.

The following data is publicly available:


TREC 2002 Video Track

TREC 2001 Video Track

Digital Video Retrieval at NIST
