
Video Data

A number of datasets are available for use in TRECVID 2024 and are described below.

  • Once you know which tasks you will be participating in, you can determine which data sets you need.
  • Then for each needed dataset, see below for information on how you get permission to use the data and how it will be distributed.
  • Please request only the test data (and optional development data) required for the task(s) you have applied to participate in and intend to complete.

Vimeo Creative Commons Collection 1 (V3C1)

    This dataset serves as the training dataset for the Ad-hoc Video Search (AVS) task. The V3C1 dataset (drawn from the larger V3C video dataset) is composed of 7475 Vimeo videos (1.3 TB, 1000 h) with Creative Commons licenses and a mean duration of 8 min. All videos have some metadata available (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,082,659 short video segments according to the provided master shot boundary files. In addition, keyframes and thumbnails have been extracted for each video segment and are available.

  • The master shot reference for the V3C1 dataset can be downloaded from here, along with a readme file.
  • Speech transcripts for the V3C1 dataset, generated using OpenAI's Whisper, are also available online from this Zenodo repository.
  • An analysis of the dataset characteristics is available from our external resources page.
  • Data use agreements and Distribution: See Data use agreements for download instructions for active participants from NIST/mirror servers and from ITEC university.

    The raw V3C1 dataset, including metadata, will be available for download from the servers of ITEC (Institute of Information Technology), while segmented shots from the raw videos will be available for download from NIST. Information about downloading V3C1 from ITEC can be obtained from the active participants' tv24 data servers file.

Vimeo Creative Commons Collection 2 (V3C2)

    This dataset serves as the testing dataset for the Ad-hoc Video Search (AVS) task. The V3C2 dataset (drawn from the larger V3C video dataset) is composed of 9760 Vimeo videos (1.6 TB, 1300 h) with Creative Commons licenses and a mean duration of 8 min. All videos have some metadata available (e.g., title, keywords, and description) in JSON files. The dataset has been segmented into 1,425,454 short video segments according to the provided master shot boundary files. In addition, keyframes and thumbnails have been extracted for each video segment and are available.

  • The master shot reference for the V3C2 dataset can be downloaded from here, along with a readme file.
  • Speech transcripts for the V3C dataset (including V3C1, V3C2, and V3C3), generated using OpenAI's Whisper, are also available online from this Zenodo repository.
  • An analysis of the dataset characteristics is available from our external resources page.

    Data use agreements and Distribution: See Data use agreements for download instructions for active participants from NIST/mirror servers and from ITEC university.

    The raw V3C2 dataset, including metadata, is available for download from the servers of ITEC (Institute of Information Technology), while segmented shots from the raw videos will be available for download from NIST. Information about downloading V3C2 from ITEC can be obtained from the active participants' tv24 data servers file.
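
    The V3C1 and V3C2 figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, not an official tool: the `Collection` class and its method names are illustrative, and because the hour totals are rounded on this page, the derived values are approximate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Collection:
    """Headline figures for one collection, as quoted on this page."""
    name: str
    videos: int
    hours: float      # rounded total duration
    segments: int     # number of master-shot segments

    def mean_video_minutes(self) -> float:
        # total minutes divided by number of videos
        return self.hours * 60 / self.videos

    def mean_segment_seconds(self) -> float:
        # total seconds divided by number of segments
        return self.hours * 3600 / self.segments


V3C1 = Collection("V3C1", videos=7475, hours=1000, segments=1_082_659)
V3C2 = Collection("V3C2", videos=9760, hours=1300, segments=1_425_454)

for c in (V3C1, V3C2):
    print(f"{c.name}: {c.mean_video_minutes():.1f} min per video, "
          f"{c.mean_segment_seconds():.1f} s per segment")
```

    Both collections come out at roughly 8 minutes per video, consistent with the stated mean duration, and at a little over 3 seconds per segment on average.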

TV_VTT Training dataset

    This dataset serves as the training dataset for the Video-to-Text (VTT) task. It contains short videos (ranging from 3 to 10 seconds) from the TRECVID VTT task from 2016 to 2022. There are 12,870 videos with captions. Each video has between 2 and 5 captions, written by dedicated annotators. The dataset is available from here after submitting the data agreement form (see below).

    Data use agreements and Distribution: See Data use agreements for download instructions for active participants from NIST.

Vimeo Creative Commons Collection 3 (V3C3) -- VTT Task specific

    NIST will be using a small subset of videos from V3C3 for the VTT task testing dataset. Please consult the general schedule for data release and submission of results dates.

    Data use agreements and Distribution: See Data use agreements for download instructions for active participants from NIST servers.

MedVidQA Collections

    The VCVAL (Video Corpus Visual Answer Localization) task is supported by the MedVidQA collections training dataset, which consists of 3,010 human-annotated instructional questions and visual answers from 900 health-related videos. In addition, an automatically created HealthVidQA dataset consists of approximately 50,000 instructional questions and visual answers from 15,000 health-related videos. A validation dataset consisting of 50 questions and their answer timestamps, created from 25 medical instructional videos, will also be available. Finally, the testing dataset will contain 50 questions and their answer timestamps created from 25 medical instructional videos.

    The MIQG (Medical Instructional Question Generation) task is supported by a training dataset consisting of 2,710 questions and visual segments, formulated from 800 medical instructional videos from the MedVidQA collections. The provided validation dataset will contain 145 questions and answer timestamps created from 49 medical instructional videos, while the test dataset will contain 100 questions and answer timestamps created from 45 medical instructional videos.

    For data download instructions, please refer to the task guidelines page HERE. Training, validation, and testing data and topics will be available according to the published schedule.

CCU Videos

    The CCU task will use video datasets of people conversing with each other in Mandarin Chinese. A development set of about 1200 recordings of various durations will be used, with only about 5 minutes of annotation available for each recording. A pilot evaluation (aka dry run) will be conducted, in which about 1200 recordings will be given to participants for additional development before the formal evaluation. During the formal evaluation, a set of about 3000 recordings will be employed as the testing dataset.

Additional data

IACC.3

    The IACC.3 dataset consists of approximately 4600 Internet Archive videos (144 GB, 600 h) with Creative Commons licenses in MPEG-4/H.264 format, with durations ranging from 6.5 min to 9.5 min and a mean duration of almost 7.8 min. Most videos have some donor-provided metadata available, e.g., title, keywords, and description.

    Data use agreements and Distribution: Download for active participants from NIST/mirror servers. See Data use agreements

    Master shot reference, Automatic speech recognition (for English), and ground truth (used between 2016-2017): Available by download from the TRECVID Past Data page

IACC.2.A-C

    Three datasets (A, B, C) totaling approximately 7300 Internet Archive videos (144 GB, 600 h) with Creative Commons licenses in MPEG-4/H.264 format, with durations ranging from 10 s to 6.4 min and a mean duration of almost 5 min. Most videos have some donor-provided metadata available, e.g., title, keywords, and description.

    NOTE: Be sure to reload the relevant collection.xml files (A, B, C) in the master shot reference and remove files with a "use" attribute set to "dropped" - these are no longer available under a Creative Commons license and are not part of the test collection.
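
    The filtering step described in the note above can be sketched as follows. This is a minimal illustration under stated assumptions: the sample XML, the `VideoFile` element name, and the `filename` attribute are hypothetical stand-ins, and the real collection.xml files may be laid out differently (for example, with `use` stored as a child element), in which case the lookup needs adjusting.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real collection.xml layout may differ.
SAMPLE = """<VideoFileList>
  <VideoFile use="test" filename="a.mp4"/>
  <VideoFile use="dropped" filename="b.mp4"/>
  <VideoFile use="test" filename="c.mp4"/>
</VideoFileList>"""


def active_entries(xml_text: str) -> list[ET.Element]:
    """Return every element that carries a 'use' attribute
    whose value is anything other than 'dropped'."""
    root = ET.fromstring(xml_text)
    return [
        elem for elem in root.iter()
        if "use" in elem.attrib and elem.get("use") != "dropped"
    ]


kept = [elem.get("filename") for elem in active_entries(SAMPLE)]
print(kept)
```

    With the sample above, only the two entries not marked as dropped remain. For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring`.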

    Data use agreements and Distribution: Download for active participants from NIST/mirror servers. See Data use agreements

    Master shot reference, Automatic speech recognition (for English), and ground truth (used between 2013-2015): Available by download from the TRECVID Past Data page

IACC.1.A-C

    Three datasets (A, B, C) totaling approximately 8000 Internet Archive videos (160 GB, 600 h) with Creative Commons licenses in MPEG-4/H.264 format, with durations between 10 s and 3.5 min. Most videos have some donor-provided metadata available, e.g., title, keywords, and description.

    Data use agreements and Distribution: Available by download from the Internet Archive. See TRECVID Past Data page. Or download from the copy on the Dublin City University server, but use the collection.xml files (see TRECVID past data page) for instructions on how to check the current availability of each file.

    Master shot reference, Automatic speech recognition (for English), and ground truth (used between 2010-2012): Available by download from the TRECVID Past Data page

IACC.1.tv10.training

    Approximately 3200 Internet Archive videos (50 GB, 200 h) with Creative Commons licenses in MPEG-4/H.264 format, with durations between 3.6 and 4.1 min. Most videos have some donor-provided metadata available, e.g., title, keywords, and description.

    Data use agreements and Distribution: Available by download from the Internet Archive. See TRECVID Past Data page. Or download from the copy (see tv2010 directory) on the Dublin City University server, but use the collection.xml files (see TRECVID past data page) for instructions on how to check the current availability of each file.

    Master shot reference: Available by download from the TRECVID Past Data page

    Common feature annotation: Available by download from the TRECVID Past Data page

    Automatic speech recognition (for English): Available by download from the TRECVID Past Data page


Data use agreements handled by NIST

    In order to be eligible to receive the data, you must have applied for participation in TRECVID. Your application will be acknowledged by NIST with a team ID, active participant's password, and information about how to obtain the data.

  • If you will be using IACC.1 video, the data use agreements are available from the "Past data" webpage. You will download the data from the Dublin City University server (see above) or the Internet Archive. See the "Data use agreements and Distribution" entry for IACC.1.

  • If you need a copy of the IACC.2, IACC.3, V3C1, or V3C2 data, you will need to complete the relevant permission forms (from the active participant's area; select the correct dataset name folder) and email the scanned page images for each form as one Adobe Acrobat PDF document to Angela Ellis.

  • If you need access to the V3C data for VTT (TV_VTT for training or V3C3 for testing), you will need to complete the relevant permission form (from the active participant's area) and email the scanned page as an Adobe Acrobat PDF to George Awad. NOTE: If you have already submitted this V3C data agreement form, you do not need to submit another.

  • Note that if you signed the permission form last year for IACC.2, IACC.3, V3C1, or V3C2 and do not need to replace your original copy, you do not need to submit another permission form this year.

    In your email include the following:

    As Subject: "TRECVID data request"
    In the body: your name
                 your short team ID (given when you applied to participate)
                 the kinds of data you will be using - one or more of the following:
                 IACC.2, IACC.3, V3C1, V3C2, etc
    
    You will receive instructions on how to download the data.
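
    For teams that prefer to script the request, the email described above can be assembled with Python's standard library. This is only a sketch: the function name, the addresses, and the attachment filename are hypothetical placeholders (the real recipient address is provided by NIST, not here), and the message is built but not sent.

```python
from email.message import EmailMessage


def build_data_request(sender: str, recipient: str, name: str,
                       team_id: str, datasets: list[str],
                       form_pdf: bytes,
                       form_filename: str = "permission_forms.pdf") -> EmailMessage:
    """Compose a data request with the subject and body fields
    listed above and the scanned forms attached as a single PDF."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "TRECVID data request"
    msg.set_content(
        f"Name: {name}\n"
        f"Team ID: {team_id}\n"
        f"Data requested: {', '.join(datasets)}\n"
    )
    msg.add_attachment(form_pdf, maintype="application",
                       subtype="pdf", filename=form_filename)
    return msg


# Placeholder values for illustration only.
msg = build_data_request(
    sender="me@example.org",
    recipient="nist-contact@example.org",   # hypothetical address
    name="Jane Doe",
    team_id="MYTEAM",
    datasets=["V3C1", "V3C2"],
    form_pdf=b"%PDF-1.4 placeholder bytes",
)
print(msg["Subject"])
```

    In practice `form_pdf` would be the bytes of the scanned, signed forms read from disk, and the finished message could be handed to any mail client or to `smtplib`.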

Requests are handled in the order they are received. Please allow 5 business days for NIST to respond to your request. To download the IACC or V3C data, you need to use the access codes sent to you by email and the information about data server URLs in the active participant's area.