The main goal of the TREC Video Retrieval Evaluation (TRECVID) is to promote progress in content-based analysis of and retrieval from digital video via open, metrics-based evaluation. TRECVID is a laboratory-style evaluation that attempts to model real-world situations or significant component tasks involved in such situations.
Up until 2010, TRECVID used test data from a small number of known professional sources - broadcast news organizations, TV program producers, and surveillance systems - that imposed limits on program style, content, production qualities, language, etc. In 2003-2006 TRECVID supported experiments in automatic segmentation, indexing, and content-based retrieval of digital video using broadcast news in English, Arabic, and Chinese. In 2007-2009 TRECVID provided participants with cultural, news magazine, documentary, and educational programming supplied by the Netherlands Institute for Sound and Vision. Tasks using this video included segmentation, search, feature extraction, and copy detection. Surveillance event detection was evaluated using airport surveillance video provided by the UK Home Office.
In 2010 TRECVID confronted known-item search and semantic indexing systems with a new set of Internet videos (referred to in what follows as IACC) characterized by a high degree of diversity in creator, content, style, production qualities, original collection device/encoding, language, etc., as is common in much "Web video". The collection also has associated keywords and descriptions provided by the video donor. The videos are available under Creative Commons licenses from the Internet Archive. The only selection criterion imposed by TRECVID beyond the Creative Commons licensing is video duration: the videos are short (less than 6 min). In addition to the IACC data set, NIST began developing an Internet multimedia test collection (HAVIC) with the Linguistic Data Consortium and used it in growing amounts (up to 8000 h) in the TRECVID 2010-2017 Multimedia Event Detection (MED) task. The airport surveillance video, introduced in TRECVID 2009, was reused each year up to 2017 within the Surveillance Event Detection (SED) task.
New in 2013 was video provided by the BBC. Programming from their long-running EastEnders series was used in the instance search (INS) task. An additional 600 h of Internet Archive video available under Creative Commons licensing for research (IACC.2) was used for the semantic indexing task, as planned, from 2013 to 2015, with new test data each year. In addition, a new concept localization (LOC) task was introduced in 2013 and ran through 2016.
In 2015 a new Video Hyperlinking (LNK) task, previously run in MediaEval, was added. It ran through 2017 and was updated in 2018 to address social media storytelling linking.
From 2016 to 2018 the Ad-hoc Video Search (AVS) task used a new IACC.3 dataset (600 h) with a maximum video duration of 9 min, while a new pilot "Video to Text" (VTT) description task was introduced in 2016 to address matching and describing videos using textual descriptions.
Finally, in 2018 a new video activity detection (ActEV) task was introduced as an extension to the SED task. A new joint task with the Text Analysis Conference (TAC) workshop, the Streaming Multimedia Knowledge Base Population (SMKBP) task, was introduced as well.
Many resources created by NIST and the TRECVID community are available for continued research on past datasets independent of TRECVID. See the Datasets and Resources section of the TRECVID website for pointers.
In TRECVID 2019, four tasks (AVS, INS, VTT, ActEV) will continue with some revisions, while the SMKBP task will be open for joining once TAC announces it.
News magazine, science news, news reports, documentaries, educational programming, and archival video
Airport Security Cameras & Activity Detection
Video collections from News, Sound & Vision, Internet Archive, Social Media, BBC EastEnders