In 2008 the TREC Video Retrieval Workshop series will run a 1-day workshop on video summarization of unedited BBC video (rushes) as part of the ACM Multimedia Conference 2008 in Vancouver, BC, Canada on Friday, 31 October 2008.
PLEASE NOTE: Use of the BBC data and presentations at the ACM workshop (oral, poster, demo) will be limited to groups who are already active participants in TRECVID 2008 and have completed submissions for the TRECVID video summarization task. But attendance at the ACM workshop and participation in discussions will be open to all who sign up for the ACM workshop.
Compact visual surrogates for videos, ones that give a good indication of what the videos are about, have many potential uses even when generic. The highly redundant nature of rushes and their potential value for reuse and repurposing make them a good target for summarization. But such summarization is difficult and evaluation of summaries, whether intrinsic or extrinsic, is known to be complicated and costly. This ACM workshop is a follow-on to the 2007 rushes summarization workshop but with an improved evaluation plan, a tougher goal (2% summary rather than 4%), and more experimental freedom - an opportunity to see if what seemed to work in 2007 works again on new data and to improve or replace what didn't.
Rushes are the raw material (extra video, B-roll footage) used to produce a video. 20 to 40 times as much material may be shot as actually becomes part of the finished product. Rushes usually have only natural sound. Actors are only sometimes present. So very little if any information is encoded in speech. Rushes contain many frames or sequences of frames that are highly repetitive, e.g., many takes of the same scene redone due to errors (e.g., an actor gets his lines wrong, a plane flies over, etc.), long segments in which the camera is fixed on a given scene or barely moving, etc. A significant part of the material might qualify as stock footage - reusable shots of people, objects, events, locations, etc. Rushes may share some characteristics with "ground reconnaissance" video.
The BBC Archive has provided about 100 hours of unedited material in MPEG-1 from about five dramatic series. Most of the videos have durations of about 30 minutes. Half the videos will be used for systems development and half reserved for system test.
Ground truth was created at NIST by the same people who judged the video summaries for the 2007 workshop. In 2008, judging will take place at Dublin City University using almost exactly the same procedure and software that were used in 2007.
The system task in rushes summarization will be, given a video from the rushes test collection, to automatically create an MPEG-1 summary clip less than or equal to 2% of the original video's duration. For a typical 30-minute video, this means the summary will be at most about 36 seconds long. The summary should show the main objects (animate and inanimate) and events in the rushes video to be summarized. The summary should minimize the number of frames used and present the information in ways that maximize the usability of the summary and the speed of object/event recognition.
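The 2% cap translates directly into a per-video duration budget. The following is a minimal illustration of that arithmetic (the function name and defaults are ours, not part of the task definition; only the 2% figure comes from the guidelines above):

```python
def max_summary_seconds(video_duration_seconds: float, cap_percent: float = 2.0) -> float:
    """Maximum allowed summary duration under the task's percentage cap."""
    return video_duration_seconds * cap_percent / 100.0

# A typical 30-minute rushes video (1800 s) gives a ceiling of 36 seconds.
print(max_summary_seconds(30 * 60))  # → 36.0
```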
Such a summary could be returned with each video found by a video search engine, much as text search engines return short lists of keywords (in context) for each document found - to help the searcher (whether professional or recreational) decide whether to explore a given item further without viewing the whole item. It might also serve as input to a larger system for filtering, exploring, and managing rushes data.
Although in this pilot task we limit the notion of visual summary to a single clip that will be evaluated using simple play and pause controls, there is still room for creativity in generating the summary. Summaries need not be series of frames taken directly from the video to be summarized and presented in the same order. Summaries can contain picture-in-picture, split screens, and results of other techniques for organizing the summary. Such approaches will raise interesting questions of usability.
A run is the collected result of a summarization system's execution on all of the test videos, i.e., it contains one MPEG-1 summary clip for each of the test rushes videos and the system time (in seconds) needed to create the summary starting only with the video to be summarized. In 2008, each participating group will be allowed to submit up to 2 runs, but we cannot guarantee that there will be time to judge all runs. We ask that groups submit more than one run ONLY if there is good preliminary evidence of a significant difference in performance between the runs.
Carnegie Mellon University will again provide the output of a simple baseline rushes summarization system.
All the summary clips for a given video will be viewed in a randomized order by a single human judge. In a timed process, the judge will play, pause, stop, fast-forward, and rewind the video as needed to determine as quickly as possible which of the objects and events listed in the ground truth for the video to be summarized are present in the summary. The judge may also be asked to assess the usability of the summary. This process will be repeated for each test video.
The workshop proceedings are available as part of the ACM Digital Library.
Slides from the workshop presentations are available below.
Here is a video of workshop highlights, made by Neil O'Hare of Dublin City University.
09.00 - 09.40
The TRECVID 2008 BBC Rushes Summarization Evaluation Pilot
Paul Over (NIST), Alan F. Smeaton (DCU), George Awad (NIST)
09.40 - 10.00
Comparison of Content Selection Methods for Skimming Rushes Video
Werner Bailer and Georg Thallinger
10.00 - 10.30 Break
10.30 - 10.50
Brief and High-Interest Video Summary Generation: Evaluating the AT&T Labs Rushes Summarizations
Zhu Liu, Eric Zavesky, Behzad Shahraray, David Gibbon, and Andrea Basso
AT&T Labs Research
10.50 - 11.10
Binary Tree Based On-line Video Summarization
Victor Valdés and José Martinez
Universidad Autónoma de Madrid
11.10 - 11.30
Dublin City University
11.30 - 11.50
Video Rushes Summarization Using Spectral Clustering and Sequence Alignment
Vasileios Chasanis, Aristidis Likas, and Nikolaos Galatsanos
University of Ioannina
11.50 - 12.10
Exploring the Utility of Fast-Forward Surrogates for BBC Rushes
Michael G. Christel, Alexander G. Hauptmann, Wei-Hao Lin, Ming-Yu Chen, Jun Yang, Bryan Maher, and Robert V. Baron
Carnegie Mellon University
12.10 - 12.30
The COST292 experimental framework for RUSHES task in TRECVID 2008
S. U. Naci, Jenny Benois-Pineau, Uros Damnjanovic, Christian Kaes, Boris Mansencal, Marzia Corvaglia
12.30 - 2.00 Lunch in the Cypress Suite
2.00 - 2.20 Boasters (preview of demos/posters)
2.20 - 3.30 Combined demos and posters