An important need in many situations involving video collections (archive video search/reuse, personal video organization/search, movies, TV shows, etc.)
is to summarize the video in order to reduce its size while concentrating the high-value information in the video track. In 2022 we introduce the Movie Summarization (MSUM) track in TRECVID, replacing the previous Video Summarization (VSUM) track. The track will use a licensed movie dataset from Kinolorberedu; the goal is to summarize the storylines and roles of specific characters across a full movie.
The goals for this track are to:
Efficiently capture important facts about specific characters and their roles in the movie storyline.
Assess how video summarization and textual summarization compare in this domain.
Given a movie, a character, and image/video examples of that character, generate a video summary highlighting the major key-fact events about the character (similar to the TV20 & TV21 VSUM tasks). Video summaries will be limited by a maximum summary length. See below for further details on what constitutes a key-fact event and for details on annotation and assessment.
Given a movie, a character, and image/video examples of that character, generate a textual summary that includes the key-fact events about the character's role in the movie. Textual summaries will be limited by a maximum number of sentences and a maximum number of words. See below for further details on what constitutes a key-fact event and for details on annotation and assessment.
Annotation and Assessment
Human annotators will:
Watch each movie
For selected characters, extract key-fact events about them
Video Summary evaluation:
Assessors will watch submitted summaries (subject to max duration)
Systems are rewarded for including the key-fact events
Scoring is based on the percentage of correct key-facts included in the summaries
Subjective evaluation will also be conducted (contextuality, redundancy, etc.)
Textual Summary evaluation:
Systems will submit a summary of up to X sentences and Y words
Assessors will read the submitted textual summary and mark correctly retrieved key-facts
Objective evaluation of retrieved key-facts, regardless of any filler sentences (a scoring sketch follows this list)
Subjective evaluation will also be conducted (readability, contextuality, redundancy, etc.)
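For concreteness, the sketch below computes the objective score described above for both tasks: the percentage of annotated key-facts that assessors marked as present in a submitted summary. The function name and the key-fact identifiers are illustrative assumptions; the official scoring code and data formats are supplied by NIST.

```python
def keyfact_recall(annotated_keyfacts, marked_keyfacts):
    """Percentage of annotated key-facts that assessors marked as present
    in a submitted (video or text) summary.

    Illustrative sketch only: the official NIST scoring procedure and
    data formats are defined by the track organizers.
    """
    if not annotated_keyfacts:
        return 0.0
    found = set(marked_keyfacts) & set(annotated_keyfacts)
    return 100.0 * len(found) / len(annotated_keyfacts)

# Hypothetical example: 5 annotated key-facts for Jeremy, 3 marked as found.
truth = {"bullied", "playground_fight", "illness_revealed", "hospital", "death"}
marked = {"playground_fight", "hospital", "death"}
print(keyfact_recall(truth, marked))  # 60.0
```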
What is a key-fact event?
Any event that is important and critical in the character's storyline.
Key-fact events should cover the character's role from the start to the end of the movie.
Example: From the example movie “Super Hero” (below) – Character: Jeremy
Charlie bullies Jeremy
Charlie and Jeremy fight at the playground
Jeremy's mother reveals to the principal that Jeremy has a terminal illness
Jeremy gets admitted to the hospital
Jeremy passes away
Important points
A key-fact event regarding a character does not necessarily require
that character to be visible in the scene. In the above example 'Super
Hero', Jeremy's mother revealed to the principal that Jeremy had a
terminal illness. This would clearly count as a key-fact regarding Jeremy even
though he was not present in the scene.
The purpose of this task is to summarize the important key-facts
for a character. As such, this is different from a movie trailer. Key
events should appear in the order in which they become apparent in
the movie, and should ideally capture that character's storyline.
The number of allowed key facts is limited per movie and
character. One of the major challenges of the task is to separate
major key facts from inconsequential details. For example:
'Daryl broke up with his girlfriend over breakfast' is more
likely to be a major key fact than 'Daryl had eggs and toast
for breakfast'.
Data Resources
Dataset
This track will use a licensed movie dataset from Kinolorberedu. For the current year of the track, 10 full movies will be made available to participating teams.
To access the training and testing dataset (available HERE), please submit the
data agreement form to gawad@nist.gov.
Topics (Characters to Summarize):
Each topic will consist of a movie, the character whose key-fact events are to be summarized, and a set of image/video examples of that character.
For video summaries, a maximum summary duration (in seconds) will be specified for each character. For text summaries, a maximum number of sentences
will likewise be specified for each character. A sentence in a text summary can be either a key-fact (the focus of the task) or a filler sentence.
The maximum number of sentences a run may submit for a given character includes all key-facts and filler sentences. A sketch of one such topic follows.
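For illustration, a topic can be represented roughly as the following record; the field names and values are our own assumptions, not an official NIST topic schema.

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    """Rough sketch of one MSUM topic; field names are assumptions,
    not the official NIST topic format."""
    movie: str                       # movie to summarize from
    character: str                   # character whose role is summarized
    example_media: list[str] = field(default_factory=list)  # image/video examples
    max_video_seconds: int = 0       # per-character limit for the video summary
    max_sentences: int = 0           # per-character limit (key-facts + fillers)

# Hypothetical topic for the example movie above.
topic = Topic(
    movie="SuperHero",
    character="Jeremy",
    example_media=["jeremy_01.jpg", "jeremy_clip.mp4"],
    max_video_seconds=180,
    max_sentences=12,
)
```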
Sharing of components:
Docker image tools for development are available here.
Contact the author, Robert Manthey, if you have questions about using them.
We encourage teams to share development resources with other active participants to expedite system development.
Participants will submit results against the Kino Lorber movie
dataset in each run for all and only the characters chosen for the
summarization task that year, using the movies specified by NIST.
Teams may submit up to 4 prioritized runs per task (priorities 1 - 4).
Text submissions will consist of the final automatically
generated text summary for each topic, in an XML container, as below, fully describing the run submission.
Video submissions will consist of the final automatically
generated video summary for each topic, in addition to an XML container, as below, fully describing the run submission.
All submitted summaries must be named
<TEAM_NAME>_<MOVIE_NAME>_<RUN_number>_<Text|Video>.xml
or <TEAM_NAME>_<MOVIE_NAME>_<TARGET_NAME>_<RUN_number>_<Video>.mp4
For example, team SiriusCyberCo, submitting their text
summaries for each target character, for the movie SuperHero,
for their first run, must name their submission:
SiriusCyberCo_SuperHero_1_Text.xml. SiriusCyberCo, submitting their video
summaries for target character Jeremy, for the movie SuperHero,
for their second run, must name their submissions:
SiriusCyberCo_SuperHero_2_Video.xml and SiriusCyberCo_SuperHero_Jeremy_2_Video.mp4
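The helper below is a small sketch that builds filenames following this convention; only the naming pattern comes from the guidelines, the function itself is our own.

```python
def submission_names(team, movie, run, kind, targets=()):
    """Build MSUM submission filenames per the convention above.

    kind is "Text" or "Video"; video runs also get one .mp4 per target
    character. Illustrative helper only.
    """
    names = [f"{team}_{movie}_{run}_{kind}.xml"]
    if kind == "Video":
        names += [f"{team}_{movie}_{t}_{run}_Video.mp4" for t in targets]
    return names

print(submission_names("SiriusCyberCo", "SuperHero", 1, "Text"))
# ['SiriusCyberCo_SuperHero_1_Text.xml']
print(submission_names("SiriusCyberCo", "SuperHero", 2, "Video", targets=["Jeremy"]))
# ['SiriusCyberCo_SuperHero_2_Video.xml', 'SiriusCyberCo_SuperHero_Jeremy_2_Video.mp4']
```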
Please note: Only submissions which are valid when checked against the supplied DTDs will be accepted. You must check your submission
before submitting it. NIST reserves the right to reject any submission which does not parse correctly against the provided DTD(s). Various
checkers exist, e.g., Xerces-J: java sax.Counter -v YourSubmission.xml.
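As an alternative to Xerces-J, the validity check can be done in Python with lxml; the DTD and submission filenames below are placeholders for the files you actually download and produce.

```python
# DTD validity check with lxml (pip install lxml).
# Filenames are placeholders; use the DTD downloaded from the track page.
from lxml import etree

dtd = etree.DTD("MovieSummarizationTextResults.dtd")   # placeholder name
doc = etree.parse("SiriusCyberCo_SuperHero_1_Text.xml")

if dtd.validate(doc):
    print("Submission is valid against the DTD.")
else:
    print(dtd.error_log.filter_from_errors())
```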
Here for download (right click and choose "display page source" to see the entire file) are the
DTD for text summarization results of one run and a small example of the XML file that a site would send to NIST for evaluation.
Please check your submission to see that it is well-formed.
Here for download (right click and choose "display page source" to see the entire file) are the
DTD for video summarization results of one run and a
small example of the XML file that a site would send to NIST for evaluation.
Please check your submission to see that it is well-formed.
Please submit each run's information in a separate file, named so that it is clear which team it is from. EACH file you submit should begin, as in the example
submission, with the DOCTYPE statement and a
MovieSummarizationTextResults or
MovieSummarizationVideoResults element, even if only one run is included:
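For illustration, the sketch below writes the required shell of a text-run submission file. The DOCTYPE statement and root element name come from the guidelines above; the SYSTEM identifier and everything inside the root are placeholders that must follow the official DTD supplied by NIST.

```python
# Shell of a text-run submission file. The root element name is specified by
# the guidelines; the SYSTEM identifier and the run contents are placeholders
# that must follow the official NIST DTD.
skeleton = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE MovieSummarizationTextResults SYSTEM "MovieSummarizationTextResults.dtd">
<MovieSummarizationTextResults>
  <!-- one run per file; contents defined by the official DTD -->
</MovieSummarizationTextResults>
"""

with open("SiriusCyberCo_SuperHero_1_Text.xml", "w", encoding="utf-8") as f:
    f.write(skeleton)
```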
Queries:
TBD