An important need in many situations involving video collections (archive video search/reuse, personal video organization/search, movies, TV shows, etc.) is to summarize the video in order to reduce its size and concentrate the high-value information in the video track. In 2021 we continue the video summarization track in TRECVID, in which the task is to summarize the major life events of specific characters over a number of weeks of programming on the BBC Eastenders TV series. Typically, five characters will be chosen for this task every year, and summaries of their major life events must fall within the selected period of the show, which will be specified to participants in advance of the task.
The use case for this task is to generate an automatic summary, using a predefined maximum number
of unique shots, of the significant life events of a given character from the Eastenders series over a given
number of episodes. The generated summaries should be sufficient for a viewer to gain a clear and concise overview of
that character's major life events over the course of 8 - 12 weeks of programming in the series, and to
see how they intertwine with the major life events of other specified characters in that time frame of the
series.
System Task
Given a collection of BBC Eastenders test videos, a master shot boundary reference, a list of characters from the series, and a time frame of the series to use for summarization, summarize the major life events of each character within the specified time frame. Major life events are the more significant of comparable happenings: for example, the birth of a child rather than a short illness, a divorce rather than an argument with a loved one, or the passing of a loved one rather than the passing of a loose acquaintance. Summaries are limited to a maximum number of unique shots, so the main challenge is to select those shots most likely to be considered a major life event by human assessors.
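The shot limit described above can be illustrated with a simple ranking step. This is only a minimal sketch under stated assumptions: the relevance scores, the integer shot indices, and the function name `select_shots` are hypothetical, and the task does not prescribe how a system should score shots.

```python
from typing import Dict, List


def select_shots(scores: Dict[int, float], start: int, end: int, max_shots: int) -> List[int]:
    """Pick at most max_shots unique shots inside [start, end], best score first.

    scores maps a hypothetical integer shot index to a relevance score from
    some upstream model; neither is defined by the task guidelines.
    """
    # Keep only shots that fall inside the permitted time frame.
    in_range = [shot for shot in scores if start <= shot <= end]
    # Rank by score (descending); break ties by shot index for determinism.
    ranked = sorted(in_range, key=lambda shot: (-scores[shot], shot))
    # Enforce the cap, then restore chronological order for playback.
    return sorted(ranked[:max_shots])


if __name__ == "__main__":
    demo = {10: 0.2, 11: 0.9, 12: 0.4, 40: 0.95, 13: 0.7}
    # Shot 40 scores highest but lies outside the allowed range.
    print(select_shots(demo, start=10, end=20, max_shots=2))
```

Because the dictionary keys are unique, the selected shots are unique by construction; the real difficulty lies in producing scores that agree with the judgments of human assessors.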
In 2021 we will continue the video summarization task started in 2020:
1- Main Task: Questions Unknown
Systems will be asked to submit automatically generated summaries for five specified characters of the Eastenders series:
Time period limited to between 8 and 12 weeks of the series.
Videos of the series which can be used for summarization will be specified.
Maximum number of shots which can be used in summaries will be specified.
Ground truth from the 2020 task will be made available for training
systems in 2021: here.
For this main task the 5 content questions will not be known in advance.
2- Subtask: Questions Known
This subtask will be run in the same way as the main task, except that the content questions will be made known to teams in advance.
Submission dates for this subtask will be later than for the main task,
and questions will only be made known to teams once the task submission
deadline for the main task has passed.
Sample Summarization Case: Heather
In the example of summarizing the major life events of Heather, the following illustrates the kind of questions likely to be asked of human assessors as they rate the quality of summaries,
followed by an example of the video clips which would answer those questions. Note that an answer does not have to be stated explicitly in the videos; it is enough that the clips can be said to answer the question.
The BBC Eastenders test data (300 GB, 464 h) will be available from Dublin City University.
Auxiliary data: Participants are allowed and encouraged to use various publicly available EastEnders resources as long as they carefully note the use
of each such resource by name in their workshop notebook papers. They are strongly encouraged to share information about the existence
of such resources with other participants via the active participants mailing list as soon as they discover them.
Topics (Characters to Summarize):
Each topic will consist of a set of 4 example frame images (bmp) drawn from test videos containing the person of interest
in a variety of different appearances to the extent possible.
For each frame image (of a target person) there will be a binary mask of the region of interest (ROI), as bounded by a single polygon, and the ID from the
master shot reference of the shot from which the image example was taken. In creating the masks (in place of a real searcher), we will
assume the searcher wants to keep the process simple. So, the ROI may contain non-target pixels, e.g., non-target regions visible through
the target or occluding regions. In addition to example images of the person of interest, the shot videos from which the images were taken
will also be given as video examples.
Sharing of components:
Docker image tools for development are available here.
Contact the author, Robert Manthey, if you have questions about using them.
We encourage teams to share development resources with other active participants to expedite system development.
In each run, participants will submit results against the BBC Eastenders dataset for all and only the 5 main characters chosen for the summarization task that year, within the time frame specified by NIST.
Each team is asked to submit 4 prioritized runs per task submission.
Submissions will comprise the final automatically generated video summary for each topic, in .mp4 format, together with the XML container described below, which fully describes the run submissions.
Video summaries must be named <TEAM_NAME>_<TASK_Number>_<RUN_Number>_<TOPIC>.mp4
For example, team SiriusCyberCo, submitting their second run on the main task of unknown questions for topic Heather, must name their submission: SiriusCyberCo_1_2_Heather.mp4
SiriusCyberCo, submitting their fourth run on the subtask of known questions for topic Heather, must name their submission: SiriusCyberCo_2_4_Heather.mp4
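The naming convention above can be checked mechanically before upload. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that team and topic names are alphanumeric without underscores, that task numbers are 1 (main task) or 2 (subtask) as in the examples, and that runs are numbered 1 to 4.

```python
import re

# Pattern for <TEAM_NAME>_<TASK_Number>_<RUN_Number>_<TOPIC>.mp4.
# Assumptions (not stated in the guidelines): team and topic names are
# alphanumeric with no underscores; task is 1 or 2; runs are numbered 1-4.
SUMMARY_NAME = re.compile(
    r"(?P<team>[A-Za-z0-9]+)_(?P<task>[12])_(?P<run>[1-4])_(?P<topic>[A-Za-z0-9]+)\.mp4"
)


def summary_filename(team: str, task: int, run: int, topic: str) -> str:
    """Build a summary file name and reject values that break the pattern."""
    name = f"{team}_{task}_{run}_{topic}.mp4"
    if SUMMARY_NAME.fullmatch(name) is None:
        raise ValueError(f"invalid summary file name: {name}")
    return name


def parse_summary_filename(name: str) -> dict:
    """Split a summary file name back into its four fields."""
    match = SUMMARY_NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"invalid summary file name: {name}")
    fields = match.groupdict()
    return {
        "team": fields["team"],
        "task": int(fields["task"]),
        "run": int(fields["run"]),
        "topic": fields["topic"],
    }


if __name__ == "__main__":
    print(summary_filename("SiriusCyberCo", 1, 2, "Heather"))
    print(parse_summary_filename("SiriusCyberCo_2_4_Heather.mp4"))
```

Building names through one function and parsing them back with the same pattern keeps the .mp4 files and the run description consistent with each other.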
Please note: Only submissions which are valid when checked against the supplied DTDs will be accepted. You must check your submission
before submitting it. NIST reserves the right to reject any submission which does not parse correctly against the provided DTD(s). Various
checkers exist, e.g., Xerces-J: java sax.Counter -v YourSubmision.xml.
Here for download (right click and choose "display page source" to see the entire file) is the
DTD for summarization results of one run and a small example of what a site would send to NIST for evaluation.
Please check your submission to see that it is well-formed.
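A quick well-formedness check can be run with the Python standard library before using a validating parser. This is a sketch only: the helper name `is_well_formed` is hypothetical, and `xml.etree.ElementTree` does not validate against a DTD, so a validating checker such as the Xerces-J command mentioned above is still needed.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML.

    This does NOT check validity against the task DTD; it only catches
    syntax problems such as unclosed or mismatched tags.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    # Only the root element name is taken from the guidelines; the real
    # submission content is defined by the DTD and is not reproduced here.
    print(is_well_formed("<videoSummarizationResults></videoSummarizationResults>"))
    print(is_well_formed("<videoSummarizationResults>"))
```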
Please submit the information for each run in a separate file, named to make clear which team it is from. EACH file you submit should begin, as in the example
submission, with the DOCTYPE statement and a videoSummarizationResults element even if only one run is included:
The VSUM Java run checker can be found at
VSUM Active Directory. Please check files before submission.
Queries:
The following table specifies this year's query characters, the time frame of the series (Start Shot # and End Shot #), links to images of the query characters, and the maximum length and number of shots for each run.
Important: All participating teams should submit 4 runs for each query, using the specified maximum number of shots for each run.
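Whether a set of summary files covers every query and run can be checked with a few lines of code. This is a minimal sketch assuming the file naming convention given earlier in these guidelines; the function name `missing_summaries` and the example team name are hypothetical.

```python
from itertools import product
from typing import Iterable, List


def missing_summaries(submitted: Iterable[str], team: str, task: int,
                      topics: Iterable[str], runs: int = 4) -> List[str]:
    """List the expected summary files that are absent from a submission.

    Expected names follow <TEAM_NAME>_<TASK_Number>_<RUN_Number>_<TOPIC>.mp4,
    with one file per topic and per run.
    """
    expected = [
        f"{team}_{task}_{run}_{topic}.mp4"
        for topic, run in product(topics, range(1, runs + 1))
    ]
    have = set(submitted)
    return [name for name in expected if name not in have]


if __name__ == "__main__":
    topics = ["Max", "Tanya"]
    files = ["ExampleTeam_1_1_Max.mp4", "ExampleTeam_1_2_Max.mp4"]
    # Reports every expected file that has not been produced yet.
    print(missing_summaries(files, "ExampleTeam", 1, topics, runs=2))
```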
Jack:
What happens when police break in the door of Jack and Tanya's home?
Where are Max and Jack during the violent confrontation between them when a gun is drawn?
Who does Jack offer to pay in order to withdraw their statement to the police?
Why is Jack a suspect in the hit and run on Max?
What does Jack reveal to Tanya about his dodgy past?
Max:
What was the cause of Max's serious injuries which left him in hospital?
What is/was the relationship between Max and Tanya?
What kind of weapon does Max obtain from Phil?
Where are Max and Jack during the violent confrontation between them when a gun is drawn?
Who is responsible, or who does Max believe is responsible, for the serious injuries which left him in hospital?
Tanya:
What does Tanya reveal to the police while being interviewed at the station?
What is/was the relationship between Max and Tanya?
What does Jack reveal to Tanya about his dodgy past?
What does Tanya discover in the sink and on Jack's clothes?
What big move were Tanya and Jack planning for the future?
Archie:
What happens when Phil throws Archie into a pit?
What happens after Danielle reveals to Archie that Ronnie is her mother?
Where do Peggy and Archie get married?
What happens when Archie arrives at the pub after Peggy invited him?
What happens when Archie is kidnapped?
Peggy:
Who does Peggy ask to kill Archie?
Where do Peggy and Archie get married?
Show one of the challenges which Peggy faces in her election run.
What does Peggy overhear Archie saying, which causes their marriage to be over?
What is Janine doing to irritate or anger Peggy?
Evaluation:
In 2021, all submitted video summaries will be evaluated by assessors at Dublin City University.
A set of questions for each summary will be disseminated to assessors, but not to participants, for evaluation of summary content.
Summaries are also evaluated for tempo, contextuality, and redundancy:
Estimate the Tempo and Rhythm of this video summary on a Likert scale of 1 - 7 (high is best). Tempo/Rhythm is defined as: How well do the video shots flow together? Do shots cut mid-sentence (indicating poor tempo/rhythm)? Do they flow together so well that it would not be obvious this is an automatically generated summary (indicating high tempo/rhythm)?
Estimate the Contextuality provided by this video summary on a Likert scale of 1 - 7 (high is best). Contextuality is defined as: Does the content provide the circumstances that form the setting for an event, statement, or idea, and in terms of which it can be fully understood and assessed?
Estimate the level of Redundancy in this video summary on a Likert scale of 1 - 7 (low is best). Redundancy is defined as: Does the video contain content considered to be unnecessary or superfluous?
Measures:
Scoring measures for summaries will be calculated from the content-based questions and from the tempo, contextuality, and redundancy Likert-scale estimates described above.
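The exact scoring formula is not specified here. Purely as an illustration of how Likert estimates from several assessors might be summarized per dimension, the sketch below averages each dimension separately. The function name, the dictionary keys, and the choice of a plain mean are all assumptions, not the official measure.

```python
from statistics import fmean
from typing import Dict, List

LIKERT_MIN, LIKERT_MAX = 1, 7
DIMENSIONS = ("tempo", "contextuality", "redundancy")


def mean_ratings(ratings: List[Dict[str, int]]) -> Dict[str, float]:
    """Average 1-7 Likert ratings per dimension across assessors.

    Hypothetical aggregation: the dimensions are kept separate because
    tempo and contextuality are high-is-best while redundancy is
    low-is-best, so adding them together would mix opposite scales.
    """
    for rating in ratings:
        for dim in DIMENSIONS:
            if not LIKERT_MIN <= rating[dim] <= LIKERT_MAX:
                raise ValueError(f"{dim} rating out of range: {rating[dim]}")
    return {dim: fmean(r[dim] for r in ratings) for dim in DIMENSIONS}


if __name__ == "__main__":
    assessors = [
        {"tempo": 6, "contextuality": 5, "redundancy": 2},
        {"tempo": 4, "contextuality": 7, "redundancy": 2},
    ]
    print(mean_ratings(assessors))
```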
Important notes
The BBC requires all VSUM task participants to complete, sign, and submit a renewed data license agreement in order to use the Eastenders data.
This means that even if a past participant has a copy of the data, the team must submit a renewed license form before any submitted runs can be accepted and evaluated.
No prior human knowledge of the closed world of the Eastenders dataset may be used to filter content. All filtering methods must be automatic, without fine-tuning based on human knowledge of the Eastenders dataset.
Use of the included XML transcript files is limited to the transcribed text only, and not to any other metadata or XML attributes (e.g., color of text).
Open Issues:
BBC Eastenders data License is now available from the BBC [RESOLVED]