The video_to_text (VTT) task testing dataset consists of the following folders:

1- description.generation.subtask:
   - Txt file: testing.URLs.video.description.subtask: this file has 2 columns: the first column is the ID of the URL, and the second column is the URL itself.
   - You will have to download all URLs to work on the task and submit results using the URL IDs.

2- matching.ranking.subtask:
   - Each folder is a separate testing subset with a varying number of URLs and matching description files:
     - The testing.2.subsets folder includes 3 files:
         tv17.vtt.url.list
         tv17.vtt.descriptions.A
         tv17.vtt.descriptions.B
     - The testing.3.subsets folder includes 4 files:
         tv17.vtt.url.list
         tv17.vtt.descriptions.A
         tv17.vtt.descriptions.B
         tv17.vtt.descriptions.C
     - The testing.4.subsets folder includes 5 files:
         tv17.vtt.url.list
         tv17.vtt.descriptions.A
         tv17.vtt.descriptions.B
         tv17.vtt.descriptions.C
         tv17.vtt.descriptions.D
     - The testing.5.subsets folder includes 6 files:
         tv17.vtt.url.list
         tv17.vtt.descriptions.A
         tv17.vtt.descriptions.B
         tv17.vtt.descriptions.C
         tv17.vtt.descriptions.D
         tv17.vtt.descriptions.E

In each of the above files, the first column is the ID of a Vine URL or of a text description. The task is to match the URLs to their descriptions in each of the description files.

***You only need to download the URLs in the Description Generation subtask***, since it is the most comprehensive list. All other URL files in the Matching and Ranking subtask use a subset of these URLs. The URL IDs are consistent between files: a particular URL ID refers to the same URL across all files.

Each set (2, 3, 4, 5) SHOULD be treated independently. The goal of these subsets is to measure performance on testing subsets of varying size. Subset 2 includes 1613 URLs, subset 3 includes 795 URLs, subset 4 includes 388 URLs, and subset 5 includes 159 URLs.

To submit results for the subtask of "Matching and Ranking":
-----------------------------------------------------------
For each testing subset, return for each video URL a ranked list of the text descriptions most likely to correspond to (i.e. to have been annotated for) that video, from each of the description sets A, B, C, etc.

Please use the following format in your submission files:

rank URL_ID TextDescription_ID

where:

rank - An integer expressing how likely it is that the textual description identified by TextDescription_ID (taken from tv17.vtt.descriptions.A, B, etc.) describes the video URL; the lower the rank number, the higher the confidence. Please be careful to use the correct corresponding description files for each testing subset, independently of the other subsets.
URL_ID - The URL ID taken from the first column of the file tv17.vtt.url.list (for each testing subset).
TextDescription_ID - The textual description ID taken from the first column of the files tv17.vtt.descriptions.A, B, etc.

Please submit separate run files for each testing subset AND for each of the textual description sets A, B, C, etc. within it.

Example of a snippet from a run file:

1 1 367
2 1 78
3 1 1289
.
.
1 2 278
2 2 902
.
.
1915 1915 10
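As an illustration only, the following minimal Python sketch shows how one "Matching and Ranking" run file could be assembled for the 2-subset testing set and description set A. The whitespace-separated two-column file layout, the run file name "myrun.set.2.A", and the score() similarity function are placeholders/assumptions, not part of the official task definition.

# Illustrative sketch only: build one "Matching and Ranking" run file.
# Assumptions (not prescribed by the task): input files are whitespace-separated
# "ID <text>" lines, and score() stands in for your own video/text similarity model.

def load_two_column(path):
    # Return {item_id: url_or_description} from an "ID <rest of line>" file.
    items = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(None, 1)
            if parts:
                items[parts[0]] = parts[1] if len(parts) > 1 else ""
    return items

def score(video_url, description):
    # Placeholder: plug in your own video/text similarity model here.
    raise NotImplementedError

urls = load_two_column("testing.2.subsets/tv17.vtt.url.list")
descriptions = load_two_column("testing.2.subsets/tv17.vtt.descriptions.A")

# One output line per (rank, URL_ID, TextDescription_ID); rank 1 = highest confidence.
with open("myrun.set.2.A", "w", encoding="utf-8") as out:
    for url_id, url in urls.items():
        ranked = sorted(descriptions,
                        key=lambda d_id: score(url, descriptions[d_id]),
                        reverse=True)
        for rank, desc_id in enumerate(ranked, start=1):
            out.write(f"{rank} {url_id} {desc_id}\n")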
To submit results for the subtask of "Description Generation":
-------------------------------------------------------------
Automatically generate for each video URL a text description (1 sentence), independently and without taking into consideration the existence of any descriptions in the matching and ranking subtask.

Please use the following format in your submission files:

URL_ID TextDescription

where:

URL_ID - The URL ID taken from the first column of the file testing.URLs.video.description.subtask.
TextDescription - The system-generated 1-sentence text description.

Example of a snippet from a run file:

10 a man and a woman riding in a car at night

Notes:
- For each testing subset in the "Matching and Ranking" subtask, systems are allowed to submit up to 4 runs for each description set (A, B, etc.), and up to 4 runs in the Description Generation subtask.
- Please use the strings "set.2.", "set.3.", etc. as part of your run file names to differentiate between the different testing subsets in the "Matching and Ranking" subtask.
- Please use the strings ".A.", ".B.", etc. as part of your run file names to differentiate between the run files for the different description sets in the "Matching and Ranking" subtask.
- A run should include results for all the testing video URLs (no missing video URL_ID will be allowed).
- No duplicate result pairs of URL_ID AND TextDescription_ID are allowed (please submit only 1 unique set of ranks per URL_ID).
- All automatic text descriptions should be in English.
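As an illustration only, a minimal Python sketch of writing a Description Generation run file and checking it against the notes above (every testing URL_ID present, none duplicated). The generate_description() function, the local video path scheme, and the run file name are placeholders/assumptions, not part of the official task definition.

# Illustrative sketch only: build and sanity-check a "Description Generation" run.
# generate_description() is a placeholder for whatever captioning system you use;
# the local video path scheme "videos/<URL_ID>.mp4" is likewise an assumption.

def generate_description(video_path):
    # Placeholder: plug in your own video captioning system here.
    raise NotImplementedError

# URL IDs come from the first column of testing.URLs.video.description.subtask.
url_ids = []
with open("description.generation.subtask/testing.URLs.video.description.subtask",
          encoding="utf-8") as f:
    for line in f:
        if line.strip():
            url_ids.append(line.split(None, 1)[0])

# Write one line per video: "URL_ID TextDescription" (one English sentence).
with open("myrun.description.generation", "w", encoding="utf-8") as out:
    for url_id in url_ids:
        sentence = generate_description(f"videos/{url_id}.mp4")
        out.write(f"{url_id} {sentence}\n")

# Check completeness: every testing URL_ID appears exactly once, none missing.
with open("myrun.description.generation", encoding="utf-8") as f:
    submitted = [line.split(None, 1)[0] for line in f if line.strip()]
assert sorted(submitted) == sorted(url_ids), "missing or duplicated URL_IDs in run"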