Medical Video Question Answering (MedVidQA)

Task Coordinators: Deepak Gupta and Dina Demner-Fushman

The recent surge in the availability of online videos has changed the way people acquire information and knowledge. Many people prefer instructional videos for teaching or learning how to accomplish a particular task effectively and efficiently through a series of step-by-step procedures. Similarly, medical instructional videos are well suited to answering consumers' healthcare questions that call for instruction, because they deliver key information through both visual and verbal communication. With the aim of providing visual instructional answers to consumers' first aid, medical emergency, and medical educational questions, this TRECVID task on medical video question answering introduces a new challenge to foster research toward systems that can understand medical videos, provide visual answers to natural language questions, and use multimodal capabilities to generate instructional questions from medical videos. Following the success of the 1st MedVidQA shared task in the BioNLP workshop at ACL 2022, MedVidQA 2023 at TRECVID expanded the tasks and introduced a new track on language-video understanding and generation. The track comprises two main tasks: Video Corpus Visual Answer Localization (VCVAL) and Medical Instructional Question Generation (MIQG).

Task Guidelines

For more information about the tasks, dataset, and evaluation framework, please refer to the detailed task guidelines page.

Digital Video Retrieval at NIST

News magazine, science news, news reports, documentaries, educational programming, and archival video

TV Episodes

Airport Security Cameras & Activity Detection

Video collections from News, Sound & Vision, Internet Archive,
Social Media, BBC Eastenders