CVEU Workshop

2021-10-17 08:00 Pacific Time

ICCV 2021 Workshop on AI for Creative Video Editing and Understanding.

About CVEU

This is the first installment of the workshop on AI for Creative Video Editing and Understanding (CVEU).

The workshop brings together researchers working on computer vision, machine learning, computer graphics, human-computer interaction, and cognitive research.

It aims to raise awareness of recent advances in machine learning that enable assisted creative video creation and understanding.


We will discuss recent advances in video understanding in the context of video creation and editing. Some of the topics we plan to discuss are:

  • Will AI kill video editing jobs?
  • Can AI help humans reduce the cost of video production?
  • Are there any biases and threats in using AI in this field?

Keynote Speakers

    Angjoo Kanazawa
    UC Berkeley; Google
    Irfan Essa
    Georgia Tech; Google
    James E. Cutting
    Cornell University
    Maneesh Agrawala
    Stanford University
    Marc Christie
    University of Rennes 1; IRISA/INRIA

    Industry Spotlights

    Detailed program:

    Schedule Pacific Time
    2021-10-17 08:00 AM

    Introduction   08:00 AM - 08:30 AM
    Keynote I (by Prof. James E. Cutting)
    The Event Structure of Popular Movies
    08:30 AM - 09:15 AM
    Keynote II (by Prof. Marc Christie)
    Towards Computational Cinematography: what's left and what right?
    09:15 AM - 10:00 AM
    Break   10:00 AM - 10:15 AM
    Keynote III (by Prof. Irfan Essa)
    AI for Video Creation
    10:15 AM - 11:00 AM
    Panel Discussion   11:00 AM - 12:00 PM
    Lunch Break   12:00 PM - 01:00 PM
    Keynote IV (by Prof. Angjoo Kanazawa)
    Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image
    01:00 PM - 01:45 PM
    Keynote V (by Prof. Maneesh Agrawala)
    Making (and Breaking) Video
    01:45 PM - 02:30 PM
    Break   02:30 PM - 02:45 PM
    Invited Works I + QA   02:45 PM - 03:30 PM
    Invited Works II + QA   03:30 PM - 04:15 PM
    Industry Spotlight I (by Joon-Young - Adobe)
    Video Segmentation for Video Editing
    04:15 PM - 04:30 PM
    Industry Spotlight II (by Anastasis - RunwayML)
    Building Human-in-the-Loop Machine Learning Tools for Video Editing
    04:30 PM - 04:45 PM
    Industry Spotlight III (by Synopsis)
    CinemaNet: Building Better Cinematic Workflows with Creative Metadata
    04:45 PM - 05:00 PM
    Industry Spotlight IV (by Fernando Amat Gil - Netflix)
    Can Machine Learning Assist in Making Better Trailers?
    05:00 PM - 05:15 PM
    Industry Spotlight V (by Xintao Wang - Tencent ARC)
    Tencent ARC: The Wonderland of Video Editing and Creation Algorithms
    05:15 PM - 05:30 PM
    Closing Remarks   05:30 PM - 05:45 PM

    Accepted Papers & Invited Works

    Title | Speaker | Resources
    [02:45 PM - 02:55 PM] VLG-Net: Video-Language Graph Matching Network for Video Grounding | Mattia Soldan | Paper, Video(YouTube), Video(Bilibili)
    [02:55 PM - 03:00 PM] Video Transformer Network | Daniel Neimark | Paper, Supp, Video(YouTube), Video(Bilibili)
    [03:00 PM - 03:05 PM] TSP: Temporally-Sensitive Pretraining of Video Encoders for Localization Tasks | Humam Alwassel | Paper, Supp, Video(YouTube), Video(Bilibili)
    [03:05 PM - 03:10 PM] Face, Body, Voice: Video Person-Clustering with Multiple Modalities | Andrew Brown | Paper, Video(YouTube), Video(Bilibili)
    [03:10 PM - 03:15 PM] Video Contrastive Learning with Global Context | Haofei Kuang | Paper, Video(YouTube), Video(Bilibili)
    [03:15 PM - 03:20 PM] Plots to Previews: Towards Automatic Movie Preview Retrieval using Publicly Available Meta-data | Bhagyashree Gaikwad | Paper, Supp, Video(YouTube), Video(Bilibili)
    [03:20 PM - 03:25 PM] Learning Where to Cut from Edited Videos | Yuzhong Huang | Paper, Video(YouTube), Video(Bilibili)
    [03:25 PM - 03:30 PM] QA
    [03:30 PM - 03:42 PM] Paint Transformer: Feed Forward Neural Painting with Stroke Prediction | Songhua Liu | Paper, Video(YouTube), Video(Bilibili)
    [03:42 PM - 03:47 PM] Boundary-sensitive Pre-training for Temporal Localization in Videos | Mengmeng Xu | Paper, Video(YouTube), Video(Bilibili)
    [03:47 PM - 03:52 PM] Editing like Humans: A Contextual, Multimodal Framework for Automated Video Editing | Patrick Adelman | Paper, Video(YouTube), Video(Bilibili)
    [03:52 PM - 03:57 PM] ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency | Wenhao Wu | Paper, Video(YouTube), Video(Bilibili)
    [03:57 PM - 04:02 PM] AniVid: A Novel Anime Video Dataset with Applications in Animation | Kai E Gangi | Paper, Video(YouTube), Video(Bilibili)
    [04:02 PM - 04:06 PM] High-Level Features for Movie Style Understanding | Robin Courant | Paper, Supp, Video(YouTube), Video(Bilibili)
    [04:06 PM - 04:10 PM] Re-enacting video shots with fictional characters | Joanna Materzynska | Paper, Video(YouTube), Video(Bilibili)
    [04:10 PM - 04:15 PM] QA


    $1,000 USD for best paper awards!

    Call for Papers


    Fabian Caba Heilbron
    Adobe Research
    Yu Xiong
    The Chinese University of Hong Kong
    Anyi Rao
    The Chinese University of Hong Kong
    Qingqiu Huang
    The Chinese University of Hong Kong
    Ali Thabet
    Victor Escorcia
    Dong Liu
    Dahua Lin
    The Chinese University of Hong Kong