CVEU Workshop

2023-10-02, Full-day, Paris, France (Room E8 at ICCV)
YouTube playback (interactive 360° view): https://youtu.be/ukqTcxPt1zI

ICCV 2023 Workshop on AI for Creative Video Editing and Understanding

About CVEU

This workshop is the 3rd installment of AI for Creative Video Editing and Understanding (CVEU), following its successful previous editions at ECCV 2022 and ICCV 2021.

The workshop brings together researchers, artists, and entrepreneurs working on computer vision, machine learning, computer graphics, human-computer interaction, and cognitive research.

It aims to raise awareness of recent advances in machine learning technologies that enable assisted creative video creation and understanding.

Program

The workshop discusses recent advances in creative video understanding, creation, and editing. Some of the topics we plan to discuss are:

  • Can AI reduce the human cost of video productions?
  • How to design intuitive video editing tools?
  • How do humans and AI collaborate to inspire creativity?
  • Are there biases or threats in using AI in this field?

  • A special film session showcasing submissions to AI ShortFest, an exciting new creative film festival. Get a sneak peek of this year's trailer:

    Keynote Speakers

    Marc Christie
    Assistant Professor,
    INRIA Mimetic Team,
    University of Rennes
    Maneesh Agrawala
    Professor & Director,
    Brown Institute for Media Innovation,
    Stanford University
    Ivan Laptev
    Professor,
    Program Chair of ICCV'23,
    MBZUAI
    Anna Giralt Gris
    Filmmaker,
    New Media Creator,
    Researcher
    Jorge Caballero
     Director, Screenwriter,
     Producer, Editor,
    Executive Producer
     Hugo Caselles-Dupré
    Co-Founder,
    Obvious Art
    Yogesh Balaji
    Senior Research Scientist,
    (Author of eDiff-I)
    NVIDIA
    Kfir Aberman
    Research Team Lead,
     (Author of DreamBooth)
    Snap Research

    Detailed Program

     Schedule (Paris time)
     2023-10-02, 8:30 AM

     08:30 AM - 09:00 AM  Warm-up Session
     09:00 AM - 09:20 AM  Opening Remarks and Organizers' Spotlight
     09:20 AM - 09:45 AM  Academia Keynote I by Marc Christie: Understanding Style in Movies
     09:45 AM - 10:10 AM  Academia Keynote II by Ivan Laptev: Video Understanding in the Era of Large Language Models
     10:10 AM - 10:35 AM  Academia Keynote III by Maneesh Agrawala: Unpredictable Black Boxes are Terrible Interfaces
     10:35 AM - 11:00 AM  Artistic Keynote I by Jorge Caballero Ramos and Anna Giralt Gris: Does AI Cinema Truly Exist?
     11:00 AM - 11:40 AM  Roundtable Discussion
     11:40 AM - 01:45 PM  Poster Session & Lunch Break
     01:45 PM - 02:50 PM  Film/Art Session
     02:50 PM - 03:30 PM  Oral Paper Presentations
        2: Is there progress in activity progress prediction?
        8: PAT: Position-Aware Transformer for Dense Multi-Label Action Detection
        9: Expressive Talking Head Video Encoding in StyleGAN2 Latent Space (Best Paper Award)
        27: Enhancing Text-to-Video Editing with Motion Map Injection
        34: LUSE: Using LLMs for Unsupervised Step Extraction in Instructional Videos
     03:30 PM - 04:00 PM  Coffee Break
     04:05 PM - 04:20 PM  Industry Keynote I by Yogesh Balaji: Lights, Camera, Diffusion: Video Content Creation with Diffusion Models
     04:20 PM - 04:35 PM  Industry Keynote II by Kfir Aberman: Generating Personalized Content with Text-to-Image Diffusion Models
     04:35 PM - 04:50 PM  Artistic Keynote II by Hugo Caselles-Dupré: Obvious: Bridging Art and Research through Artificial Intelligence
     04:50 PM - 05:05 PM  Closing Remarks

    Participants from



    Sponsors



     Paper Presentations

     In-Proceedings Track
    Is there progress in activity progress prediction? Paper
    Are current long-term video understanding datasets long-term? Paper
    VAST: Vivify Your Talking Avatar via Zero-Shot Expressive Facial Style Transfer Paper
    PAT: Position-Aware Transformer for Dense Multi-Label Action Detection Paper
    Expressive Talking Head Video Encoding in StyleGAN2 Latent Space Paper
    Benchmarking Data Efficiency and Computational Efficiency of Temporal Action Localization Models Paper
    InFusion: Inject and Attention Fusion for Multi Concept Zero Shot Text based Video Editing Paper
    LEMMS: Label Estimation of Multi-feature Movie Segments Paper
     Extended Abstract Track
    Dubbing for Extras: High-Quality Neural Rendering for Data Sparse Visual Dubbing Paper
    Emotionally Enhanced Talking Face Generation Paper
    Learning and Verification of Task Structure in Instructional Videos Paper
    Enhancing Text-to-Video Editing with Motion Map Injection Paper
    Can we predict the Most Replayed data of video streaming platforms? Paper
    EVA-VOS: Efficient Video Annotation for Video Object Segmentation Paper
    Representation Learning of Next Shot Selection for Vlog Editing Paper
    Text-Based Video Generation With Human Motion and Controllable Camera Paper
    Knowledge-Guided Short-Context Action Anticipation in Human-Centric Videos Paper
    LUSE: Using LLMs for Unsupervised Step Extraction in Instructional Videos Paper


    Outstanding Reviewers

    Dawit Mureja (KAIST), Jiaju Ma (Stanford University), Liming Jiang (NTU), Marc Christie (INRIA), Mattia Soldan (KAUST), Max Bain (Oxford), Sharon Zhang (Stanford University), Yixuan Li (CUHK), Yue Zhao (UT Austin), Yunzhi Zhang (Stanford University), Ziqi Huang (NTU)



    AI ShortFest Awards

    Best Short Award: Kiss Crash, Adam Cole
    Frontier Award: Idle Hands, Dr Formalyst & Irina Angles
    Viewer’s Award: Ossature, Derek Bransombe

     Call for Submissions

    Organizers

    Anyi Rao
    Stanford University
    Fabian Caba Heilbron
    Adobe Research
    Linning Xu
    CUHK
    Jean-Peïc Chou
    Stanford University
    Yuwei Guo
    CUHK
    Yu Xiong
    CUHK
    Ali Thabet
    Meta
    Victor Escorcia
    Samsung
    Dong Liu
    Netflix
     Dahua Lin
    CUHK
    Maneesh Agrawala
    Stanford University