Interactive Social Avatars

with the 4th GENEA Gesture Generation Challenge

An ECCV 2026 Workshop on motion-level synthesis, gesture generation, and gesture understanding for interactive virtual humans.
ECCV 2026 Workshop · 8–9 September 2026 · Malmö, Sweden

The Interactive Social Avatars workshop brings together researchers working on virtual human agents that can take part in interactive, dyadic conversation by generating coherent, expressive, and contingent multimodal behavior. Unlike video-centric approaches, we focus on motion-level synthesis: generating 3D body pose, gesture, and facial motion in human representations that can be rendered flexibly and in real time.

We also host the 4th GENEA Challenge, a community-driven benchmark for speech-driven 3D gesture generation, this year built on the recently released 4000-hour dyadic Seamless Interaction Dataset.


Important Dates

All deadlines are at the end of day, Anywhere on Earth (AoE).

TBD
Paper submission deadline
TBD
Notification of paper acceptance
TBD
Camera-ready deadline
TBD
GENEA Challenge - release of training data
TBD
GENEA Challenge - submission deadline
8–9 September 2026
Workshop day at ECCV 2026, Malmö, Sweden

Call for Papers

We invite original research contributions on interactive social avatars: virtual humans that perceive and generate contingent multimodal behavior in dyadic interaction.

Topics of Interest

Topics of interest include (but are not limited to):

  • 3D human capture and reconstruction in interactive settings
  • Audio-visual-driven face and body animation
  • Co-speech gesture and full-body motion generation
  • Gaze and facial expression modeling
  • Multimodal representations for interaction
  • Streaming and low-latency systems for avatar synthesis
  • Proactive social behavior modeling (anticipatory and intent-driven actions)
  • Gesture understanding: perception, semantic alignment with speech, social cues
  • Datasets, benchmarks, and standardized evaluation protocols for dyadic interaction
  • Identity, fairness, privacy, and safety in avatar-mediated communication

Submission Types

We welcome:

  • Long papers - TBD pages
  • Short papers / extended abstracts - TBD pages

All submissions must follow the ECCV 2026 author guidelines and be prepared for double-blind review.

Submission site: Coming soon.

[More details - page limits, formatting, OpenReview link, proceedings policy - to be added.]


4th GENEA Gesture Generation Challenge

Evaluation remains a significant bottleneck in research on interactive social avatars. To help address this, the workshop hosts the 4th GENEA Challenge, an established community-driven initiative for advancing speech-driven 3D gesture generation.

This year's challenge focuses on the recently released 4000-hour dyadic Seamless Interaction Dataset. Participants submit generated motion for a private test set, and the organizers run a large-scale, crowdsourced human evaluation to create a standardized state-of-the-art benchmark. The evaluation assesses motion realism and appropriateness for the speech using established protocols, with extensions for dyadic adaptation and semantic alignment.

Anticipated Impact

  1. Foster the first series of generative models built on the Seamless Interaction Dataset.
  2. Provide a robust benchmark for state-of-the-art gesture generation.
  3. Enable the development of automated metrics using collected human preference votes.

Get Involved

Full challenge details, registration, dataset access instructions, evaluation protocol and timeline are on the GENEA Challenge 2026 website.



Workshop Programme

Tentative full-day schedule. Times in Malmö local time (CEST, UTC+2).

09:00 – 09:10
Introduction and Opening Remarks
09:10 – 09:50
Keynote I + Q&A
09:50 – 10:30
Keynote II + Q&A
10:30 – 10:40
Coffee Break
10:40 – 11:20
Keynote III + Q&A
11:20 – 11:35
GENEA Challenge: Opening Remarks and Overview
11:35 – 12:20
GENEA Challenge: Spotlight Talks (3 × 10 min) + Q&A
12:20 – 13:00
Lunch Break
13:00 – 13:45
GENEA Challenge: Spotlight Talks (3 × 10 min) + Q&A
13:45 – 14:15
Keynote IV + Q&A
14:15 – 14:25
Coffee Break
14:25 – 15:05
Keynote V + Q&A
15:05 – 16:30
Poster Session
16:30 – 17:30
Panel Discussion

Invited Speakers

Prof. Andrew Zisserman

Affiliation
University of Oxford, UK
Biography
Andrew Zisserman is a Royal Society Research Professor in the Department of Engineering Science at the University of Oxford, where he co-leads the Visual Geometry Group (VGG). His recent research focuses on video understanding, including human action and gesture recognition, movie understanding, and the release of datasets such as Kinetics. He has authored over 800 papers and his work has received best paper and "test of time" awards at international conferences.

Prof. Louis-Philippe Morency

Affiliation
Carnegie Mellon University, Language Technologies Institute, USA
Biography
Louis-Philippe Morency is a tenure-track faculty member at the CMU Language Technologies Institute, where he leads the Multimodal Communication and Machine Learning Laboratory (MultiComp Lab). His research focuses on building computational foundations for analyzing, recognizing, and predicting subtle human communicative behaviors during social interactions, with multimodal machine learning at the technical core.

Prof. Gül Varol

Affiliation
École des Ponts ParisTech, France
Biography
Gül Varol is a permanent researcher in the IMAGINE team at École des Ponts ParisTech. She received her PhD from the WILLOW team of Inria Paris and École Normale Supérieure (ENS); her thesis received awards from ELLIS and AFRIF. She served as a Program Chair at ECCV 2024. Her research interests cover vision and language applications, including video representation learning, human motion synthesis, and sign languages.

Prof. Georgios Pavlakos

Affiliation
University of Texas at Austin, USA
Biography
Georgios Pavlakos is an Assistant Professor in the Department of Computer Science at UT Austin. He was previously a postdoctoral researcher at UC Berkeley and received his PhD from the University of Pennsylvania, where his dissertation received the Morris and Dorothy Rubinoff Award. His research is at the intersection of computer vision, machine learning, and robotics, with a focus on 3D human pose and shape estimation, visual understanding, and embodied AI systems.

Accepted Papers

To be announced after the review process.


Organising Committee

Rajmund Nagy
KTH Royal Institute of Technology, Sweden
Tu Anh Nguyen
Meta, France
Sindhu B. Hegde
University of Oxford, UK
Zeyi Zhang
Peking University, China
Leore Bensabath
École des Ponts, France
Gustav Eje Henter
KTH Royal Institute of Technology & Motorica AB, Sweden
Michael Neff
University of California, Davis, USA

Contact

For any questions about the workshop, please contact: jomat@meta.com.