Simulation Applications and Biotechnology Research

Virtual avatar generation models as world navigators

A Visual Prompt-Driven Space-Time Diffusion Transformer for Context-Aware Avatar Motion Generation

Architecture

Introduction

We present a video model that simulates human movement within a given environment by assuming the parameters of a virtual avatar. Our model, a diffusion transformer applied to the human motion domain, features two key design choices: it predicts the sample rather than the noise at each diffusion step, and it ingests entire videos to predict complete motion sequences. We focus on rock climbing environments because they are controlled, single-agent, and static, yet they involve many of the complex biomechanical interactions found in real-world scenarios. We name our model SABR-CLIMB. Our results indicate that training a virtual avatar generation model on real-world environment videos, scaling up compute, and increasing dataset size and quality are promising strategies for developing general-purpose virtual avatars that navigate the world from a human perspective without any reward design or skill primitives.
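To make the first design choice concrete, here is a minimal sketch of how sample prediction differs from noise prediction in a generic diffusion setup. It is not SABR-CLIMB's implementation: the cosine schedule, the function names, and the toy three-value "motion" vector are all assumptions made for illustration. The sketch shows that the two targets are related through the same forward-noising equation, so a sample prediction can be converted into an implied noise estimate and back.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal fraction at step t (cosine schedule; an assumed choice)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def add_noise(x0, eps, alpha_bar):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def eps_from_x0(x_t, x0_hat, alpha_bar):
    """Noise implied by a sample prediction x0_hat (requires alpha_bar < 1)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - a * x) / b for xt, x in zip(x_t, x0_hat)]


def x0_from_eps(x_t, eps_hat, alpha_bar):
    """Sample implied by a noise prediction eps_hat (requires alpha_bar > 0)."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - b * e) / a for xt, e in zip(x_t, eps_hat)]


def sample_prediction_loss(x0_hat, x0):
    """Mean squared error against the clean sample, the sample-prediction target."""
    return sum((p - q) ** 2 for p, q in zip(x0_hat, x0)) / len(x0)


# Toy usage: a hypothetical flattened motion vector, noised at a mid-range step.
rng = random.Random(0)
clean_motion = [0.1, -0.4, 0.7]
noise = [rng.gauss(0.0, 1.0) for _ in clean_motion]
alpha_bar = cosine_alpha_bar(500, 1000)
noisy_motion = add_noise(clean_motion, noise, alpha_bar)

# A perfect sample predictor would return clean_motion and incur zero loss.
print(sample_prediction_loss(clean_motion, clean_motion))
```

With sample prediction, the network output is compared directly against the clean motion sequence, so the training signal lives in the same space as the quantity of interest. A noise-prediction model would instead regress `noise` and recover the sample through `x0_from_eps`.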

Few-Shot Generation

Given annotated objects in images as visual prompts and a video of an empty environment, SABR-CLIMB generates plausible avatar movements that take the annotated objects into account. Below are examples on indoor and outdoor routes, filmed in a variety of styles.

Zero-Shot Generation

Even without any visual prompting, given only a video of an empty environment, SABR-CLIMB can generate plausible, context-appropriate avatar movements. Below are examples of this "zero-shot" generation behavior.

Authors

¹SABR

(*): First author

Get in Touch

SABR develops advanced simulation tools to test how products and systems interact with humans. Our tools support a wide range of applications, including humanoid robots, neuromodulation devices, and other biomedical and healthcare innovations. If your company is interested, please fill out this form.