Diffusion-Based Autonomous Driving with Multi-Modal Perception and Emergency Interaction

2025-2026/I.
Dr. Liu Chang

Description

Autonomous vehicles must operate safely in complex traffic and emergency scenarios. This lab focuses on reproducing DiffusionDrive, a diffusion-based end-to-end driving policy, and extending it with multi-modal perception inputs (LiDAR + camera) and human-vehicle interaction for emergencies. Students will first implement and evaluate the planner in simulated driving environments, then integrate multi-modal perception to improve robustness under varying sensor and lighting conditions. Finally, students will design interfaces for human intervention using hand gestures or voice commands, enabling real-time, adaptable autonomous driving in critical situations.
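
To make the planning step concrete, the sketch below shows a generic DDPM-style reverse-diffusion loop that denoises a short horizon of (x, y) waypoints conditioned on a pooled scene feature. This is only background for the mechanism, not DiffusionDrive's actual architecture (the paper proposes a truncated diffusion policy over anchored trajectories); the network, dimensions, and noise schedule here are hypothetical placeholders.

```python
import torch
import torch.nn as nn

class TrajectoryDenoiser(nn.Module):
    """Hypothetical noise-prediction network: given noisy waypoints, a
    diffusion timestep, and a scene feature, predict the injected noise."""
    def __init__(self, horizon=8, scene_dim=256, hidden=256):
        super().__init__()
        self.horizon = horizon
        self.net = nn.Sequential(
            nn.Linear(horizon * 2 + 1 + scene_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, horizon * 2),
        )

    def forward(self, noisy_traj, t, scene_feat):
        # noisy_traj: (B, horizon, 2) noisy (x, y) waypoints
        # t: (B,) integer timesteps, scene_feat: (B, scene_dim)
        x = torch.cat([noisy_traj.flatten(1), t.float().unsqueeze(1), scene_feat], dim=-1)
        return self.net(x).view(-1, self.horizon, 2)

@torch.no_grad()
def sample_trajectory(model, scene_feat, steps=50):
    """DDPM-style reverse process: start from Gaussian noise and iteratively
    denoise into a trajectory, conditioned on the scene feature."""
    B = scene_feat.shape[0]
    betas = torch.linspace(1e-4, 2e-2, steps)
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    traj = torch.randn(B, model.horizon, 2)  # start from pure noise
    for t in reversed(range(steps)):
        eps = model(traj, torch.full((B,), t), scene_feat)
        mean = (traj - (1 - alphas[t]) / torch.sqrt(1 - alpha_bars[t]) * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(traj) if t > 0 else torch.zeros_like(traj)
        traj = mean + torch.sqrt(betas[t]) * noise
    return traj  # (B, horizon, 2) planned waypoints

# Usage: in practice scene_feat would come from the perception encoder
# (e.g., the fused LiDAR + camera features sketched after the Objectives list).
# plan = sample_trajectory(TrajectoryDenoiser(), torch.randn(4, 256))
```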

Objectives

Adapt DiffusionDrive to handle multi-modal inputs, combining LiDAR point clouds with RGB camera data (see the fusion sketch after this list).
Evaluate the system’s performance under varying sensor conditions and lighting, and in the presence of dynamic obstacles.
Implement human-vehicle interaction mechanisms for emergency intervention (e.g., hand gestures, voice commands), as in the override gate sketched after this list.
Test the system’s ability to generate safe and diverse driving actions in emergency scenarios.
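
The sketch below illustrates two of the extension points above under simple assumptions: a late-fusion encoder that concatenates pooled LiDAR and camera features into the planner's conditioning vector, and a human-in-the-loop gate that lets a recognized gesture or voice command veto the planned trajectory. All module names, dimensions, and the command vocabulary are hypothetical placeholders, not part of DiffusionDrive.

```python
import torch
import torch.nn as nn

class FusedSceneEncoder(nn.Module):
    """Hypothetical late-fusion head: pooled LiDAR BEV features and pooled
    camera features are concatenated and projected into the conditioning
    vector consumed by the diffusion planner."""
    def __init__(self, lidar_dim=256, cam_dim=256, out_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(lidar_dim + cam_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, lidar_feat, cam_feat):
        # lidar_feat: (B, lidar_dim) pooled from a point-cloud backbone
        # cam_feat:   (B, cam_dim) pooled from an image backbone
        return self.proj(torch.cat([lidar_feat, cam_feat], dim=-1))

def emergency_gate(planned_traj, command):
    """Hypothetical human-in-the-loop gate: a command recognized from a hand
    gesture or voice input can veto the planner's output. The command
    vocabulary ("stop") is a placeholder."""
    if command == "stop":
        # hold the current position, i.e. an immediate emergency stop
        return planned_traj[:, :1, :].expand_as(planned_traj).clone()
    return planned_traj
```

Late fusion keeps the two sensor backbones independent, which makes it straightforward to compare camera-only, LiDAR-only, and fused variants under the degraded-sensor conditions listed above; placing the gate after the planner ensures human intervention always takes priority over the learned policy.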

Expected Outcomes

Hands-on experience with diffusion models for autonomous driving.
Implementation of a multi-modal, interactive end-to-end driving policy.
Experimental evaluation of real-time planning and human-in-the-loop emergency control.

Skills Required

Python and PyTorch programming.
Basic knowledge of robotics, autonomous driving, or sensor fusion.
Interest in human-robot interaction, multi-modal perception, and real-time control.

