Mingfei Chen

[Jun. 2026] Joined NVIDIA Research as a research intern, working on robotic action modeling conditioned on multimodal spatial perception with Dr. Shalini De Mello and Dr. Koki Nagano.

[Jun. 2026] My first-author paper on vision-language motion modeling, EgoMAN, got accepted by ECCV2026!

[Mar. 2026] Passed my PhD general exam!

[Oct. 2025] I'm honored to be awarded the Google PhD Fellowship 2025 in Machine Perception (North America)!

[Sep. 2025] One first-author paper on spatial audio-visual LLMs got accepted by NeurIPS as an Oral (acceptance rate < 0.4%)!

[Jun. 2025] Rejoined Meta Reality Labs in Redmond, working on reasoning from multi-modal LLMs to human manipulation from egocentric videos!

[Mar. 2025] One first-author paper on spatial audio-visual reconstruction got accepted by CVPR2025 as a Highlight (acceptance rate < 2.9%)!

[Sep. 2024] One first-author paper on spatial audio-visual reconstruction got accepted by NeurIPS2024!

[Jun. 2024] Joined Meta Reality Labs in Pittsburgh (now XRCIA Social AI Research group) as a research scientist intern, working with Dr. Israel D. Gebru and Dr. Alexander Richard.

[May. 2024] Passed my PhD qualifying exam!

[Sep. 2023] Presented at the ICCV2023 AV4D workshop!

[Sep. 2023] Started my PhD journey in the UW ECE department, NeuroAI Lab!

[Jul. 2023] One first-author paper on audio-visual learning got accepted by ICCV2023!

[Sep. 2022] One co-first-author paper on implicit neural acoustic fields got accepted by NeurIPS2022!

[Jul. 2022] One first-author paper on 3D photo-realistic digital human rendering got accepted by ECCV2022!

[Jan. 2022] Joined NeuroAI Lab, working with Prof. Eli Shlizerman on audio-visual research.

[Sep. 2021] Joined the ECE department at the University of Washington, Seattle, as a master's student.

[Jun. 2021] Joined Sea AI Lab and NUS Learning and Vision Lab as a research intern, working with Prof. Shuicheng Yan and Prof. Jiashi Feng on 3D photo-realistic digital human rendering.

[Mar. 2021] One first-author paper on human-object interaction got accepted by CVPR2021!

[Jul. 2020] Joined SenseTime Research as a research intern, working on human-object interaction.

[Jun. 2020] My thesis on Language-guided Video Retrieval was named an Outstanding Undergraduate Graduation Thesis of Huazhong University of Science and Technology!

[Sep. 2019] Joined ByteDance AI Lab as a Computer Vision Algorithm Intern.

[Jul. 2019] Joined CUHK (Shenzhen) as a research assistant, working with Prof. Chang Wen Chen and Prof. Junsong Yuan on Language-guided Video Retrieval.

Omni-Modal Foundation Models for Realistic 4D Dynamic Scenes:

Multi-Modal LLM for 3D Spatial Reasoning:

Spatial Audio-Visual for 3D Scenes:

3D Vision:

Visual Relationship: