I’m a third-year Ph.D. student in the Multimedia Lab, CUHK, supervised by Prof. Hongsheng Li and Prof. Xiaogang Wang.

My research interests include multimodal reasoning, agentic vision, and unified understanding and generation MLLMs. Please email me if you would like to collaborate on academic research or have any questions.

I will be entering the job market in 2027. Please feel free to reach out if you have opportunities!

🔥 News

📝 Selected Publications

Multi-modal Reasoning & Agentic Vision

Multi-modal Reasoning for Generation

All Publications

Multi-modal Reasoning for Understanding & Generation

Agentic Multi-modal Generation

Multimodal Large Language Models

Diffusion Models

Autonomous Driving

🛠️ Projects

💼 Experience

  • 2025.10 - Present: Research Intern, Seed, ByteDance
  • 2022.11 - 2024.05: Research Intern, Base Model Group, SenseTime

🎓 Education

  • 2023.08 - Present: Ph.D. student in the Multimedia Lab, CUHK
  • 2019.09 - 2023.06: B.Eng. in Computer Science and Technology, Harbin Institute of Technology, Shenzhen

🏆 Awards

  • 2020, 2022: National Scholarship, Ministry of Education, China