I am Jianzong Wu (吴健宗), a PhD student at the School of Intelligence Science and Technology, Peking University (PKU), advised by Prof. Yunhai Tong. Previously, I obtained my bachelor’s degree from the University of Science and Technology of China (USTC).

My research focuses on multi-modal learning, including feature alignment, scene understanding, and content generation. So far, I have worked on referring image segmentation, open-vocabulary image segmentation, text-to-image editing, multi-modal large language models, and several related areas.

🔥 News

  • 2024.02:  🎉🎉 Towards Language-Driven Video Inpainting via Multimodal Large Language Models has been accepted to CVPR!
  • 2024.02:  🎉🎉 Towards Robust Referring Image Segmentation has been accepted to TIP!
  • 2024.01:  🎉🎉 Towards Open Vocabulary Learning: A Survey has been accepted to TPAMI!
  • 2023.07:  🎉🎉 CGG has been accepted to ICCV 2023!

📝 Publications

The full list of publications, including arXiv papers, can be found here.

* means equal contribution.

Selected Publications

NeurIPS 2024 Spotlight

MotionBooth: Motion-Aware Customized Text-to-Video Generation

Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen

Project | Code

  • We present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements.
CVPR 2024

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Zewei Liu, Chen Change Loy

Project | Code

  • Novel language-driven video inpainting task, dataset, and model.
TPAMI

Towards Open Vocabulary Learning: A Survey

Jianzong Wu*, Xiangtai Li*, Shilin Xu*, Haobo Yuan, Henghui Ding, Xia Li, Jiangning Zhang, Yunhai Tong, Xudong Jiang, Bernard Ghanem, Dacheng Tao

Code

  • A survey on open vocabulary learning.
ICCV 2023

Betrayed by Captions: Joint Caption Grounding and Generation for Open Vocabulary Instance Segmentation

Jianzong Wu*, Xiangtai Li*, Henghui Ding, Xia Li, Guangliang Cheng, Yunhai Tong, Chen Change Loy

Code

  • Query-based open vocabulary segmentation aided by caption generation.
TIP

Towards Robust Referring Image Segmentation

Jianzong Wu, Xiangtai Li, Xia Li, Henghui Ding, Yunhai Tong, Dacheng Tao

Code

  • Novel robust referring image segmentation (R-RIS) task, dataset, and model.

📖 Education

  • 2021.07 - now, PhD student at Peking University (PKU)
  • 2017.09 - 2021.07, Bachelor’s degree, University of Science and Technology of China (USTC)

💻 Internships