Xi CHEN

I am a second-year Ph.D. student at the University of Hong Kong, supervised by Prof. Hengshuang Zhao. Before joining HKU, I worked as a senior algorithm engineer at Alibaba Group. I received my Master's degree from Zhejiang University in 2020, along with a double master's diploma from Ecole Centrale Mediterranee (France). Before that, I received my B.Eng. from Zhejiang University in 2017.

My research interests lie in deep learning and computer vision. I have published multiple works on image/video perception and open-world multi-modal learning. I am currently interested in AIGC.

I have interned or collaborated with many companies and organizations, including ShLab, Megvii, SenseTime, Hikvision, Ufoto, AISegment, and LIS (France).

Reach out to me via Gmail (chauncey0620) for discussions or opportunities.

CV  /  Google Scholar  /  Github  /  Zhihu

News
  • [Dec. 2023] We release LivePhoto for image-to-video generation.
  • [Dec. 2023] The code of AnyDoor is released here; we will continue to improve it.
  • [Jul. 2023] We release AnyDoor for zero-shot image composition & customization.
  • [Jul. 2023] OPSNet is accepted by ICCV2023.
  • [Mar. 2023] We release the manuscripts of OPSNet and ScribbleSeg.
  • [Mar. 2023] One co-authored paper accepted to CVPR2023.
  • [Feb. 2022] I joined HKU as a Ph.D. student.
Selected Publications ( All Papers )

Google Scholar

(*: Equal contribution)

LivePhoto: Real Image Animation with Text-guided Motion Control
Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao
arXiv, 2023
pdf / page / code / media (AK)

We present LivePhoto, a real image animation method with text control. Unlike previous works, LivePhoto truly listens to the text instructions and preserves the object ID well.

AnyDoor: Zero-shot Object-level Image Customization
Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao
arXiv, 2023
pdf / page / code / media (量子位)

This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations in a harmonious way.

Open-vocabulary Panoptic Segmentation with Embedding Modulation
Xi Chen, Shuang Li, Ser-Nam Lim, Antonio Torralba, Hengshuang Zhao
ICCV, 2023
pdf / page

We present an omnipotent and efficient framework for open-vocabulary panoptic segmentation, which performs strongly in both closed- and open-vocabulary settings with limited training data.

FocalClick: Towards Practical Interactive Image Segmentation
Xi Chen, Zhiyan Zhao, Yilei Zhang, Manni Duan, Donglian Qi, Hengshuang Zhao
CVPR, 2022
pdf / code

FocalClick is a simple and effective solution for interactive segmentation. It substantially reduces computation for various models by focusing on target local regions.

Conditional Diffusion for Interactive Segmentation
Xi Chen, Zhiyan Zhao, Feiwu Yu, Yilei Zhang, Manni Duan
ICCV, 2021
pdf / code

We view interactive segmentation as a diffusion procedure and design feature- and pixel-level diffusion modules for more consistent predictions.

State-Aware Tracker for Real-Time Video Object Segmentation
Xi Chen, Zuoxin Li, Ye Yuan, Gang Yu, Jianxin Shen, Donglian Qi
CVPR, 2020
pdf / code

We propose a novel pipeline called State-Aware Tracker (SAT), which produces accurate segmentation results at real-time speed.

Projects & Resources
Siamese Fully Convolutional Object Tracking
Weizhao Wang, Xinyu Chen, Xi Chen, Yinda Xu, Zeyu Wang
pdf / code

Second-place solution for the VOT2019 real-time track. SiamFCOT serves as a strong pipeline for real-time single-object tracking.

Academic Service

Reviewer / Program Committee Member

  • CVPR (2021, 2022, 2023, 2024)
  • ICCV (2021, 2023)
  • ECCV (2022)
  • NeurIPS (2023)
  • ICLR (2023)
  • AAAI (2022, 2023)

Design and source code from Jon Barron's website