News
2024-07: One paper is accepted to ECCV 2024.
2024-01: One paper is accepted to ICLR 2024 as a Spotlight presentation.
2023-09: One paper is accepted to NeurIPS 2023.
2023-04: One paper is accepted to ICML 2023.
2023-01: One paper is accepted to ICLR 2023.
Publications
* indicates equal contribution
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
Yuang Peng *,
Yuxin Cui *,
Haomiao Tang *,
Zekun Qi ,
Runpei Dong ,
Jing Bai ,
Chunrui Han ,
Zheng Ge ,
Xiangyu Zhang ,
Shu-Tao Xia
arXiv preprint, 2024
[arXiv]
[Project]
[Code]
We collect diverse images and prompts, and utilize GPT-4o for automated evaluation aligned with human preferences.
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi ,
Runpei Dong ,
Shaochen Zhang ,
Haoran Geng ,
Chunrui Han ,
Zheng Ge ,
He Wang ,
Li Yi ,
Kaisheng Ma
European Conference on Computer Vision (ECCV), 2024
[arXiv]
[Project]
[Code]
[Huggingface]
We present ShapeLLM, the first 3D Multimodal Large Language Model designed for embodied interaction, exploring universal 3D object understanding via 3D point clouds and language.
DreamLLM: Synergistic Multimodal Comprehension and Creation
Runpei Dong *,
Chunrui Han *,
Yuang Peng ,
Zekun Qi ,
Zheng Ge ,
Jinrong Yang ,
Liang Zhao ,
Jianjian Sun ,
Hongyu Zhou ,
Haoran Wei ,
Xiangwen Kong ,
Xiangyu Zhang ,
Kaisheng Ma ,
Li Yi
International Conference on Learning Representations (ICLR), 2024, Spotlight
[arXiv]
[Project]
[Code]
[Huggingface]
We present DreamLLM, a learning framework that first achieves versatile Multimodal Large Language Models empowered by the frequently overlooked synergy between multimodal comprehension and creation.
VPP⚡: Efficient Conditional 3D Generation via Voxel-Point Progressive Representation
Zekun Qi *,
Muzhou Yu *,
Runpei Dong ,
Kaisheng Ma
Conference on Neural Information Processing Systems (NeurIPS), 2023
[arXiv]
[Code]
[OpenReview]
We achieve rapid, multi-category 3D conditional generation by sharing the merits of different representations. VPP can generate a 3D shape in less than 0.2 s on a single RTX 2080 Ti.
Point-GCC: Universal Self-supervised 3D Scene Pre-training via Geometry-Color Contrast
Guofan Fan ,
Zekun Qi ,
Wenkai Shi ,
Kaisheng Ma
arXiv preprint, 2023
[arXiv]
[Code]
We enhance the utilization of color information to improve self-supervised learning of 3D scenes.
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining
Zekun Qi *,
Runpei Dong *,
Guofan Fan ,
Zheng Ge ,
Xiangyu Zhang ,
Kaisheng Ma ,
Li Yi
International Conference on Machine Learning (ICML), 2023
[arXiv]
[Code]
[OpenReview]
We propose contrastive learning guided by generative reconstruction to mitigate the pattern differences between the two self-supervised paradigms.
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Runpei Dong ,
Zekun Qi ,
Linfeng Zhang ,
Junbo Zhang ,
Jianjian Sun ,
Zheng Ge ,
Li Yi ,
Kaisheng Ma
International Conference on Learning Representations (ICLR), 2023
[arXiv]
[Code]
[OpenReview]
We propose to use autoencoders as cross-modal teachers to transfer dark knowledge into 3D representation learning.
Honors and Awards
2022 Outstanding Graduate, Xi’an Jiaotong University
2021 Annual Spiritual Civilization Award, Xi’an Jiaotong University
2020 National runner-up in the China Undergraduate Physics Tournament (CUPT), as team leader
2019 Chen Qi Scholarship, Xi’an Jiaotong University