I am a fifth-year PhD student in the Department of Automation at Tsinghua University, advised by Prof. Jiwen Lu.
In 2020, I obtained my B.Eng. from the Department of Electronic Engineering, Tsinghua University.
I also obtained a B.Admin. as a dual degree from the School of Economics and Management, Tsinghua University.
I am broadly interested in computer vision and deep learning. My current research focuses on 3D vision, 3D generation, and 4D world models.
OGGSplat is designed to expand the field of view of Gaussian-based 3D scenes reconstructed from sparse views with feed-forward, generalizable models.
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting
Ziyi Wang*, Yanran Zhang*, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[arXiv][Code]
UniPre3D is a unified pre-training method that can be applied to both object-level and scene-level point clouds. It builds on a cross-modal Gaussian splatting technique.
XMask3D: Cross-modal Mask Reasoning for Open Vocabulary 3D Semantic Segmentation
Ziyi Wang*, Yanbo Wang*, Xumin Yu, Jie Zhou, Jiwen Lu
Conference on Neural Information Processing Systems (NeurIPS), 2024
[arXiv][Code]
XMask3D is a framework that proposes mask-level reasoning techniques to give 3D segmentation models open-vocabulary capability with the assistance of a pre-trained 2D mask generator.
P2P is a framework that leverages large-scale pre-trained image models for 3D point cloud analysis.
SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
Ziyi Wang, Yongming Rao, Xumin Yu, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[arXiv][Code]
We present the Semantic-Affine Transformation, which transforms the mid-level decoder features of an encoder-decoder segmentation network with class-specific affine parameters.
PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers
Xumin Yu*, Yongming Rao*, Ziyi Wang, Zuyan Liu, Jiwen Lu, Jie Zhou
IEEE International Conference on Computer Vision (ICCV), 2021
Oral Presentation [arXiv][Code][Explanation in Chinese]
PoinTr is a transformer-based framework that reformulates point cloud completion as a set-to-set translation problem.
We present deep interpretable metric learning (DIML), which adopts a structural matching strategy that explicitly aligns the spatial embeddings by computing an optimal matching flow between the feature maps of the two images.
PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
Yi Wei*, Ziyi Wang*, Yongming Rao*, Jiwen Lu, Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021
[arXiv][Code]
We present point-voxel correlation fields for 3D scene flow estimation, which carry over the high performance of RAFT and provide a way to build structured all-pairs correlation fields for unstructured point clouds.
Teaching
Teaching Assistant, Computer Vision, 2024 Spring Semester
Teaching Assistant, Pattern Recognition and Machine Learning, 2022 Fall Semester
Honors and Awards
2024 National Scholarship, Tsinghua University
2023 ChangXin Memory Scholarship, Tsinghua University
2023 CVPR Outstanding Reviewer
2021 Haining Talent Scholarship, Tsinghua University
2020 Excellent Graduation Thesis, Tsinghua University
2018 Zheng Geru Scholarship, Tsinghua University
2017 Hongqian Electronics Scholarship, Tsinghua University