Hello! I am a Ph.D. student in Computer Science at Boston University, where I am fortunate to be advised by Prof. Boqing Gong.
I completed my Bachelor's degree in Computer Science and Technology at Tsinghua University. My research focuses on computer vision, with a particular interest in vision-language foundation models and their
applications in understanding and generating multimodal content.
BabyVLM-V2 is an infant-inspired vision-language framework with a longitudinal pretraining corpus and a cognitive benchmark,
showing that compact models trained from scratch can achieve competitive performance.
We curate a benchmark by combining a bottom-up survey with a top-down strategy that draws on three classification systems
from cognitive science. The benchmark includes 31 computer vision datasets and 8,000+ evaluation samples.
Our single-view 3D reconstruction method, Slice3D, predicts multi-slice images to reveal occluded parts without
changing the camera (in contrast to multi-view synthesis), and then lifts the slices into a 3D model.
Education
Boston University, Boston, MA
Ph.D. in Computer Science, 2024 - Present
Advisor: Prof. Boqing Gong
Tsinghua University, Beijing, China
B.Eng. in Computer Science and Technology, 2020 - 2024
Miscellanea
I enjoy photography in my spare time. Check out my gallery!
I have served as a reviewer for various conferences and journals, including CVPR and ICCV.