About

Yikai Wang is a postdoctoral researcher at Meta, working with Prof. Tao Xiang on advancing vision and generative modeling techniques. Prior to joining Meta, he was a research fellow at MMLab@NTU, Nanyang Technological University, under the guidance of Prof. Chen Change Loy, where he investigated the structural modeling of image generation. He earned his Ph.D. in Statistics, advised by Prof. Yanwei Fu, and his B.S. in Mathematics, both from Fudan University, where he built his foundation in generative computer vision and statistical machine learning.

His research interests span foundation models, computer vision, and statistical machine learning. He focuses primarily on generative vision, multimodal intelligence, and subset selection. He is dedicated to building interpretable, scalable, and data-efficient models capable of generating, manipulating, and understanding complex visual environments.

I am always open to academic discussions and collaboration. Please feel free to reach out via yi-kai.wang@outlook.com, and include a brief introduction of yourself.
Note: My previous institutional emails from NTU and Fudan are no longer active.

News

[2026.01] Our work on ActiveVLA has been accepted by CVPR 2026. Congrats to Zhenyang!
[2026.01] Our works on next visual granularity generation and camera-centric generation have been accepted by ICLR 2026!
[2025.10] Our work on subject repositioning has been awarded a J2C Certification (10% of accepted papers) by TMLR!
[2025.08] Our work on next visual granularity generation is available on arxiv!
[2025.06] Our work on spatial-temporal aware diffusion policy learning has been accepted by ICCV 2025. Congrats to Zhenyang!
[2025.05] Glad to be recognized as an outstanding reviewer for CVPR 2025!
[2025.05] We are organizing the Mobile Intelligent Photography and Imaging workshop at ICCV 2025!
[2025.04] I am glad to be invited to serve as an area chair for NeurIPS 2025.
[2025.04] Our work on image inpainting has been selected as a highlight paper by CVPR 2025!
[2025.03] Received a grant from NVIDIA to support my ongoing research!
[2025.02] Our works on image inpainting and visual grounding have been accepted by CVPR 2025. Congrats to Zhenyang, and a big thank you to my co-authors!
[2025.01] Our works on long-trajectory empty street reconstruction and transformer pruning have been accepted by ICLR 2025. Congrats to Jingwei and Yizhuo!
[2024.12] Attending NeurIPS 2024 in Vancouver, Canada.
[2024.11] Our work on fMRI-vision latent alignment has been accepted by TMLR. Congrats to Prof. Qian!
[2024.10] Our work on subject repositioning has been accepted by TMLR!
[2024.09] Our works on lexical visual-language alignment and visual in-context prompt selection have been accepted by NeurIPS 2024. Congrats to Yifan, Chengming and Chen!
[2024.07] I have joined MMLab@NTU as a postdoc! Looking forward to this new journey!
[2024.07] Our work on consistent fMRI decoding has been accepted by ECCV 2024. Congrats to Jingyang!
[2024.05] I have successfully defended my Ph.D. thesis!

Research

Generative Vision:
- Generation: next visual granularity generation [NVG, ICLR2026], camera-centric generation [Puffin, ICLR2026], amodal completion [C2F-Seg, ICCV2023];
- Inpainting: context-stability & color-consistency [ASUKA, CVPR2025(Highlight), extension], subject repositioning [SEELE, TMLR2024 (J2C Certification)], ref-guided inpainting [LeftRefill, CVPR2024], empty street reconstruction [StreetUnveiler, ICLR2025].
Multimodal Intelligence:
- Vision-action: active perception [ActiveVLA, CVPR2026], spatial-temporal aware policy learning [DP4, ICCV2025];
- Vision-language: visual grounding and reasoning [ReasonGrounder, CVPR2025], lexical alignment [LexVLA, NeurIPS2024];
- Vision-fMRI: cross-subject zero-shot decoding [preprint], consistent fMRI decoding [NeuroPictor, ECCV2024], latent alignment [TMLR2024].
Subset Selection:
- Analysis: false-selection-rate control [Knockoffs-SPR, TPAMI2023], noisy set recovery [ICI, TPAMI2021];
- Application (sparse incidental parameters): out-of-distribution detection [CVPR2024], learning with noisy labels [SPR, CVPR2022], few-shot learning [ICI, CVPR2020];
- Application (other statistical models): network pruning [ICLR2025], in-context-learning [Partial2Global, NeurIPS2024].

Papers

	Aligned Stable Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency. [arxiv] Yikai Wang, Junqiu Yu, Chenjie Cao, Xiangyang Xue, Yanwei Fu. Preprint, 2026.
	The Pictorial Cortex: Zero-Shot Cross-Subject fMRI-to-Image Reconstruction via Compositional Latent Modeling. [arxiv] Jingyang Huo, Yikai Wang, Yanwei Fu, Jianfeng Feng. Preprint, 2026.
	ActiveVLA: Injecting Active Perception into Vision-Language-Action Models for Precise 3D Robotic Manipulation. [arxiv] [paper] [code] [data] [intro] Zhenyang Liu, Yongchong Gu, Yikai Wang, Xiangyang Xue, Yanwei Fu. CVPR, 2026.
	Next Visual Granularity Generation. [arxiv] [paper] [code] [intro] [简介] Yikai Wang, Zhouxia Wang, Zhonghua Wu, Qingyi Tao, Kang Liao, Chen Change Loy. ICLR, 2026.
	Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation. [arxiv] [paper] [code] [intro] [简介] Kang Liao, Size Wu, Zhonghua Wu, Linyi Jin, Chao Wang, Yikai Wang, Fei Wang, Wei Li, Chen Change Loy. ICLR, 2026.
	Spatial-Temporal Aware Visuomotor Diffusion Policy Learning. [arxiv] [paper] [code] [intro] Zhenyang Liu, Yikai Wang, Kuanning Wang, Longfei Liang, Xiangyang Xue, Yanwei Fu. ICCV, 2025.
	Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency. [arxiv] [paper] [full-size PDF] [code & MISATO dataset] [intro] [demo: Youtube, Bilibili] Yikai Wang, Chenjie Cao, Junqiu Yu, Ke Fan, Xiangyang Xue, Yanwei Fu. CVPR*, 2025. (Highlight, 13.5% of accepted papers)
	ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning. [arxiv] [paper] [code & dataset] [intro] Zhenyang Liu, Yikai Wang, Sixiao Zheng, Tongying Pan, Longfei Liang, Yanwei Fu, Xiangyang Xue. CVPR, 2025.
	Adaptive Pruning of Pretrained Transformer via Differential Inclusions. [arxiv] [paper] [code] Yizhuo Ding, Ke Fan, Yikai Wang, Xinwei Sun, Yanwei Fu. ICLR, 2025.
	3D StreetUnveiler with Semantic-aware 2DGS - a simple baseline. [arxiv] [paper] [code] [intro] Jingwei Xu, Yikai Wang, Yiqun Zhao, Yanwei Fu, Shenghua Gao. ICLR, 2025.
	Repositioning the Subject within Image. [arxiv] [paper] [full-size PDF] [code & ReS dataset] [intro] [demo: Youtube, Bilibili] Yikai Wang, Chenjie Cao, Ke Fan, Qiaole Dong, Yifan Li, Xiangyang Xue, Yanwei Fu. TMLR, 2024. (J2C Certification, 10% of accepted papers)
	LEA: Learning Latent Embedding Alignment Model for fMRI Decoding and Encoding. [arxiv] [paper] [code] Xuelin Qian, Yikai Wang, Xinwei Sun, Yanwei Fu, Xiangyang Xue, Jianfeng Feng. TMLR**, 2024.
	Unified Lexical Representation for Interpretable Visual-Language Alignment. [arxiv] [paper] [code] [intro] Yifan Li, Yikai Wang, Yanwei Fu, Dongyu Ru, Zheng Zhang, Tong He. NeurIPS, 2024.
	Towards Global Optimal Visual In-Context Learning Prompt Selection. [arxiv] [paper] [code] [intro] Chengming Xu, Chen Liu, Yikai Wang, Yuan Yao, Yanwei Fu. NeurIPS, 2024.
	NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level Modulation. [arxiv] [paper] [code] [intro] Jingyang Huo, Yikai Wang*, Yun Wang, Xuelin Qian, Chong Li, Yanwei Fu, Jianfeng Feng. ECCV, 2024.
	LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model. [arxiv] [paper] [code] [intro] Chenjie Cao, Yunuo Cai, Qiaole Dong, Yikai Wang, Yanwei Fu. CVPR, 2024.
	Test-Time Linear Out-of-Distribution Detection. [paper] [code] Ke Fan, Tong Liu, Xingyu Qiu, Yikai Wang, Lian Huai, Zeyu Shangguan, Shuang Gou, Fengjian Liu, Yuqian Fu, Yanwei Fu, Xingqun Jiang. CVPR, 2024.
	Coarse-to-Fine Amodal Segmentation with Shape Prior. [arxiv] [paper] [code] [intro] Jianxiong Gao, Xuelin Qian, Yikai Wang, Tianjun Xiao, Tong He, Zheng Zhang, Yanwei Fu. ICCV, 2023.
	Knockoffs-SPR: Clean Sample Selection in Learning with Noisy Labels. [arxiv] [paper] [code] [intro] [简介] Yikai Wang, Yanwei Fu, Xinwei Sun. TPAMI, 2023.
	Scalable Penalized Regression for Noise Detection in Learning with Noisy Labels. [arxiv] [paper] [code] [intro] [简介] Yikai Wang, Xinwei Sun, Yanwei Fu. CVPR, 2022.
	How to Trust Unlabeled Data? Instance Credibility Inference for Few-Shot Learning. [arxiv] [paper] [code] [intro] [简介] Yikai Wang, Li Zhang, Yuan Yao, Yanwei Fu. TPAMI, 2021.
	Instance Credibility Inference for Few-Shot Learning. [arxiv] [paper] [code] [intro] [简介] Yikai Wang, Chengming Xu, Chen Liu, Li Zhang, Yanwei Fu. CVPR, 2020.

Talks

	Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model. [slides] [Bilibili video (in Chinese, starts at 1:18:40)] Oral presentation at Pre-CVPR@Shanghai, 2024.05.
	Advancing Image Inpainting: From Versatility to Consistency. [slides] S-Lab at Nanyang Technological University, 2024.04.
	Clean Sample Selection Algorithms with Statistical Sparsity Analysis. [slides] EML Munich Group at Technical University of Munich, 2024.03; MLCV Group at Institute of Science and Technology Austria, 2024.04.
	Few-shot Learning by Statistical Methods. [slides] CVPR 2023 tutorial, 2023.06: Few-shot Learning from Meta-Learning, Statistical Understanding to Applications.
	Sparse Learning for Noisy Data Detection. [Youtube][Bilibili][slides] CVPR 2022 tutorial, 2022.06: Sparse Learning in Neural Networks and Robust Statistical Analysis.

Grants and Awards

Academic Grant, NVIDIA, 2025.
Outstanding Reviewer: NeurIPS 2022, CVPR 2025.
Outstanding Graduates, Shanghai, 2024.
Outstanding Freshmen, Fudan University, 2015.

Timeline

Postdoctoral Researcher, 2025 - present, Meta.
Research Fellow, 2024 - 2025, Nanyang Technological University.
Ph.D. in Statistics, 2019 - 2024, Fudan University.
B.S. in Mathematics, 2015 - 2019, Fudan University.

Service

Area chair: NeurIPS 2025.
Reviewer:
- CVPR 2023, 2024, 2025, 2026;
- ICCV 2023, 2025;
- ECCV 2024;
- SIGGRAPH 2026;
- NeurIPS 2022, 2023, 2024;
- ICLR 2024, 2025, 2026;
- TPAMI 2025;
- TIP 2024;
- TMLR 2023, 2024;
- ICML 2023, 2024.
Co-organizer:
- ICCV 2025 workshop: Mobile Intelligent Photography and Imaging;
- CVPR 2023 tutorial: Few-shot Learning from Meta-Learning, Statistical Understanding to Applications;
- CVPR 2022 tutorial: Sparse Learning in Neural Networks and Robust Statistical Analysis.

Teaching

Neural Network and Deep Learning, TA, Spring semesters 2019-2022, Fudan University.