Selected Publications
* denotes co-first author; † denotes corresponding author. Full list on Google Scholar.
Preprints
Purify Once, Edit Freely: Breaking Image Protections under Model Mismatch
2026.03
Qichen Zhao, Shengfang Zhai†, Xinjian Bai, Qingni Shen, Qiqi Lin, Yansong Gao, Zhonghai Wu
MemPot: Defending Against Memory Extraction Attack with Optimized Honeypots
2026.02
Yuhao Wang, Shengfang Zhai†, Guanghao Jin, Yinpeng Dong, Linyi Yang, Jiaheng Zhang
Life-Cycle Routing Vulnerabilities of LLM Router
2025.03
Qiqi Lin, Xiaoyang Ji, Shengfang Zhai†, Qingni Shen, Zhi Zhang, Yuejian Fang, Yansong Gao
Conferences
Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries
ICLR 2026
Yuhao Wang*, Wenjie Qu*, Shengfang Zhai*†, Yanze Jiang, Zichen Liu, Yue Liu, Yinpeng Dong, Jiaheng Zhang
Unshaken by Weak Embedding: Robust Probabilistic Watermarking for Dataset Copyright Protection
NDSS 2026
Shang Wang, Tianqing Zhu, Dayong Ye, Hua Ma, Bo Liu, Ming Ding, Shengfang Zhai, Yansong Gao
GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning
NeurIPS 2025
Yue Liu, Shengfang Zhai, Mingzhe Du, Yulin Chen, Tri Cao, Hongcheng Gao, Cheng Wang, Xinfeng Li, Kun Wang, Junfeng Fang, Jiaheng Zhang, Bryan Hooi
Efficient Input-level Backdoor Defense on Text-to-Image Synthesis via Neuron Activation Variation
ICCV 2025 Highlight (~2.3% acceptance rate)
Shengfang Zhai, Jiajun Li, Yue Liu, Huanran Chen, Zhihua Tian, Wenjie Qu, Qingni Shen, Ruoxi Jia, Yinpeng Dong, Jiaheng Zhang
Membership Inference on Text-to-Image Diffusion Models via Conditional Likelihood Discrepancy
NeurIPS 2024
Shengfang Zhai, Huanran Chen, Yinpeng Dong, Jiajun Li, Qingni Shen, Yansong Gao, Hang Su, Yang Liu
Text-to-Image Diffusion Models Can Be Easily Backdoored through Multimodal Data Poisoning
ACM MM 2023 Oral
Shengfang Zhai, Yinpeng Dong, Qingni Shen, Shi Pu, Yuejian Fang, Hang Su
NCL: Textual Backdoor Defense Using Noise-Augmented Contrastive Learning
ICASSP 2023
Shengfang Zhai, Qingni Shen, Xiaoyi Chen, Weilong Wang, Cong Li, Yuejian Fang, Zhonghai Wu
Kallima: A Clean-Label Framework for Textual Backdoor Attacks
ESORICS 2022
Xiaoyi Chen, Yinpeng Dong, Zeyu Sun, Shengfang Zhai, Qingni Shen, Zhonghai Wu
Automated Extraction of ABAC Policies from Natural-Language Documents in Healthcare Systems
IEEE BIBM 2022
Yutang Xia, Shengfang Zhai, Qinting Wang, Huiting Hou, Zhonghai Wu, Qingni Shen