Shuai Shao (邵帅)
Principal Researcher, Tencent Hunyuan
Biography
Shuai Shao is a Principal Researcher at Tencent Hunyuan, reporting to Dr. Qinglin Lu. His research focuses on Multimodal Content Understanding and AI-Generated Content. He is particularly passionate about developing robust systems that work in real-world scenarios.
Shuai hails from Changchun, China, and earned his Bachelor of Science degree from Jilin University in 2017. After graduating, he began his career at Megvii Research, where he was fortunate to be mentored by Dr. Gang Yu and supervised by Dr. Jian Sun. He then spent four years at ByteDance, where he was mentored by Dr. Zehuan Yuan.
Shuai has been competing in programming contests since his high school years. He earned a gold medal in the 2013 ACM-ICPC Asia Regional Contest and secured the 19th position in the 2014 ACM-ICPC World Finals. Subsequently, he served as the coach for Jilin University’s ACM-ICPC team until his graduation.
Recent News
- [Feb. 2026] One paper is accepted by CVPR 2026.
- [May 2025] One paper is accepted by Pattern Recognition.
- [Dec. 2023] One paper is accepted by AAAI 2024.
- [Apr. 2023] One paper is accepted by SIGIR 2023.
- [Feb. 2022] One paper is accepted by TIP.
- [Apr. 2019] We published a new dataset Objects365: A Large-scale, High-quality Dataset for Object Detection.
- [Mar. 2019] We are organizing a workshop, Detection In the Wild Challenge Workshop 2019, in conjunction with CVPR 2019.
- [Dec. 2018] A micro documentary about me on JSTV (in Chinese).
- [May 2018] We published a new dataset, CrowdHuman, a benchmark for detecting humans in a crowd.
Awards
- Top Winner of WIDER Face in the WIDER Challenge, 2018.
- Top Winner of Places Instance Segmentation in COCO + Places 2017 Challenges, 2017.
- 19th Place in the 2014 ACM-ICPC World Finals, 2014.
- Gold Medal (2nd Place) in the 2013 ACM-ICPC Asia Regional Contest, 2013.
Professional Activities
- Reviewer for IJCV, TIP, CVPR, ICCV, ECCV, ACM MM, SIGGRAPH Asia.
- Organizer of the Detection In the Wild Challenge Workshop (DIW) at CVPR 2019.
Publications
Conferences
EffectMaker: Unifying Reasoning and Generation for Customized Visual Effect Creation.
Shiyuan Yang, Ruihuang Li, Jiale Tao, Shuai Shao, Qinglin Lu, Jing Liao.
CVPR, 2026.
EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE.
Junyi Chen, Longteng Guo, Jia Sun, Shuai Shao, Zehuan Yuan, Liang Lin, Dongyu Zhang.
AAAI, 2024.
MAMO: Fine-Grained Vision-Language Representations Learning with Masked Multimodal Modeling.
Zijia Zhao*, Longteng Guo*, Xingjian He, Shuai Shao, Zehuan Yuan, Jing Liu.
SIGIR, 2023.
Objects365: A large-scale, high-quality dataset for object detection.
Shuai Shao*, Zeming Li*, Tianyuan Zhang*, Chao Peng*, Gang Yu, Xiangyu Zhang, Jing Li, Jian Sun.
ICCV, 2019.
Shape Robust Text Detection with Progressive Scale Expansion Network.
Wenhai Wang*, Enze Xie*, Xiang Li*, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao.
CVPR, 2019.
Scene Text Detection with Supervised Pyramid Context Network.
Enze Xie*, Yuhang Zang*, Shuai Shao, Gang Yu, Cong Yao, Guangyao Li.
AAAI, 2019.
Repulsion Loss: Detecting Pedestrians in a Crowd.
Xinlong Wang, Tete Xiao, Yuning Jiang, Shuai Shao, Jian Sun, Chunhua Shen.
CVPR, 2018.
Journals
ChatSearch: A dataset and a generative retrieval model for general conversational image retrieval.
Zijia Zhao, Longteng Guo, Tongtian Yue, Erdong Hu, Shuai Shao, Zehuan Yuan, Hua Huang, Jing Liu.
Pattern Recognition, 2025.
Birds of a Feather Flock Together: Category-Divergence Guidance for Domain Adaptive Segmentation.
Bo Yuan, Danpei Zhao, Shuai Shao, Zehuan Yuan, Changhu Wang.
IEEE Transactions on Image Processing, 2022.
Pre-prints
HY-WU (Part I): An Extensible Functional Neural Memory Framework and An Instantiation in Text-Guided Image Editing.
https://github.com/Tencent-Hunyuan/HY-WU, 2026.
VisionCreator: A Native Visual-Generation Agentic Model with Understanding, Thinking, Planning and Creation.
Jinxiang Lai, Zexin Lu, Jiajun He, Rongwei Quan, Wenzhe Zhao, Qinyu Yang, Qi Chen, Qin Lin, Chuyue Li, Tao Gao, Yuhao Shan, Shuai Shao, Song Guo, Qinglin Lu.
arXiv preprint arXiv:2603.02681, 2026.
OmniVideo-R1: Reinforcing Audio-visual Reasoning with Query Intention and Modality Attention.
Zhangquan Chen, Jiale Tao, Ruihuang Li, Yihao Hu, Ruitao Chen, Zhantao Yang, Xinlei Yu, Haodong Jing, Manyuan Zhang, Shuai Shao, Biao Wang, Qinglin Lu, Ruqi Huang.
arXiv preprint arXiv:2602.05847, 2026.
TAGRPO: Boosting GRPO on Image-to-Video Generation with Direct Trajectory Alignment.
Jin Wang*, Jianxiang Lu*, Guangzheng Xu, Comi Chen, Haoyu Yang, Linqing Wang, Peng Chen, Mingtao Chen, Zhichao Hu, Longhuang Wu, Shuai Shao, Qinglin Lu, Ping Luo.
arXiv preprint arXiv:2601.05729, 2026.
Rotate Your Character: Revisiting Video Diffusion Models for High-Quality 3D Character Generation.
Jin Wang*, Jianxiang Lu*, Comi Chen, Guangzheng Xu, Haoyu Yang, Peng Chen, Na Zhang, Yifan Xu, Longhuang Wu, Shuai Shao, Qinglin Lu, Ping Luo.
arXiv preprint arXiv:2601.05722, 2026.
Hunyuan-GameCraft-2: Instruction-following Interactive Game World Model.
Junshu Tang*, Jiacheng Liu*, Jiaqi Li*, Longhuang Wu, Haoyu Yang, Penghao Zhao, Siruis Gong, Xiang Yuan, Shuai Shao, Qinglin Lu.
arXiv preprint arXiv:2511.23429, 2025.
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition.
Jiaqi Li*, Junshu Tang*, Zhiyong Xu, Longhuang Wu, Yuan Zhou, Shuai Shao, Tianbao Yu, Zhiguo Cao, Qinglin Lu.
arXiv preprint arXiv:2506.17201, 2025.
Hunyuan-Game: Industrial-grade Intelligent Game Creation Model.
arXiv preprint arXiv:2505.14135, 2025.
ChatBridge: Bridging modalities with large language model as a language catalyst.
Zijia Zhao, Longteng Guo, Tongtian Yue, Sihan Chen, Shuai Shao, Xinxin Zhu, Zehuan Yuan, Jing Liu.
arXiv preprint arXiv:2305.16103, 2023.
CrowdHuman: A Benchmark for Detecting Human in a Crowd.
Shuai Shao, Zijian Zhao, Boxun Li, Tete Xiao, Gang Yu, Xiangyu Zhang, Jian Sun.
arXiv preprint arXiv:1805.00123, 2018.
Object detection via end-to-end integration of aspect ratio and context aware part-based models and fully convolutional networks.
Bo Li, Tianfu Wu, Shuai Shao, Lun Zhang, Rufeng Chu.
arXiv preprint arXiv:1612.00534, 2016.
* indicates equal contribution.
Links
Research Collaborators:
Mr. Yuning Jiang, Zeming Li (黎泽明), Lan-Zhe Guo (郭兰哲), Tianyuan Zhang (张天远), Enze Xie (谢恩泽), Xinlong Wang (王鑫龙), Changhu Wang (王长虎), Longteng Guo (郭龙腾), Changqian Yu (余昌黔), Bo Yuan (苑博), Yiping Bao (鲍一平), Feng Wang (王枫), Limeng Qiao (乔李盟), Junyi Chen (陈浚毅), Junshu Tang (唐俊姝), Ruihuang Li (李蕊煌)