Shuo Wang

Lab for Data Science
School of Information Science and Technology
University of Science and Technology of China

Email: shuowangcv@ustc.edu.cn


Hello, I’m Shuo Wang! I am currently an Associate Research Fellow at the School of Information Science and Technology, University of Science and Technology of China (USTC). My research interests center on multimodal content analysis, model lightweighting, and multimodal large models. I develop efficient methods for processing and analyzing complex multimedia data, with applications in large-scale multimedia retrieval, data embedding, and advanced video understanding.

Selected Publications

Hybrid Granularity Distribution Estimation for Few-Shot Learning: Statistics Transfer from Categories and Instances
Shuo Wang, Tianyu Qi, Xingyu Zhu, Yanbin Hao, Beier Zhu, Hanwang Zhang, Meng Wang
IEEE Transactions on Image Processing, 2026

Hierarchical Semantic Alignment for Image Clustering
Xingyu Zhu, Beier Zhu, Yunfan Li, Junfeng Fang, Shuo Wang*, Kesen Zhao, Hanwang Zhang
AAAI, 2026, *Corresponding Author

Enhancing CLIP Robustness via Cross-Modality Alignment
Xingyu Zhu, Beier Zhu, Shuo Wang*, Kesen Zhao, Hanwang Zhang
NeurIPS, 2025, *Corresponding Author

Accelerating Diffusion Transformer via Error-Optimized Cache
Junxiang Qiu, Shuo Wang*, Jinda Lu, Lin Liu, Houcheng Jiang, Yanbin Hao
ACM MM, 2025, *Corresponding Author

Accelerating Diffusion Transformer via Gradient-Optimized Cache
Junxiang Qiu, Lin Liu, Shuo Wang*, Jinda Lu, Kezhou Chen, Yanbin Hao
ICCV, 2025, *Corresponding Author

Dynamic Multimodal Prototype Learning in Vision-Language Models
Xingyu Zhu, Shuo Wang*, Beier Zhu, Miaoge Li, Yunfan Li, Junfeng Fang, Zhicai Wang, Dongsheng Wang, Hanwang Zhang
ICCV, 2025, *Corresponding Author

Symmetric Hallucination with Knowledge Transfer for Few-shot Learning
Shuo Wang, Xinyu Zhang, Meng Wang, Xiangnan He
IEEE Transactions on Multimedia, 2024

Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting
Xingyu Zhu, Beier Zhu, Yi Tan, Shuo Wang*, Yanbin Hao, Hanwang Zhang
NeurIPS (Spotlight), 2024, *Corresponding Author

Selective Vision-Language Subspace Projection for Few-shot CLIP
Xingyu Zhu, Beier Zhu, Yi Tan, Shuo Wang*, Yanbin Hao, Hanwang Zhang
ACM MM, 2024, *Corresponding Author

Feature Mixture on Pre-Trained Model for Few-Shot Learning
Shuo Wang, Jinda Lu, Haiyang Xu, Yanbin Hao, Xiangnan He
IEEE Transactions on Image Processing, 2024

Boosting Few-Shot Learning via Attentive Feature Regularization
Xingyu Zhu, Shuo Wang*, Jinda Lu, Yanbin Hao, Haifeng Liu, Xiangnan He
AAAI, 2024, *Corresponding Author

Semantic-based Selection, Synthesis, and Supervision for Few-shot Learning
Jinda Lu, Shuo Wang*, Xinyu Zhang, Yanbin Hao, Xiangnan He*
ACM MM, 2023, *Corresponding Author

Spatio-Temporal Collaborative Module for Efficient Action Recognition
Yanbin Hao, Shuo Wang*, Yi Tan, Xiangnan He, Zhenguang Liu, Meng Wang
IEEE Transactions on Image Processing, 2022, *Corresponding Author

Multi-directional Knowledge Transfer for Few-shot Learning
Shuo Wang, Xinyu Zhang, Yanbin Hao, Chengbing Wang, Xiangnan He
ACM MM, 2022

Attention in Attention: Modeling Context Correlation for Efficient Video Classification
Yanbin Hao, Shuo Wang*, Pei Cao, Xinjian Gao, Tong Xu, Jinmeng Wu, Xiangnan He
IEEE Transactions on Circuits and Systems for Video Technology, 2022, *Corresponding Author

Large-scale Few-shot Learning via Multi-modal Knowledge Discovery
Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian, Meng Wang
ECCV, 2020

Dense Temporal Convolution Network for Sign Language Translation
Dan Guo, Shuo Wang, Qi Tian, Meng Wang
IJCAI, 2019

Connectionist Temporal Fusion for Sign Language Translation
Shuo Wang, Dan Guo, Wengang Zhou, Zhengjun Zha, Meng Wang
ACM MM, 2018

Method and Apparatus for Training Classifier
Shuo Wang, Jun Yue, Jianzhuang Liu, Qi Tian
US Patent App. 17/892,908, 2023  

Other Publications (for the full list, see Google Scholar)

Look Carefully: Adaptive Visual Reinforcements in Multimodal Large Language Models for Hallucination Mitigation
Xingyu Zhu, Kesen Zhao, Liang Yi, Shuo Wang, Zhicai Wang, Beier Zhu, Hanwang Zhang, Xiangnan He
ICLR, 2026

GuardAlign: Robust Safety Alignment in Multimodal Large Language Models
Xingyu Zhu, Beier Zhu, Junfeng Fang, Shuo Wang, Yin Zhang, Xiang Wang, Xiangnan He
ICLR, 2026

Accelerating Controllable Generation via Hybrid-grained Cache
Lin Liu, Huixia Ben, Shuo Wang, Jinda Lu, Junxiang Qiu, Shengeng Tang, Yanbin Hao
AAAI, 2026

Cross-Modal Feature Enhancement and Contrastive Alignment for Micro-Gesture Recognition
Tuyun Shang, Yanbin Hao, Ming Pei, Kun Li, Huixia Ben, Shuo Wang
The 8th Chinese Conference on Pattern Recognition and Computer Vision, 2025

Interventional Feature Generation for Few-shot Learning
Shuo Wang, Jinda Lu, Huixia Ben, Yanbin Hao, Xingyu Gao, Meng Wang
ACM Transactions on Multimedia Computing, Communications and Applications, 2025

Multimodal Generation with Consistency Transferring
Junxiang Qiu, Jinda Lu, Shuo Wang
Findings of the Association for Computational Linguistics: NAACL 2025, 2025

Mixture of Multimodal Adapters for Sentiment Analysis
Kezhou Chen, Huixia Ben, Shuo Wang, Shengeng Tang, Yanbin Hao
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2025

Linguistics-Vision Monotonic Consistent Network for Sign Language Production
Xu Wang, Shengeng Tang, Peipei Song, Shuo Wang, Dan Guo, Richang Hong
International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2025

Gloss-Driven Conditional Diffusion Models for Sign Language Production
Shengeng Tang, Feng Xue, Jingjing Wu, Shuo Wang, Richang Hong
ACM Transactions on Multimedia Computing, Communications and Applications, 2025

DAMO: Data- and Model-aware Alignment of Multi-modal LLMs
Jinda Lu, Junkang Wu, Jinghan Li, Xiaojun Jia, Shuo Wang, YiFan Zhang, Junfeng Fang, Xiang Wang, Xiangnan He
ICML, 2025

Video Corpus Moment Retrieval with Query-specific Context Learning and Progressive Localization
Long Zhang, Peipei Song, Zhangling Duan, Shuo Wang, Xiaojun Chang, Xun Yang
IEEE Transactions on Circuits and Systems for Video Technology, 2025

CVLP-NaVD: Contrastive Visual-Language Pre-training Models for Non-annotated Visual Description
Haoran Li, Yanbin Hao, Jiarui Yu, Bin Zhu, Shuo Wang, Tong Xu
ACM Transactions on Multimedia Computing, Communications and Applications, 2024

Pseudo Content Hallucination for Unpaired Image Captioning
Huixia Ben, Shuo Wang*, Meng Wang, Richang Hong
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024, *Corresponding Author

Enhancing Recipe Retrieval with Foundation Models: A Data Augmentation Perspective
Fangzhou Song, Bin Zhu, Yanbin Hao, Shuo Wang
European Conference on Computer Vision, 2024

JPA: A Joint-Part Attention for Mitigating Overfocusing on 3D Human Pose Estimation
Dengqing Yang, Zhenhua Tang, Jinmeng Wu, Shuo Wang, Lechao Cheng, Yanbin Hao
Chinese Conference on Pattern Recognition and Computer Vision (PRCV), 2024

GLCM-Adapter: Global-Local Content Matching for Few-shot CLIP Adaptation
Shuo Wang, Enlong Xie, Jinda Lu, Jinghan Li, Yanbin Hao
Proceedings of the 35th British Machine Vision Conference (BMVC), 2024

Hierarchical Supervised Contrastive Learning for Multimodal Sentiment Analysis
Kezhou Chen, Shuo Wang*, Yanbin Hao
International Conference on Multimedia Modeling, 2024, *Corresponding Author

Boosting Hyperspectral Image Classification with Dual Hierarchical Learning
Shuo Wang, Huixia Ben, Yanbin Hao, Xiangnan He, Meng Wang
ACM Transactions on Multimedia Computing, Communications and Applications, 2023

Space-time Separate Modeling for Efficient Video Classification
Pei Cao, Shuo Wang*, Jinmeng Wu, Yanbin Hao
Journal of Physics: Conference Series, 2021, *Corresponding Author

Cross-modality Retrieval by Joint Correlation Learning
Shuo Wang, Dan Guo, Xin Xu, Li Zhuo, Meng Wang
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 2019  

Grants

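2026, Ongoing, National Natural Science Foundation of China (NSFC), General Program, Principal Investigator, 2026.01-2029.12
2025, Ongoing, Anhui Provincial Natural Science Foundation, General Program, Principal Investigator, 2025.09-2028.09
2024, Ongoing, Yangtze River Delta Science and Technology Innovation Community Joint Research Project, Project and Sub-project Lead, 2024.12-2027.11
2023, Completed, National Natural Science Foundation of China (NSFC), Young Scientists Fund (Category C), Principal Investigator, 2023.01-2024.12
2021, Completed, Anhui University Collaborative Innovation Project, Co-lead and Project Lead, 2021.08-2023.08
2020, Completed, JKW National Defense Science and Technology Innovation Project, Project Lead, 2020.12-2023.08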

Professional Services

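  • Executive Committee Member, Technical Committee on Multimedia Technology, China Computer Federation (CCF)
  • Committee Member, Technical Committee on Social Media Processing, Chinese Information Processing Society of China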
  • Guest Editor of Electronics (Special Issue: Artificial Intelligence for Smart Image Perception, Recognition and Understanding)

Education and Experiences

University of Science and Technology of China (USTC)
Postdoctoral Research Fellow, Mar. 2021 - Mar. 2023, Hefei, Anhui, China
Advisor: Prof. Xiangnan He
Hefei University of Technology (HFUT)
Ph.D. Student in Signal and Information Processing, Sep. 2015 - Jan. 2021, Hefei, Anhui, China
Advisor: Prof. Meng Wang
Hefei University of Technology (HFUT)
Bachelor's Degree in Electronics Engineering, Sep. 2011 - Jun. 2015, Hefei, Anhui, China
Advisor: Prof. Meng Wang

Last update: 28 Jan. 2026. The webpage template is borrowed from Xiangnan He.