Zhenghao Zhao

Bio

I am currently a Ph.D. candidate in Computer Science at the University of Illinois Chicago (UIC), advised by Prof. Yan Yan.

My research focuses on data-centric machine learning, with interests in dataset distillation, synthetic data generation, and data selection for LLMs. My work has been published in top-tier conferences such as CVPR, ICCV, ECCV, and NeurIPS, covering topics such as dataset distillation for image and multimodal datasets, synthetic dataset generation, and algorithms for data selection.

I interned at Argonne National Laboratory in 2023 and 2025. In 2023, I studied the performance of distributed training frameworks such as PyTorch DDP, Horovod, and DeepSpeed. In 2025, I returned to Argonne to work on LLM training on high-performance computing (HPC) platforms, where I conducted a comparative study of DeepSpeed, TorchTitan, and FSDP for scalable optimization.

Before my Ph.D., I received an M.S. in Computer Science from the Illinois Institute of Technology (IIT) and a B.S. in Computer Science and Engineering from Nanjing University of Posts and Telecommunications (NJUPT).

Email / CV / Google Scholar / LinkedIn / GitHub

News

2026-02

Our paper, "Consistent Instance Field for Dynamic Scene Understanding" accepted to CVPR 2026!

2025-12

First-author paper, "Distill Video Datasets into Images" is now available on arXiv.

2025-09

First-author paper, "Efficient Multimodal Dataset Distillation via Generative Models" accepted to NeurIPS 2025!

2025-06

Serving as a Local Chair at ICMR 2025!

2025-06

Invited to give a Lightning Talk at MMLS 2025!

2025-05

Began my internship at Argonne National Laboratory!

2025-02

First-author paper, "Distilling Long-tailed Datasets" accepted to CVPR 2025!

2024-07

First-author paper, "Dataset Quantization with Active Learning based Adaptive Sampling" accepted to ECCV 2024!

2024-06

First-author paper, "Audio-Visual Navigation with Anti-Backtracking" accepted to ICPR 2024!

2024-04

First-author paper, "Monocular Expressive 3D Human Reconstruction of Multiple People" accepted to ICMR 2024 as an oral presentation!

2024-02

First-author paper, "Gated Multi-Scale Attention Transformer For Few-Shot Medical Image Segmentation" accepted to ISBI 2024!

2023-12

First-author paper, "Supplementing Missing Visions via Dialog for Scene Graph Generations" accepted to ICASSP 2024!

2023-12

First-author paper, "Machine learning enabled multiplex detection of periodontal pathogens by surface-enhanced Raman spectroscopy" accepted to the International Journal of Biological Macromolecules!

Work Experience

Amazon · Applied Scientist Intern

May 2026 - Aug. 2026, Santa Cruz, California, United States

Argonne National Laboratory · Research Intern

Research on LLM training on HPC platforms.

May 2025 - Aug. 2025, Lemont, Illinois, United States

Argonne National Laboratory · Research Intern

Research on distributed training frameworks.

May 2023 - Aug. 2023, Lemont, Illinois, United States

Publications

arXiv 2025

Distill Video Datasets into Images

Zhenghao Zhao, Haoxuan Wang, Kai Wang, Yuzhang Shang, Yuan Hong, Yan Yan

[PDF]

NeurIPS 2025

Efficient Multimodal Dataset Distillation via Generative Models

Zhenghao Zhao, Haoxuan Wang, Junyi Wu, Yuzhang Shang, Gaowen Liu, Yan Yan

Neural Information Processing Systems (NeurIPS), 2025

[PDF] [Code]

CVPR 2025

Distilling Long-tailed Datasets

Zhenghao Zhao*, Haoxuan Wang*, Yuzhang Shang, Kai Wang, Yan Yan

Computer Vision and Pattern Recognition (CVPR), 2025

[PDF] [Code]

ECCV 2024

Dataset Quantization with Active Learning based Adaptive Sampling

Zhenghao Zhao, Yuzhang Shang, Junyi Wu, Yan Yan

European Conference on Computer Vision (ECCV), 2024

[PDF]

ICMR 2024

Monocular Expressive 3D Human Reconstruction of Multiple People

Zhenghao Zhao, Hao Tang, Joy Wan, Yan Yan

International Conference on Multimedia Retrieval (ICMR), 2024

[PDF]

ICPR 2024

Audio-Visual Navigation with Anti-Backtracking

Zhenghao Zhao, Hao Tang, Yan Yan

International Conference on Pattern Recognition (ICPR), 2024

[PDF]

ICASSP 2024

Supplementing Missing Visions via Dialog for Scene Graph Generations

Zhenghao Zhao*, Ye Zhu*, Xiaoguang Zhu, Yuzhang Shang, Yan Yan

International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2024

[PDF]

ISBI 2024

Gated Multi-Scale Attention Transformer For Few-Shot Medical Image Segmentation

Zhenghao Zhao*, Hao Ding*, Dawen Cai, Yan Yan

IEEE International Symposium on Biomedical Imaging (ISBI), 2024

[PDF]

IJBM 2023

Machine learning enabled multiplex detection of periodontal pathogens by surface-enhanced Raman spectroscopy

Rathnayake AC Rathnayake*, Zhenghao Zhao*, Nathan McLaughlin, Wei Li, Yan Yan, Liaohai L Chen, Qian Xie, Christine D Wu, Mathew T Mathew, Rong R Wang

International Journal of Biological Macromolecules, 2024

[PDF]