I am a Staff Research Scientist at Google DeepMind (formerly Google Brain). I contribute to Gemini evaluation and post-training. Previously, I was an AI Resident at Google Brain. I received my PhD in Computational Biology and Bioinformatics, as well as an MSc in Statistics, from the University of Southern California.
My research interests span three key areas: (1) reliable and trustworthy large language models, (2) uncertainty and robustness of deep learning, and (3) the development of reliable machine learning and statistical modeling methods for real-world applications, with a special interest in biological and medical studies. My long-term goal is to develop trustworthy AI solutions that can be safely deployed in real-world scenarios, helping to advance scientific discoveries and improve the well-being of humanity.
News
04/2025: I will give a talk on "Reliable AI Feedback" at the ICLR 2025 Workshop "Quantify Uncertainty and Hallucination in Foundation Models: The Next Frontier in Reliable AI".
12/2023: Our study on selective generation for large language models, "Self-evaluation improves selective generation in large language models", will be presented as a spotlight at the NeurIPS 2023 I Can't Believe It's Not Better! (ICBINB) Workshop [paper].
10/2023: Two papers studying uncertainty estimation in language generation models are accepted by EMNLP 2023: "Improving the Robustness of Summarization Models by Detecting and Removing Input Noise" [paper] and "On Uncertainty Calibration and Selective Generation in Probabilistic Neural Summarization: A Benchmark Study" [paper].
07/2023: Our work on prompt selection and prompt ensemble for multimodal models, "A simple zero-shot prompt weighting technique to improve prompt ensembling in text-image models", will be presented at ICML 2023. [paper] [twitter] [code].
06/2023: Our paper "Improving Zero-shot Generalization and Robustness of Multi-modal Models" is accepted by CVPR 2023. Congratulations to Yunhao Ge. [paper] [twitter] [code].
05/2023: Our paper "Out-of-Distribution Detection and Selective Generation for Conditional Language Models" is accepted as a spotlight at ICLR 2023. [paper] [twitter].
07/2022: Together with my colleagues at Google, we released Plex, a framework for reliability in large pre-trained models. Please check out our blog post to learn more. [blog].
01/2022: Our new AI blog post is out :) It walks through our recent paper on enabling dermatology AI systems to detect the presence of conditions unseen in training [blog].
12/2021: Our paper "Exploring the Limits of Out-of-Distribution Detection" [paper] will be presented at NeurIPS 2021.
07/2021: Our paper "A Simple Fix to Mahalanobis Distance for Improving Near-OOD Detection" [paper] [poster] will be presented at ICML 2021 UDL workshop.
07/2021: Our paper "Exploring the Limits of Out-of-Distribution Detection" [paper] will be presented at ICML 2021 UDL workshop.
06/2019: I am joining Google Brain as a Research Scientist.
06/2018: I am joining Google Brain AI Residency program.
03/2018: Our work is highlighted in a Nature News article, "Machine learning spots treasure trove of elusive viruses: Artificial intelligence could speed up metagenomic studies that look for species unknown to science."
04/2017: I am honored to receive the prestigious USC PhD Achievement Award (6 winners university-wide per year).
03/2017: I am honored to receive the Women in Science and Engineering Travel Grant.
04/2016: I am honored to receive USC CAMS Graduate Student Prize for Excellence in Research with a Substantial Mathematical Component (2 winners university-wide per year).
Publications
Reliable and Trustworthy Large Language Models
Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities [paper]
SETS: Leveraging Self-Verification and Self-Correction for Improved Test-Time Scaling.
Jiefeng Chen, Jie Ren, Xinyun Chen, Chengrun Yang, Ruoxi Sun, Sercan Ö. Arık.
E Kelly Buchanan, Michael W Dusenberry, Jie Ren, Kevin Patrick Murphy, Balaji Lakshminarayanan, Dustin Tran.
Presented at the NeurIPS Workshop on Distribution Shifts (2022). [paper]
Plex: Towards reliability using pretrained large model extensions.
Dustin Tran, Jeremiah Liu, Michael W Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim GJ Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan.
Talks
2016 Molecular and Computational Biology Retreat, University of Southern California: Predicting Virus-Host Interactions Using Sequence Signatures.
2015 Joint USC/UCLA Bioinformatics Colloquium: Statistical Inference of Markov Models of Genomic Sequences Using NGS Data.
2015 Research in Computational Molecular Biology (RECOMB)-Seq: Inference of Markovian Properties of Molecular Sequences From NGS Data and Applications to Comparative Genomics.
2013 Molecular and Computational Biology Retreat, University of Southern California: Alignment-Free Genome and Metagenome Comparison Based on Next Generation Sequencing Reads.
Education
Postdoctoral Fellow, University of Southern California, Los Angeles, USA. (2017-2018).
Ph.D. in Computational Biology and Bioinformatics, University of Southern California, Los Angeles, USA. (2013-2017).