Nima Shoghi

ML PhD Student

Atlanta, GA

I'm a PhD student in Machine Learning at Georgia Tech, where I focus on foundation models for atomic-scale simulation and scientific discovery under the guidance of Dr. Pan Li and Dr. Victor Fung. Prior to starting my PhD, I completed a two-year AI residency on Meta AI's FAIR Chemistry team, where I developed JMP, a foundation model for atomic property prediction pre-trained on 120M+ structures that achieved state-of-the-art results on 34 of 40 downstream tasks (ICLR 2024). I earned my B.S. and M.S. in Computer Science from Georgia Tech, where I researched ML training acceleration in the HPArch Lab. My current work spans generative models for molecular dynamics (SE(3)-equivariant diffusion for protein trajectory modeling), robust fine-tuning of molecular foundation models (MatterTune, Digital Discovery 2025; a robust fine-tuning benchmark, NeurIPS 2025), and large-scale pre-training for chemistry. I am particularly excited about the potential of deep learning to accelerate discovery in chemistry, materials science, and drug development.

Recent Updates

Sep 2025

Our paper on robust fine-tuning for molecular graph foundation models was accepted to NeurIPS 2025!

May 2025

Started a Research Scientist Internship at ByteDance Research (ByteDance Seed), working on generative models for protein dynamics.

Apr 2025

Our paper on MatterTune, a platform for fine-tuning atomistic foundation models, was accepted to Digital Discovery!

Sep 2024

Gave invited talks on pre-training for chemistry, including a keynote at the 2024 Machine Learning for Materials and Molecular Discoveries Symposium in Gothenburg and talks at the AI for Science Institute (AISI) in Beijing, KAUST, and SES AI.

Aug 2024

Started my PhD in Machine Learning at Georgia Tech with Dr. Pan Li and Dr. Victor Fung.

Jun 2024

Started a machine learning internship at ProcessMiner.

May 2024

Wrote a blog post on From Molecules to Materials for Valence Labs.

Jan 2024

Our paper on large-scale diverse pre-training for chemical property prediction (JMP) was accepted to ICLR 2024!

Dec 2023

Joined the High Performance Computer Architecture (HPArch) Lab at Georgia Tech as a temporary research staff member.

Aug 2023

Gave a talk at the ACS Fall Meeting.

Featured Publications

Selected research publications

From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction

International Conference on Learning Representations (ICLR), 2024 · Featured

Nima Shoghi, Adeesh Kolluru, John Kitchin, Zachary Ulissi, C. Lawrence Zitnick, and Brandon Wood

Introduces Joint Multi-domain Pre-training (JMP), a supervised pre-training strategy that leverages diverse data to advance atomic property prediction across chemical domains, achieving state-of-the-art performance on 34 out of 40 downstream tasks.

The Open Catalyst 2022 (OC22) Dataset and Challenges for Oxide Electrocatalysts

ACS Catalysis, 2023

Richard Tran, Janice Lan, Muhammed Shuaibi, Brandon M. Wood, Siddharth Goyal, Abhishek Das, Javier Heras-Domingo, Adeesh Kolluru, Ammar Rizvi, Nima Shoghi, Anuroop Sriram, Félix Therrien, Jehad Abed, Oleksandr Voznyy, Edward H. Sargent, Zachary Ulissi, and C. Lawrence Zitnick

Introduces the Open Catalyst 2022 (OC22) dataset, consisting of 62,331 DFT relaxations, to accelerate machine learning for oxide electrocatalysts and establish benchmarks for the field.

MatterTune: An Integrated, User-Friendly Platform for Fine-Tuning Atomistic Foundation Models

Digital Discovery, 2025

Lingyu Kong, Nima Shoghi, Guoxiang Hu, Pan Li, and Victor Fung

Introduces MatterTune, a modular platform that enables fine-tuning of pre-trained atomistic foundation models for materials science applications.

Experience

Selected professional and research experience

ByteDance Seed

Research Scientist Intern, AI for Science

May 2025 — Present
  • Developed generative molecular dynamics models that achieve state-of-the-art results on the ATLAS protein dynamics benchmark (~33% higher conformational coverage and ~60% higher structural quality than all prior methods).
  • Designed an SE(3)-equivariant diffusion architecture with causal transformers and joint spatiotemporal attention, enabling physically plausible trajectory generation over microsecond timescales.
  • Built an end-to-end pipeline for generative atomic dynamics: distributed training, physics-based relaxation, and quality/diversity evaluation.

Meta AI

AI Resident, FAIR Chemistry

  • Led the development of Joint Multi-domain Pre-training (JMP), a foundation model for atomic property prediction pre-trained on 120M+ structures from diverse chemical domains (small molecules, large molecules, and materials). Achieved state-of-the-art results on 34 out of 40 downstream tasks with a single pre-trained model. (ICLR 2024*)
  • Co-developed transfer learning methods using Graph Neural Networks to generalize models across molecular and catalyst domains, reducing the need for large domain-specific datasets. (J Chem Phys 2022)
  • Contributed to the Open Catalyst 2022 paper by running baseline model benchmarks for oxide electrocatalysts. (ACS Catal. 2023)

Graph Computation and Machine Learning Lab @ GT

Aug 2024 — Present
  • Researching robust fine-tuning strategies for large-scale pre-trained GNN models.
  • Co-developed MatterTune, an open-source platform for fine-tuning atomistic foundation models (UMA, JMP, EquiformerV2, MACE, etc.) with parameter-efficient methods; achieved near-SOTA on the MatBench Discovery benchmark. (Digital Discovery 2025)
  • Contributed to a benchmark study of 8 robust fine-tuning methods across 6 molecular graph foundation models on 12 downstream tasks, informing the design of improved fine-tuning strategies for molecular property prediction. (NeurIPS 2025)
  • Prior work at GT HPArch Lab: Developed efficient inference strategies for diffusion models (latent-space sampling, quantization) and memory-efficient training (SmaQ: 6.7x memory reduction). (IEEE CAL 2021*, MemSys 2020*)

Education

Georgia Institute of Technology

PhD in Machine Learning, 4.0 GPA

2024 — 2028 (expected)

Georgia Institute of Technology

M.S. in Computer Science (ML Focus), 4.0 GPA

2020 — 2021

Georgia Institute of Technology

B.S. in Computer Science (ML Focus)

2015 — 2019