I am a Senior Scientist in Machine Learning in the Data AI and Genomics Group at Merck Research Laboratories.
My research focuses on pretraining large foundation models, optimizing training through efficient kernels, and evaluating model performance. My broader research interests include foundation models, deep learning, machine learning systems, training optimization, and explainable AI.
Better Models, Faster Training: Sigmoid Attention for single-cell Foundation Models
Vijay Sadashivaiah,
Georgios Dasoulas,
Judith Mueller,
Soumya Ghosh
arXiv preprint, 2026
paper /
code
We demonstrate that replacing softmax with sigmoid attention in biological foundation models yields a 25% improvement in cell-type separation, up to 10% faster training, and improved training stability. We also develop TritonSigmoid, an efficient GPU kernel that achieves 515 TFLOPS on H100s.
TEDDY: A Family of Foundation Models for Understanding Single Cell Biology
Alexis Chevalier,
Soumya Ghosh,
Urvi Awasthi,
James Watkins,
Julia Bieniewska,
Nichita Mitrea,
Olga Kotova,
Kirill Shkura,
Andrew Noble,
Michael Steinbaugh,
Vijay Sadashivaiah,
George Dasoulas,
Julien Delile,
Christoph Meier,
Leonid Zhukov,
Iya Khalil,
Srayanta Mukherjee,
Judith Mueller
GenBio Workshop at ICML, 2025
paper
We present TEDDY, a family of transformer-based foundation models for single-cell biology, scaled to 116 million cells. The models (70M, 160M, and 400M parameters) can identify disease states and distinguish diseased from healthy cells, with performance improving predictably with data volume and parameter count.
We propose to explain chest X-ray pathology models using textual concepts. This is achieved by leveraging the joint latent space of image and text in vision-language models.
We evaluated the influence of bias on multimodal text generation models. In particular, we studied the impact of visual augmentation using state-of-the-art diffusion models when generating text.
We propose to suppress user-determined, semantically meaningful concepts (e.g., eyeglasses, smiling) from intermediate representations in computer vision tasks.
We introduce representation routing based on multi-armed bandits to improve transfer learning in computer vision tasks.
Improving Language Model Predictions via Prompts Enriched with Knowledge Graphs
Ryan Brate,
Minh-Hoang Dang,
Fabian Hoppe,
Yuan He,
Albert Meroño-Peñuela,
Vijay Sadashivaiah
Deep Learning for Knowledge Graphs Workshop at ISWC, 2022
paper
We propose to improve language model predictions by enriching prompts with information from knowledge graphs.
A single-nucleus RNA-sequencing resource of 70,615 high-quality nuclei to generate a molecular taxonomy of cell types across five human brain regions.
KCNH2-3.1 mediates aberrant complement activation and impaired hippocampal-medial prefrontal circuitry associated with working memory deficits
Ming Ren,
Zhonghua Hu,
Qiang Chen,
Andrew E Jaffe,
Yingbo Li,
Vijay Sadashivaiah,
Shujuan Zhu,
Nina Rajpurohit,
Joo Heon Shin,
Wei Xia,
Yankai Jia,
Jingxian Wu,
Sunny Lang Qin,
Xinjian Li,
Jian Zhu,
Qingjun Tian,
Daniel Paredes,
Fengyu Zhang,
Kuan Hong Wang,
Venkata S Mattay,
Joseph H Callicott,
Karen F Berman,
Daniel R Weinberger,
Feng Yang
Molecular Psychiatry, 2019
paper