I am very interested in data curation methods to improve multimodal pretraining. During my PhD, I have primarily worked on characterizing and constructing better pretraining datasets for multimodal models.
I previously interned at Google DeepMind and Apple, working on data curation for multimodal models.
I'm currently on the industry job market! Please reach out if you have any open positions in pretraining.
We studied three questions fundamental to speech-language pretraining data curation and sampling; our controlled experiments yielded a performant 3.8B SpeechLM that outperforms 3x-larger SpeechLMs.
Our work shows that the impressive empirical performance of multimodal models like CLIP and Stable Diffusion can be largely attributed to the presence of test concepts within their vast pretraining datasets; their reported empirical performance therefore does not constitute "zero-shot" generalization. Quite the contrary, these models require exponentially more data on a concept to linearly improve their performance on tasks pertaining to that concept, highlighting extreme sample inefficiency.
Our work introduces the concept of lifelong benchmarks, enabling effective comparisons of models and reducing overfitting to the biases of a particular dataset. We constructed large-scale lifelong classification benchmarks totalling over 1.5M samples. To facilitate more efficient evaluation, we introduce the Sort&Search method that reduces inference compute costs by 1000x.
We enhance CLIP's downstream classification performance by (1) curating a support set, either by generating synthetic samples (Stable Diffusion) or by retrieving natural ones (LAION-5B), and (2) identifying and fixing a miscalibration issue with intra-modal distances in CLIP's embedding space.
Teaching
Deep Learning (CSE641)
Worked as a Teaching Assistant for the Deep Learning course offered by Dr. Saket Anand in Spring 2020.
Machine Learning (CSE543)
Worked as a Teaching Assistant for the Machine Learning course offered by Dr. Jainendra Shukla in Fall 2019.
Introduction to Engineering Design (DES130)
Worked as a Teaching Assistant for the Introduction to Engineering Design course offered by Dr. Aman Parnami in Spring 2019.
Linear Algebra (MTH100)
Worked as a Teaching Assistant for the Linear Algebra course offered by Dr. Samaresh Chatterjee in Fall 2018.
Misc
Apart from my academic interests, I am a huge football fan and actively support FC Barcelona, Paris Saint Germain, and Inter Miami CF. As you've probably guessed already, Lionel Messi is my favourite player to ever touch a football. I also love watching Formula 1 and look up to Lewis Hamilton. I used to write stuff, but that was a long, long time ago. I also dabble with the guitar and the keyboard at times. Check out my SoundCloud profile!