Dheeraj Varghese

Building a path toward helpful intelligence.

I’m a PhD candidate at the VIS Lab, supervised by Cees Snoek. I work on generalist multimodal foundation models as part of the Horizon Europe ELLIOT project. My research focuses on designing unified architectures that combine modalities into a shared space, aiming for models that generalize well, adapt efficiently, and assist meaningfully across a wide range of tasks.

Previously, I worked on combining discrete diffusion and autoregression for multilingual image generation with Mohammad M. Derakhshani, and explored curriculum learning in vision-language models under the supervision of Yuki Asano.

At my core, I’m an applied engineer, enthusiastic about recreating intelligence that serves as a tool and makes tasks easier for the human user. Sample efficiency in learning, blurring the context window, and unified representation spaces all capture my attention at the moment.

news

Mar 28, 2026 Attended the ELLIS Winter School on Foundation Models.
Feb 28, 2026 Served as a reviewer for the ICLR 2026 Multimodal Intelligence Workshop.
Dec 07, 2025 Presented two posters at NeurIPS Europe 🇩🇰
Nov 19, 2025 Presented NeoBabel on Day 2 of the Cohere Connect Conference.
Aug 19, 2025 Served as a reviewer for the ICCV 2025 LIMIT Workshop.

selected publications

  1. Vision-and-Language Training Helps Deploy Taxonomic Knowledge but Does Not Fundamentally Alter It
    Yulu Qin*, Dheeraj Varghese*, Adam Dahlgren Lindström, and 3 more authors
    NeurIPS, 2025
  2. NeoBabel: An Inclusive Multilingual Open Tower for Visual Generation
    Mohammad Mahdi Derakhshani*, Dheeraj Varghese*, Marzieh Fadaee, and 1 more author
    In EurIPS 2025 Workshop on Principles of Generative Modeling (PriGM), 2025