07/2025: DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization accepted at ICCV 2025!
02/2025: HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories accepted at CVPR 2025!
02/2025: Distilling Multi-modal Large Language Models for Autonomous Driving accepted at CVPR 2025!
11/2024: Recognized as a Top Reviewer at NeurIPS 2024!
02/2024: Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models accepted at CVPR 2024.
02/2024: ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models accepted at CVPR 2024.
02/2024: Unsupervised Keypoints from Pretrained Diffusion Models accepted at CVPR 2024.
11/2023: Recognized as a Top Reviewer at NeurIPS 2023!
09/2023: Talk on Multimodal Representation Learning with Deep Generative Models at NTU Singapore.
09/2023: Unsupervised Semantic Correspondence Using Stable Diffusion accepted at NeurIPS 2023.
08/2023: Nominated for the GI-Dissertationspreis!
07/2023: Awarded a research grant by the Vector Institute for Artificial Intelligence!
06/2023: Invited to the doctoral consortium at CVPR 2023!
06/2023: Nominated for the Bertha Benz Best Thesis Award in Germany!
02/2023: Make-A-Story: Visual Memory Conditioned Consistent Story Generation accepted at CVPR 2023.
10/2022: I have joined UBC as a postdoctoral researcher.
06/2022: I have successfully defended my Ph.D. dissertation (summa cum laude)!
Hiring
I am hiring! I have open positions for Postdoctoral Researchers (to start in June), PhD students, and Master's students. If you are interested in working on computer vision and machine learning, please reach out by email with your CV and a brief description of your research interests.
Research
I am interested in computer vision and machine learning, specifically in deep generative models (diffusion models, normalizing flows, variational methods, GANs) for multimodal representation learning.
Prompting Hard or Hardly Prompting: Prompt Inversion for Text-to-Image Diffusion Models
Shweta Mahajan, Tanzila Rahman, Kwang Moo Yi, Leonid Sigal
CVPR, 2024
paper / arxiv
We invert the diffusion model to obtain interpretable language prompts directly, building on the finding that different timesteps of the diffusion process cater to different levels of detail in an image.
Make-A-Story: Visual Memory Conditioned Consistent Story Generation
Tanzila Rahman, Hsin-Ying Lee, Jian Ren, Sergey Tulyakov, Shweta Mahajan and Leonid Sigal
CVPR, 2023
paper / arxiv
Sentence-conditioned soft attention over the memories enables effective reference resolution and learns to maintain scene and actor consistency when needed.
A block-autoregressive exact inference model employing a lossless pyramid decomposition with scale-specific representations to encode the joint distribution of image pixels.
We improve the representational power of flow-based models by introducing channel-wise dependencies in their latent space through multi-scale autoregressive priors.
Shweta Mahajan, Iryna Gurevych and Stefan Roth
ICLR, 2020 (Best Paper Award, Fraunhofer IGD)
paper / arxiv / video / code
Our model integrates normalizing flow-based priors for the domain-specific information, which allows us to learn diverse many-to-many mappings between the image and text domains.