Shubham Tulsiani
164 posts
Shubham Tulsiani
@shubhtuls
Assistant Professor in the Robotics Institute, Carnegie Mellon University. I want to build perception systems that can understand the physical world.
shubhtuls.github.io
Joined January 2011
179 Following
3,442 Followers
  • Pinned
    Shubham Tulsiani
    @shubhtuls
    Feb 27
    [1/N] Current visual geometry prediction models primarily rely on labeled 3D data. Our CVPR26 paper, Flow3r, allows additionally leveraging unlabeled videos (using flow supervision) for scalable visual geometry learning, enabling accurate multi-view 3D reconstruction in-the-wild.
  • Shubham Tulsiani
    @shubhtuls
    Feb 23, 2024
    [1/6] What representation comes to mind when you think of a ‘camera’? Perhaps an extrinsic + intrinsic matrix? In our ICLR (oral) paper, we instead infer a distributed representation where each pixel is associated with a ray, and show SoTA results for few-view pose estimation.
  • Shubham Tulsiani
    @shubhtuls
    Oct 2, 2025
    [1/N] We present a plug-and-play mechanism to controllably steer inference of any diffusion/flow model towards a sharper or flatter sampling distribution, resulting in improvements across domains, e.g. text-to-image (10% FID reduction) and protein generation (improved designability).
  • Shubham Tulsiani
    @shubhtuls
    Aug 23, 2022
    [1/4] Camera poses are essential for (neural) 3D reconstruction. But what about sparse-view settings where obtaining these via COLMAP isn’t feasible? Our ECCV paper tackles this using an energy-based formulation for predicting relative rotation (jasonyzhang.com/relpose)
  • Shubham Tulsiani
    @shubhtuls
    May 12, 2025
    [1/6] Our #CVPR2025 paper “DiffusionSfM” extends our RayDiffusion framework — inferring both geometry and cameras via diffusing pixelwise ray origins and endpoints.
  • Shubham Tulsiani
    @shubhtuls
    Dec 9, 2024
    [1/5] Recovering 3D from sparse-view input in-the-wild requires solving a chicken-and-egg problem between pose estimation and 3D reconstruction. Our NeurIPS paper, SparseAGS, presents a method to jointly solve these for high-fidelity 3D estimation.
  • Shubham Tulsiani
    @shubhtuls
    Apr 11, 2023
    [1/3] Excited to share our #CVPR23 paper with @zhizdev on 3D object reconstruction from as few as 2 views! Please see our website for results over 50 categories: sparsefusion.github.io
  • Shubham Tulsiani
    @shubhtuls
    Mar 31, 2022
    [1/5] 3D Generation tasks are inherently multimodal — generating a full shape from partial observation, or 3D from text, or even single-view prediction. Our CVPR paper shows that a common prior over the space of shapes allows multi-modal prediction across these different tasks.
  • Shubham Tulsiani
    @shubhtuls
    Dec 7, 2024
    [1/5] Diffusion models can now generate images in a flash. Can we similarly have ultra-fast 3D generation? We present Turbo3D — a generative 3D model for high-quality text-to-3D generation in 0.35s!
  • Shubham Tulsiani
    @shubhtuls
    Jul 20, 2021
    (1/6) Excited to share our paper “PixelTransformer” that was presented at ICML today. It proposes a simple and unified framework for generating dense spatial signals (e.g. images, shapes, polynomials) given just a few samples.
  • Shubham Tulsiani
    @shubhtuls
    Apr 26, 2024
    [1/7] We humans use our hands to interact with myriad objects around us. Our upcoming CVPR paper G-HOP (judyye.github.io/ghop-www/) learns a 3D generative model for such interactions and can synthesize both the hand and the object in 3D given a category label.
  • Shubham Tulsiani
    @shubhtuls
    Apr 8, 2022
    [1/6] An elusive goal in single-view 3D prediction has been to scale beyond a handful of object categories. Our upcoming CVPR paper presents an extremely simple approach towards this, and allows learning a unified reconstruction model over 150 object categories.
  • Shubham Tulsiani
    @shubhtuls
    Apr 19, 2025
    Excited to share this dataset with registered aerial and ground images with dense geometry and correspondence supervision. Please see Khiem’s thread for some cool applications this enables!
    Khiem Vuong
    @kvuongdev
    Apr 18, 2025
    [1/6] Recent models like DUSt3R generalize well across viewpoints, but performance drops on aerial-ground pairs. At #CVPR2025, we propose AerialMegaDepth (aerial-megadepth.github.io), a hybrid dataset combining mesh renderings with real ground images (MegaDepth) to bridge this gap.
  • Shubham Tulsiani
    @shubhtuls
    Dec 12, 2023
    [1/3] Ever wanted to obtain 3D from just a couple of images? We present UpFusion, a system for 3D object reconstruction given a sparse set of unposed input images. Work led by @bharathrajn98, in collaboration with @hyjameslee, @SergeyTulyakov.