I am a senior research scientist at Google, in the Semantic Perception team of Federico Tombari.
I received my PhD in 2024 from ETH Zurich, in
the Computer Vision Lab, supervised by
Prof. Luc Van Gool
and Dr. Martin Danelljan.
My PhD was focused on Computer Vision and its applications, especially in the tasks of image matching, 3D reconstruction, pose estimation and novel-view rendering.
Prior to that, I obtained a Master’s degree in Mechanical Engineering with honors at ETH Zurich. I also conducted an internship at RetinAI focused on computer vision applied to medical images.
08/2020 We just released a new pre-print GOCor. Code will be released soon here!
My research
VIST3A: Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
Hyojun Go,
Dominik Narnhofer,
Goutam Bhat,
Prune Truong,
Federico Tombari,
Konrad Schindler
arXiv 2025 citation |
website |
paper |
code
Meet VIST3A — Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator,
a new framework that connects the best of two worlds: 🎥 Video diffusion models for rich latent visual
generation, and 🌍 Feed-forward 3D models (like VGGT, AnySplat, or MVDUSt3R) for geometric reconstruction.
AnyUp: Universal Feature Upsampling
Thomas Wimmer,
Prune Truong,
Marie-Julie Rakotosaona,
Michael Oechsle,
Federico Tombari,
Bernt Schiele,
Jan Eric Lenssen
arXiv 2025 citation |
website |
paper |
code
We introduce AnyUp, a method for feature upsampling that can be applied to any vision feature at
any resolution, without encoder-specific training.
M2SVid: End-to-End Inpainting and Refinement for Monocular-to-Stereo Video Conversion
Nina Shvetsova,
Goutam Bhat,
Prune Truong,
Hilde Kuehne,
Federico Tombari
3DV 2026 citation |
paper
We tackle the problem of monocular-to-stereo video conversion and propose a novel architecture for
inpainting and refinement of the warped right view obtained by depth-based reprojection of the input left view.
One2Any: One-Reference 6D Pose Estimation for Any Object
Mengya Liu,
Siyuan Li,
Ajad Chhatkuli,
Prune Truong,
Luc Van Gool,
Federico Tombari
CVPR 2025 citation |
paper |
code
We propose a novel method, One2Any, that estimates the relative 6-degrees-of-freedom (DoF) object pose using
only a single reference and a single query RGB-D image, without prior knowledge of the object's 3D model, multi-view data,
or category constraints.
SPARF: Neural Radiance Fields from Sparse and Noisy Poses
Prune Truong,
Marie-Julie Rakotosaona,
Fabian Manhardt,
Federico Tombari
CVPR 2023 - Highlight (top 2.5%) citation |
paper |
project page |
video (8 min) |
teaser video |
poster |
code
We propose SPARF, a joint pose-NeRF refinement approach, applicable to extreme scenarios with only 2 or 3
input views and noisy camera poses. SPARF is the only method producing realistic novel-view synthesis
from as few as 2 input images with noisy poses.
Refign: Align and Refine for Adaptation of Semantic Segmentation to Adverse Conditions
David Bruggemann,
Christos Sakaridis,
Prune Truong,
Luc Van Gool
WACV 2023 citation |
paper |
code
We propose Refign, a generic extension to self-training-based domain adaptation methods for semantic segmentation.
PDC-Net+: Enhanced Probabilistic Dense Correspondence Network
Prune Truong,
Martin Danelljan,
Radu Timofte,
Luc Van Gool
TPAMI 2023 citation |
paper |
project page |
code
We propose an approach for estimating dense correspondences between two views along with a confidence map. We extend
PDC-Net with new applications to image-based localization, 3D reconstruction and texture transfer.
Probabilistic Warp Consistency for Weakly-Supervised Semantic Correspondences
Prune Truong,
Martin Danelljan,
Fisher Yu,
Luc Van Gool
CVPR 2022 citation |
paper |
teaser video |
poster |
code
We propose a weakly-supervised training strategy for learning semantic correspondences. We introduce a
weakly-supervised training objective applied to probabilistic mapping as well as an approach to model
and identify occluded and unmatchable regions.
Warp Consistency for Unsupervised Learning of Dense Correspondences
Prune Truong,
Martin Danelljan,
Fisher Yu,
Luc Van Gool
ICCV 2021 - Oral (top 3.0%) citation |
paper |
teaser video |
poster |
code
We propose an unsupervised training objective for learning to regress the dense correspondences relating a pair
of images. Our loss leverages real image pairs without invoking the photometric consistency assumption.
Unlike previous approaches, it is capable of handling large appearance and viewpoint changes.
Learning Accurate Dense Correspondences and When to Trust Them
Prune Truong,
Martin Danelljan,
Luc Van Gool,
Radu Timofte
CVPR 2021 - Oral (top 4.0%) citation |
paper |
project page |
teaser video |
poster |
slides |
code
We develop a flexible probabilistic approach that jointly learns the dense correspondence prediction and
its uncertainty. We parametrize the predictive distribution as a constrained mixture model and
develop an architecture and training strategy tailored for robust and generalizable uncertainty
prediction in the context of self-supervised training.
GOCor: Bringing Globally Optimized Correspondence Volumes into Your Neural Network
Prune Truong,
Martin Danelljan,
Luc Van Gool,
Radu Timofte
NeurIPS 2020 citation |
paper |
teaser video |
CV Talks video |
code
We propose GOCor, a fully differentiable dense matching module, acting as a direct replacement for the
feature correlation layer. The correspondence volume generated by our module is the result of an
internal optimization procedure that explicitly accounts for, and suppresses, similar regions in the scene.
GLU-Net: Global-Local Universal Network for dense flow and correspondences
Prune Truong,
Martin Danelljan,
Luc Van Gool,
Radu Timofte
CVPR 2020 - Oral (top 5.7%) citation |
paper |
teaser video |
slides |
poster |
oral video |
code
We propose GLU-Net, a unified architecture to estimate dense correspondences between any image pair, i.e. different
views of the same scene, consecutive frames of a video or even different instances of the same object category.
GLAMpoints: Greedily Learned Accurate Match points
Prune Truong,
Stefanos Apostolopoulos,
Agata Mosinska,
Samuel Stucky,
Carlos Ciller,
Sandro De Zanet
ICCV 2019 citation |
paper |
poster |
code
We propose a training strategy for keypoint detection, applicable to low-quality and textureless images,
frequent in the medical domain. We learn keypoint detection by training directly for the final matching accuracy
instead of indirect metrics such as repeatability.
Invited Talks
2023: Dense Matching and Its Applications, Invited talk for the Swedish WASP program, Zurich.
2022: Dense Matching, Invited talk at Google, Semantic Perception, Zurich.
2022: Dense Matching, Invited talk at Microsoft, Mixed Reality and AI Lab, Zurich.