Research Scientist
I develop computer vision and machine learning algorithms that fuse disparate sensor signals—ranging from overhead imagery and ground-level views to ambient audio—into richer models of our world. By leveraging geo-temporal context as a universal alignment key, my research builds the foundations for multimodal Geospatial AI.
We propose a new task, mixed-view panorama synthesis, in which a satellite image and a set of nearby panoramas are used to render a panorama at a novel location. Our approach uses diffusion-based modeling and attention to enable flexible, multimodal control.
This work addresses the task of modeling spatiotemporal traffic patterns directly from overhead imagery, which we refer to as image-driven traffic modeling. Our approach includes a geo-temporal positional encoding module for integrating geo-temporal context and a probabilistic objective function for estimating traffic speeds that naturally models temporal variations.
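One way to picture a geo-temporal positional encoding is as a sinusoidal embedding of an image's latitude, longitude, and capture time, which can then be fused with visual features. The sketch below is purely illustrative—the normalization, frequency ladder, and function name are assumptions, not the module described above:

```python
import numpy as np

def geotemporal_encoding(lat, lon, hour, n_freqs=2):
    """Illustrative sinusoidal encoding of geo-temporal context.

    Normalizes latitude, longitude, and hour-of-day to [0, 1], then maps
    each to sin/cos features at a small ladder of frequencies, yielding a
    fixed-length vector that can be concatenated with image features.
    """
    feats = np.array([
        (lat + 90.0) / 180.0,    # latitude in [-90, 90] -> [0, 1]
        (lon + 180.0) / 360.0,   # longitude in [-180, 180] -> [0, 1]
        hour / 24.0,             # hour of day -> [0, 1]
    ])
    freqs = 2.0 ** np.arange(n_freqs)                    # e.g. [1, 2]
    angles = 2.0 * np.pi * feats[:, None] * freqs[None]  # (3, n_freqs)
    # Interleave sin and cos features: length = 2 * 3 * n_freqs
    return np.concatenate([np.sin(angles).ravel(), np.cos(angles).ravel()])

vec = geotemporal_encoding(lat=38.0, lon=-84.5, hour=17.0)  # 12-dim vector
```

In practice such an encoding is typically projected through a small MLP before being added to, or concatenated with, the image backbone's features.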
We introduce a novel architecture for near/remote sensing that is based on geospatial attention and demonstrate its use for five segmentation tasks.
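The core operation underlying attention-based near/remote fusion can be sketched as cross-attention, in which each overhead-pixel feature aggregates information from a set of nearby ground-level features. This is a generic scaled dot-product sketch under assumed shapes—the actual geospatial attention additionally conditions on the geometry between the viewpoints:

```python
import numpy as np

def cross_attention(q, k, v):
    """Generic scaled dot-product cross-attention (illustrative sketch).

    q: (n_queries, d) overhead-pixel feature queries
    k, v: (n_context, d) ground-level features serving as keys and values
    Returns (n_queries, d): ground-level information aggregated per pixel.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the context dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Hypothetical sizes: 64 overhead-pixel queries, 8 nearby ground-level features
rng = np.random.default_rng(0)
q = rng.standard_normal((64, 32))
kv = rng.standard_normal((8, 32))
out = cross_attention(q, kv, kv)  # shape (64, 32)
```

Each row of the output is a convex combination of the ground-level features, so overhead pixels with no relevant nearby views simply receive a diffuse average.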
My work spans a variety of areas, including view synthesis, multimodal fusion, pose estimation, and autonomous navigation. I am particularly interested in developing methods for combining visual data from multiple sensors and viewpoints with other modalities to create richer, more accurate models of the environment. I also explore ways to leverage image metadata—such as location and time (the geo-temporal context of an image)—to improve the performance and reliability of computer vision systems.
2025–Present
DZYNE Technologies · 2021–2025
DZYNE Technologies · 2019–2021
University of Kentucky · 2012–2018