Research Scientist
I develop computer vision and machine learning algorithms that fuse disparate sensor signals—ranging from overhead imagery and ground-level views to ambient audio—into richer models of our world. By leveraging geo-temporal context as a universal alignment key, my research builds the foundations for multimodal Geospatial AI.
We propose a new task, mixed-view panorama synthesis, in which a satellite image and a set of nearby panoramas are used to render a panorama at a novel location. Our approach uses diffusion-based modeling and attention to enable flexible, multimodal control.
This work addresses the task of modeling spatiotemporal traffic patterns directly from overhead imagery, which we refer to as image-driven traffic modeling. Our approach includes a geo-temporal positional encoding module for integrating geo-temporal context and a probabilistic objective function for estimating traffic speeds that naturally models temporal variations.
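One way to picture a geo-temporal positional encoding is as a sinusoidal embedding of an image's latitude, longitude, and capture time, which can then be fused with visual features. The sketch below is purely illustrative—the normalization, frequency ladder, and function name are assumptions, not the module described above:

```python
import numpy as np

def geotemporal_encoding(lat, lon, hour, n_freqs=2):
    """Illustrative sinusoidal encoding of geo-temporal context.

    Normalizes latitude, longitude, and hour-of-day to [0, 1], then maps
    each to sin/cos features at a small ladder of frequencies, yielding a
    fixed-length vector that can be concatenated with image features.
    """
    feats = np.array([
        (lat + 90.0) / 180.0,    # latitude in [-90, 90] -> [0, 1]
        (lon + 180.0) / 360.0,   # longitude in [-180, 180] -> [0, 1]
        hour / 24.0,             # hour of day -> [0, 1]
    ])
    freqs = 2.0 ** np.arange(n_freqs)                    # e.g. [1, 2]
    angles = 2.0 * np.pi * feats[:, None] * freqs[None]  # (3, n_freqs)
    # Interleave sin and cos features: length = 2 * 3 * n_freqs
    return np.concatenate([np.sin(angles).ravel(), np.cos(angles).ravel()])

vec = geotemporal_encoding(lat=38.0, lon=-84.5, hour=17.0)  # 12-dim vector
```

In practice such an encoding is typically projected through a small MLP before being added to, or concatenated with, the image backbone's features.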
We introduce a novel architecture for near/remote sensing that is based on geospatial attention and demonstrate its use for five segmentation tasks.
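The core operation underlying attention-based near/remote fusion can be sketched as cross-attention, in which each overhead-pixel feature aggregates information from a set of nearby ground-level features. This is a generic scaled dot-product sketch under assumed shapes—the actual geospatial attention additionally conditions on the geometry between the viewpoints:

```python
import numpy as np

def cross_attention(q, k, v):
    """Generic scaled dot-product cross-attention (illustrative sketch).

    q: (n_queries, d) overhead-pixel feature queries
    k, v: (n_context, d) ground-level features serving as keys and values
    Returns (n_queries, d): ground-level information aggregated per pixel.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the context dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Hypothetical sizes: 64 overhead-pixel queries, 8 nearby ground-level features
rng = np.random.default_rng(0)
q = rng.standard_normal((64, 32))
kv = rng.standard_normal((8, 32))
out = cross_attention(q, kv, kv)  # shape (64, 32)
```

Each row of the output is a convex combination of the ground-level features, so overhead pixels with no relevant nearby views simply receive a diffuse average.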
My work spans a variety of areas, including view synthesis, multimodal fusion, pose estimation, and autonomous navigation. I am particularly interested in developing methods for combining visual data from multiple sensors and viewpoints with other modalities to create richer, more accurate models of the environment. I also explore ways to leverage image metadata—such as location and time (the geo-temporal context of an image)—to improve the performance and reliability of computer vision systems.
2025–Present
DZYNE Technologies · 2021–2025
DZYNE Technologies · 2019–2021
University of Kentucky · 2012–2018