About:   I'm a Research Scientist working with the Nextcam team at Adobe on computational photography applications. I received my Ph.D. from Princeton University, where I was part of the Princeton Computational Imaging Lab advised by Professor Felix Heide, and was supported by the NSF Graduate Research Fellowship. I earned my bachelor's degree in electrical engineering and computer science from UC Berkeley.
I'm interested in computational photography and inverse problems that look at the whole imaging pipeline, from signal collection to scene reconstruction. Whether it's MRIs, modulated light sources, or mobile phones, I love working with real devices and real data.
Over the course of my research I've written a number of open-source data collection apps:
Pani (Android, camera2): An all-in-one camera app for continuous recording of Bayer RAWs, accelerometer values, gyroscope measurements, and a metric ton of device metadata from multiple camera configurations (main, ultrawide, telephoto). I'm actively using this app in my current work, and plan to continue expanding its features over time.
SoaP-App (iOS, AVFoundation): A "long-burst" capture app for recording up to 42-frame sequences of Bayer RAWs, depth maps, accelerometer values, gyroscope measurements, and metadata.
HNDR-App (iOS, ARKit): A "long-burst" capture app for recording up to 120-frame sequences of processed RGB images, depth maps, and pose estimates (from ARKit world tracking).
"If you try and take a cat apart to see how it works, the first thing you have on your hands is a non-working cat."
- Douglas Adams
Neural Atlas Graphs are a hybrid 2.5D representation for high-resolution, editable dynamic scenes. They model a scene as a graph of moving planes in 3D, each equipped with a view-dependent neural atlas. This structure supports both 2D appearance editing and 3D re-ordering of scene elements, enabling the rendering of counterfactual scenarios with new backgrounds and modified object appearance.
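As a rough PyTorch sketch of the underlying idea (not the paper's implementation; layer sizes and names are illustrative): each scene element is a plane whose appearance comes from a small view-conditioned MLP, and planes are composited back-to-front.

```python
import torch
import torch.nn as nn

class ViewDependentAtlas(nn.Module):
    """Toy view-conditioned appearance model for one plane in the graph."""
    def __init__(self, hidden=64):
        super().__init__()
        # input: 2D atlas coordinate (u, v) + 3D viewing direction
        self.mlp = nn.Sequential(
            nn.Linear(5, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),                # RGB + alpha
        )

    def forward(self, uv, view_dir):
        rgba = self.mlp(torch.cat([uv, view_dir], dim=-1))
        return torch.sigmoid(rgba[..., :3]), torch.sigmoid(rgba[..., 3:])

def composite(planes, uv, view_dir):
    """Back-to-front 'over' compositing of per-plane atlas outputs."""
    out = torch.zeros(uv.shape[:-1] + (3,))
    for atlas in planes:                         # planes ordered far -> near
        rgb, alpha = atlas(uv, view_dir)
        out = (1 - alpha) * out + alpha * rgb
    return out

planes = [ViewDependentAtlas() for _ in range(3)]
uv = torch.rand(1024, 2)                         # sampled atlas coordinates
view_dir = torch.nn.functional.normalize(torch.randn(1024, 3), dim=-1)
img = composite(planes, uv, view_dir)            # (1024, 3) composited colors
```

In this toy form, appearance editing amounts to changing an atlas and re-ordering amounts to permuting the list of planes.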
We design a spherical neural light field model for implicit panoramic image stitching and re-rendering, capable of handling depth parallax, view-dependent lighting, and scene motion. Our compact model decomposes the scene into view-dependent ray offset and color components, and with no volume sampling achieves real-time 1080p rendering.
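A minimal sketch of that decomposition, assuming a pair of small PyTorch MLPs for the ray-offset and color components (sizes and inputs are placeholders, not the actual model):

```python
import torch
import torch.nn as nn

class SphericalLightField(nn.Module):
    def __init__(self, view_dim=3, hidden=128):
        super().__init__()
        self.offset = nn.Sequential(
            nn.Linear(3 + view_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),                # view-dependent ray offset
        )
        self.color = nn.Sequential(
            nn.Linear(3 + view_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),                # RGB
        )

    def forward(self, ray_dir, view_code):
        x = torch.cat([ray_dir, view_code], dim=-1)
        bent = ray_dir + self.offset(x)          # parallax-corrected ray
        bent = bent / bent.norm(dim=-1, keepdim=True)
        return torch.sigmoid(self.color(torch.cat([bent, view_code], dim=-1)))

model = SphericalLightField()
rays = torch.randn(4096, 3)
rays = rays / rays.norm(dim=-1, keepdim=True)    # unit directions on the sphere
view = torch.zeros(4096, 3)                      # e.g. offset from the panorama center
rgb = model(rays, view)                          # one forward pass per pixel
```

Because each pixel needs a single network query rather than dozens of volume samples, this style of model is what makes real-time rendering plausible.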
Split-aperture 2-in-1 computational cameras encode half the aperture with a diffractive optical
element to simultaneously capture optically coded and conventional images in a single device. Using
a dual-pixel sensor, our camera separates the wavefronts, retaining high-frequency content and
enabling single-shot high-dynamic-range, hyperspectral, and depth imaging.
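A toy image-formation sketch of the capture, with a random stand-in for the DOE's coded point spread function: the dual-pixel sensor effectively records the scene once through the plain aperture half and once blurred by the coded PSF.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
scene = rng.random((256, 256))                   # stand-in radiance image

coded_psf = rng.random((15, 15))
coded_psf /= coded_psf.sum()                     # energy-preserving coded PSF

conventional = scene                                       # plain aperture half
coded = fftconvolve(scene, coded_psf, mode="same")         # DOE-coded half

dual_pixel = np.stack([conventional, coded], axis=0)       # (2, H, W) single-shot capture
print(dual_pixel.shape)
```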
We propose neural spline fields, coordinate networks trained to map input 2D points to vectors of
spline control points, as a versatile representation of pixel motion during burst photography. This
flow model can fuse images during test-time optimization using just photometric loss, without
regularization. Layering these representations, we can separate effects such as occlusions,
reflections, shadows and more.
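A minimal sketch of the representation, with a cubic Bezier standing in for the spline and an illustrative control-point count (not the paper's exact parameterization): a coordinate MLP maps a pixel location to per-axis control points, and the flow at time t is just the spline evaluated at t.

```python
import torch
import torch.nn as nn

N_CTRL = 4                                       # control points per axis

class NeuralSplineField(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * N_CTRL),       # (x, y) control points
        )

    def forward(self, xy, t):
        ctrl = self.mlp(xy).reshape(-1, 2, N_CTRL)
        t = t.reshape(-1, 1)
        basis = torch.cat([(1 - t) ** 3,         # cubic Bezier basis at time t
                           3 * (1 - t) ** 2 * t,
                           3 * (1 - t) * t ** 2,
                           t ** 3], dim=-1)
        return (ctrl * basis[:, None, :]).sum(-1)    # per-pixel flow at time t

field = NeuralSplineField()
xy = torch.rand(4096, 2)                         # pixel coordinates in [0, 1]^2
t = torch.full((4096,), 0.5)                     # normalized burst timestamp
flow = field(xy, t)                              # (4096, 2) displacement vectors
```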
In a “long-burst”, forty-two 12-megapixel RAW frames captured in a two-second sequence, there is
enough parallax information from natural hand tremor alone to recover high-quality scene depth. We
fit a neural RGB-D model directly to this long-burst data to recover depth and camera motion with no
LiDAR, no external pose estimates, and no disjoint preprocessing steps.
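As a heavily simplified sketch of that test-time optimization (toy planar camera model, grayscale frames, and 2D translations in place of full poses; nothing here is the actual pipeline), depth and motion are fit jointly by minimizing photometric error against the reference frame:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

T = 42                                           # frames in the long-burst
frames = torch.rand(T, 1, 1, 64, 64)             # toy grayscale burst

depth_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
poses = nn.Parameter(torch.zeros(T, 2))          # per-frame translation (placeholder pose)

ys, xs = torch.meshgrid(torch.linspace(-1, 1, 64),
                        torch.linspace(-1, 1, 64), indexing="ij")
grid = torch.stack([xs, ys], dim=-1)             # (64, 64, 2) reference coordinates

opt = torch.optim.Adam(list(depth_net.parameters()) + [poses], lr=1e-3)
for step in range(200):
    depth = F.softplus(depth_net(grid.reshape(-1, 2))).reshape(64, 64, 1)
    loss = 0.0
    for t in range(T):
        # parallax shift ~ translation / depth (toy planar model)
        shifted = grid + poses[t] / (depth + 1e-3)
        warped = F.grid_sample(frames[t], shifted[None], align_corners=True)
        loss = loss + F.l1_loss(warped, frames[0])
    opt.zero_grad()
    loss.backward()
    opt.step()
```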
Signed distance fields (SDFs) can be a compact and convenient way of representing 3D objects, but
state-of-the-art learned methods for SDF estimation struggle to fit more than a few shapes at a
time. This work presents a two-stage semi-supervised meta-learning approach that learns generic
shape priors to reconstruct over a hundred unseen object classes.
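To illustrate the learned-prior idea only (a generic first-order meta-learning sketch, not the paper's two-stage semi-supervised scheme): the meta-weights are nudged toward parameters that adapt to a new shape in a few gradient steps.

```python
import copy
import torch
import torch.nn as nn

def make_sdf_net():
    return nn.Sequential(nn.Linear(3, 128), nn.ReLU(),
                         nn.Linear(128, 128), nn.ReLU(),
                         nn.Linear(128, 1))

meta_net = make_sdf_net()
meta_lr, inner_lr, inner_steps = 0.1, 1e-3, 5

def sample_shape_batch():
    """Placeholder for sampling (points, signed distances) from one training shape."""
    pts = torch.rand(512, 3) * 2 - 1
    sdf = pts.norm(dim=-1, keepdim=True) - 0.5   # toy sphere of radius 0.5
    return pts, sdf

for meta_step in range(100):
    net = copy.deepcopy(meta_net)
    opt = torch.optim.SGD(net.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                 # adapt to one shape
        pts, sdf = sample_shape_batch()
        loss = (net(pts) - sdf).abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # move the meta-weights toward the adapted weights (Reptile-style update)
    with torch.no_grad():
        for p_meta, p in zip(meta_net.parameters(), net.parameters()):
            p_meta += meta_lr * (p - p_meta)
```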
Modern AMCW time-of-flight (ToF) cameras are limited to modulation frequencies of several hundred
MHz by silicon absorption limits. In this work we leverage electro-optic modulators to build the
first free-space GHz ToF imager. To resolve the resulting high-frequency phase ambiguities, we also
introduce a segmentation-inspired neural phase unwrapping network.
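The ambiguity comes straight from the AMCW depth equation d = c·φ / (4πf): the unambiguous range c / (2f) shrinks from meters at tens of MHz to a couple of centimeters at several GHz, so measured phases wrap many times across a scene. A quick numeric check (frequencies and depth are illustrative):

```python
import numpy as np

c = 3e8                                          # speed of light, m/s
for f in [20e6, 200e6, 7e9]:                     # MHz-scale vs GHz-scale modulation
    print(f"{f/1e9:5.2f} GHz -> unambiguous range {c / (2 * f):7.4f} m")

depth = 1.37                                     # meters
f = 7e9
phase = (4 * np.pi * f * depth / c) % (2 * np.pi)        # wrapped measurement
wraps = int(4 * np.pi * f * depth / c // (2 * np.pi))
print(f"true depth {depth} m maps to wrapped phase {phase:.3f} rad "
      f"after {wraps} full wraps")
```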
Modern smartphones can stream multi-megapixel RGB images, high-quality 3D pose information, and
low-resolution depth estimates at 60Hz. In tandem, the natural shake of a phone photographer's hand
provides us with dense micro-baseline parallax depth cues during viewfinding. This work explores how
we can combine these data streams to get a high-fidelity depth map from a single snapshot.
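A back-of-the-envelope check of that parallax cue, using an illustrative focal length and a few millimeters of baseline (disparity ≈ f_pixels · baseline / depth):

```python
f_pixels = 3000.0                                # focal length in pixels (illustrative)
baseline = 0.003                                 # ~3 mm of natural hand motion
for depth in [0.3, 1.0, 3.0, 10.0]:              # meters
    disparity = f_pixels * baseline / depth
    print(f"depth {depth:5.1f} m -> disparity {disparity:5.2f} px")
```

Nearby scene points move by whole pixels during viewfinding, which is exactly the micro-baseline signal this work exploits.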
Flying pixels are pervasive depth artifacts in time-of-flight imaging, formed by light paths from
both an object and its background connecting to the same sensor pixel. Mask-ToF jointly learns a
microlens-level occlusion mask and refinement network to respectively encode and decode geometric
information in device measurements, helping reduce these artifacts while remaining light efficient.
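A toy model of how a flying pixel forms, treating the pixel's measurement as a mixture of two phasors from the foreground and background surfaces (frequency and mixing ratios are illustrative):

```python
import numpy as np

c, f = 3e8, 20e6                                 # AMCW modulation at 20 MHz

def phasor(depth):
    return np.exp(1j * 4 * np.pi * f * depth / c)

d_fg, d_bg = 1.0, 4.0                            # foreground / background depths (m)
for alpha in [1.0, 0.7, 0.5, 0.3, 0.0]:          # fraction of light from the foreground
    mixed = alpha * phasor(d_fg) + (1 - alpha) * phasor(d_bg)
    d_est = (np.angle(mixed) % (2 * np.pi)) * c / (4 * np.pi * f)
    print(f"foreground fraction {alpha:.1f} -> decoded depth {d_est:5.2f} m")
```

The decoded depth lands between the two true surfaces, on neither of them, which is the artifact Mask-ToF aims to suppress.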
Jupyter Notebook labs can offer a similar experience to in-person lab sections while being
self-contained, with relevant resources embedded in their cells. They interactively demonstrate
real-life applications of signal processing while reducing overhead for course staff.
Leveraging sparsity in the Z-spectrum domain, multi-scale low rank reconstruction of cardiac
chemical exchange saturation transfer (CEST) MRI can allow for 4-fold acceleration of scans while
providing accurate Lorentzian line-fit analysis.
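For context, the line-fit step models each pool in the Z-spectrum as a Lorentzian; a minimal synthetic example of such a fit (pool parameters are placeholders, not measured values):

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(w, amp, width, center):
    return amp * width**2 / (width**2 + (w - center)**2)

offsets = np.linspace(-5, 5, 61)                           # saturation offsets (ppm)
truth = lorentzian(offsets, 0.8, 1.2, 0.0)                 # synthetic single pool
measured = truth + 0.01 * np.random.default_rng(0).standard_normal(offsets.size)

params, _ = curve_fit(lorentzian, offsets, measured, p0=[0.5, 1.0, 0.1])
print("fitted amplitude, width, center:", np.round(params, 3))
```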
Implicitly integrating point cloud data from two structured light sensors via a 3D spatial transform network can lead to improved gesture recognition results compared to point clouds registered with iterative closest point (ICP).
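A sketch of what this implicit registration can look like, assuming a small T-Net-style module (architecture details are illustrative): the network predicts a transform for the second sensor's cloud, applied before the clouds are merged, so alignment is learned end-to-end rather than solved with ICP.

```python
import torch
import torch.nn as nn

class SpatialTransform3D(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 128))
        self.head = nn.Linear(128, 12)               # 9 matrix + 3 translation params

    def forward(self, cloud):                        # cloud: (N, 3)
        feat = self.encoder(cloud).max(dim=0).values     # global max-pooled feature
        params = self.head(feat)
        mat = params[:9].reshape(3, 3) + torch.eye(3)    # start near identity
        trans = params[9:]
        return cloud @ mat.T + trans

stn = SpatialTransform3D()
cloud_a = torch.rand(2048, 3)                        # sensor 1
cloud_b = torch.rand(2048, 3)                        # sensor 2, unaligned
fused = torch.cat([cloud_a, stn(cloud_b)], dim=0)    # input to the gesture recognizer
```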