Hi, I'm Aditya Wagh

I'm a Research Engineer working on robot learning, with a focus on world models, vision-language-action (VLA) systems, and imitation learning.

My work sits at the intersection of machine learning, 3D computer vision, and robotics, and centers on building learning-based systems that connect perception, internal representations, and action in the real world.

I'm extremely grateful to have been educated at two great institutions: NYU and BITS Pilani.

At the heart of my work is a simple belief: technology should amplify human potential, reduce barriers to exploration, and meaningfully enrich the human experience. I'm interested in building systems that don't just exist in the world, but help us better navigate it, create within it, and understand our place within it. Robotics is my primary lens for exploring these ideas, because it forces intelligence to engage directly with reality.

Alongside robotics, I'm excited by adjacent directions such as coding agents and AR/XR. To me, these are different expressions of the same underlying question — how intelligent systems can work with humans to extend creativity, reasoning, and interaction, rather than replace them.

Beyond my core work, I enjoy reading about programming languages, systems programming, performance engineering, and machine learning systems. I'm also drawn to topics like public policy, city planning, history, and psychology, especially where they intersect with how technology shapes human institutions and everyday life.

This site is a space to share what I'm building, learning, and thinking about over time.

If you're into any of these areas, let's connect! Feel free to reach out via LinkedIn, email, or X/Twitter (DMs are open) to collaborate and have fun working together!

Tech I've Used Before

Python, C++, C, Rust, CUDA, Bash, MATLAB, HTML, CSS, JavaScript, PyTorch, JAX, OpenCV, Open3D, Keras, TensorFlow, Hugging Face, Scikit-learn, NumPy, SciPy, Pandas, CMake, ONNX, TensorRT, Docker, Linux

Projects

Below is a list of my notable projects. Feel free to have a look at whatever you find interesting!

SuperSLAM

SuperSLAM: Open Source Framework for Deep Learning based Visual SLAM

View Code

Deep Image Matching

Deep Image Matching using Local Feature Transformer

Pose estimation pipeline for 3D reconstruction, built on the Local Feature Transformer (LoFTR).

View Code

State Estimation

State Estimation of a Drone by Visual-Inertial Sensor Fusion

Optimal state estimation using extended and unscented Kalman filters, implemented in MATLAB.

View Code

3D Object Detection

3D Object Detection in LiDAR Point Clouds

Benchmarking of LiDAR 3D object detection networks (VoxelNet, PointNet++, PointPillars) on the KITTI dataset.

View Code

Fast Plane Extraction

Fast Plane Extraction from a 3D Point Cloud

Fast, parallel implementation of the RANSAC algorithm for segmenting a plane from a point cloud.

View Code