Dhruv Patel

About me

I am an MS Robotics student at the School of Interactive Computing, Georgia Tech. I currently conduct research on Cross-Embodiment Learning for Robot Manipulation from Human-Play data at the Robot Learning and Reasoning (RL2) lab with Dr. Danfei Xu. Presently, I am extending our recent work, EgoMimic (CoRL 2024), to diverse multi-task settings and various axes of generalization (behavior, scene), exploring Vision-Language Models (VLMs) and large-scale embodied human datasets. My interests broadly lie at the intersection of Robotics, Computer Vision, and Deep Learning.

UPDATE: I am actively seeking full-time opportunities in AI/Robotics starting May 2025 — feel free to reach out if there's a fit!

I spent summer 2024 interning at Honda Research Institute, USA, working on Scene Understanding for Autonomous Driving at Intersections. Before Georgia Tech, I explored open-source software development as a Google Summer of Code '23 contributor at Unify AI, aiming to optimize SLAM for real-world deployment in Robotics applications. Prior to this, I was a Project Associate at the Robotics Research Centre (RRC), where I worked on Scene Understanding for Autonomous Driving in Adverse Weather Conditions (affiliated with Queensland University of Technology (QUT) and ZF Group) and spearheaded the IHub Project Mobility on UAV-based Visual Remote Sensing for Civil Infrastructure Safety Assessment. Earlier, I spent the summer of 2020 working on Simultaneous Localization and Mapping (SLAM) for Level-5 Autonomy at Swaayatt Robots. After that, I moved to a Software Engineering role at Amdocs and, alongside it, collaborated with the Norwegian Biometrics Laboratory (NTNU, Norway) on research into the Image Super-Resolution problem.

I am always open to collaborations, research discussions, or just an interesting chat on AI & Robotics. Feel free to connect with me on LinkedIn or via email!

Interests

  • Robotics & Computer Vision
  • Deep Learning
  • AI & Neuroscience

Education

  • MS in Robotics, August 2023 - May 2025

    Georgia Institute of Technology (Georgia Tech)

  • B.Tech in Electronics & Communication Engg, July 2016 - July 2020

    Sardar Vallabhbhai National Institute of Technology, Surat

Work Experience

Graduate Student Researcher

Robot Learning and Reasoning Lab (RL2), Georgia Tech

Nov 2023 - Present
Advisor: Dr. Danfei Xu
Cross-Embodiment Learning for Robot Manipulation using Embodied Human-Play Data [WebPage]
Keywords: Robot Learning, Manipulation, Imitation Learning

Research Associate Intern

Honda Research Institute, USA

May 2024 - August 2024
Developed perception algorithms for intersection detection and navigation for HRI’s Autonomous Vehicle (AV) platform.
Keywords: Scene Understanding, ADAS, Robotics, Deep Learning

Open-Source Software Developer

Google Summer of Code 2023

June 2023 - August 2023
Multi-backend framework support for GradSLAM in Ivy [WebSite]

  • Google Summer of Code '23 Contributor at Ivy - unify.ai
  • Developed multi-backend framework support (PyTorch, JAX, NumPy, TensorFlow) for the GradSLAM library in Ivy, with the aim of optimizing deployment through highly efficient frameworks like JAX.
Keywords: Robotics, Deep Learning, PyTorch, JAX, NumPy, TensorFlow

Project Associate

Robotics Research Centre (RRC), IIIT Hyderabad

July 2021 – July 2023
Scene Understanding for Autonomous Driving
Advisors: Prof. Madhava Krishna and Dr. Sourav Garg

  • Collaborated with ZF Friedrichshafen group and Queensland University of Technology (QUT) Robotics on improving perception and scene understanding for adverse weather conditions.
  • Proposed GDIP (Gated Differentiable Image Processing), which establishes a new state of the art for object detection in foggy and low-light conditions.
  • Researched downstream problems like video object detection/tracking and explored Probabilistic Graphical Models (PGMs) for weather-agnostic feature refinement.


UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)
Advisors: Prof. Madhava Krishna, Dr. Ravi Kiran, and Dr. Harikumar Kandath

  • Automated assessment of civil structures with the help of visual remote sensing.
  • Utilized Structure-from-Motion, state estimation, odometry, etc., in conjunction with classical Computer Vision and Deep Learning-based visual inspection algorithms to robustly estimate critical structural parameters.
  • Developed and released an open-source library (UVRSABI) for the community. More details here.
Keywords: Robotics, Computer Vision, Deep Learning, Reinforcement Learning, 3D Reconstruction, Autonomous Driving, ADAS, UAVs

Software Engineer

AMDOCS

Aug 2020 – June 2021, Pune
Scrum Master/Team Lead: Shreyas Kulkarni
  • Responsible for B2B production-level full-stack software development.
  • Developed cross-functional telecom software solutions for Comcast's Orion project (USA).
  • Technical Stack: Java, ReactJS, SQL, Spring Boot, Maven, and Jenkins.
Keywords: Java, SQL, ReactJS, Object-oriented Programming, Microservices, Jenkins, Maven, Spring

Research Intern

Swaayatt Robots

April 2020 – July 2020
Advisor: Sanjeev Sharma (Founder & CEO - Swaayatt Robots)
  • Improved Visual Odometry and SLAM pipelines for Level-5 Autonomy.
  • Devised a semantic variant of the Iterative Closest Point (ICP) algorithm, outperforming vanilla ICP in terms of matching loss and convergence time on the Semantic KITTI dataset.
  • Developed a low-level C++ library.
Keywords: Robotics, Mathematical Optimization, SLAM, ICP, LiDARs

Deep Learning Intern

Sardar Vallabhbhai National Institute Of Technology, Surat

May 2019 - July 2019
Advisor: Dr. Kishor Upla (Assistant Professor, ECED)
  • Implemented the FaceNet face-recognition model from the original paper and validated it on a custom dataset of 25 students.
Keywords: Face Recognition, Deep Learning

Projects

Zero-shot policy adaptation: Diffusion Models + LLMs

Zero-shot adaptation leveraging inference-time flexibility (diffusion models) and expressivity (LLMs)

EgoMimic: Scaling Imitation Learning via Egocentric Video

End-to-end robot learning for generalizable bimanual robot manipulation from embodied human data

Asia-Pacific Robotics Contest 2018 & 2019

Autonomous Navigation for OmniDrive and Quadruped robots

Behavior Cloning and Dynamics Models for Robot Manipulation

MLP, RNN, Diffusion policy variants and learning dynamics models in Robomimic

Vision-Language Models for Dense Feedback Reward

VLMs for Natural Language Human Feedback

UAV-based Assessment of Civil Structures

Automated building inspection using aerial images captured by a UAV.

Obstacle Avoidance for UAV

Predicting an obstacle-free patch for high-level control commands

Fytbuddy: A real-time gym fitness trainer

A web-app-based e-trainer using a Flask web server and a Deep Learning-based model for posture correction

Autonomous Agricultural Robot

AGRIBOT for solving the crop-weed classification problem

RFID System

Identification system using RFID reader, LCD display and Atmel AVR microcontroller.

Image Super-Resolution

A triplet loss-based optimization framework for Image Super-Resolution

Mapping for Level-5 Autonomy

Improved point cloud registration and mapping using semantic ICP

Object Detection in Adverse Weather Settings

Gated Differentiable Image Processing (GDIP), a domain-agnostic architecture for object detection in adverse conditions.

Implementation of Path Searching/Tracking algorithms

Implemented path search/tracking algorithms such as Pure Pursuit, Dijkstra, and A-star.

Teaching

Graduate Teaching Assistant
  • Georgia Tech CS 3630: Intro to Perception and Robotics (undergrad-level) (Spring 2025)
  • Georgia Tech CS 6476: Computer Vision (grad-level) (Fall 2024)
  • Georgia Tech CS 6476: Computer Vision (undergrad-level) (Spring 2024)
  • Georgia Tech CS 6476: Computer Vision (grad-level) (Fall 2023)

Featured

  • UAV-based Visual Remote Sensing for Automated Building Inspection (UVRSABI)
    • Spotlight paper presentation at the CVCIE Workshop at ECCV 2022
    • Inaugurated by the Central Road Research Institute for deployment in Telangana, India (Sept 2022)
  • AGRIBOT was funded under the TEQIP-III program by the Govt. of India, and we also presented it at the open-source ROS Agriculture community meet. [YouTube]
  • Ranked 12th and 13th in the Asia-Pacific Robot Contest (RoboCon) 2018 and 2019, respectively, among 100-plus universities. [RoboCon2019 YouTube] [RoboCon2018 YouTube]
  • Best Working Model - Stirling Engine at the National Science Day Celebrations, Physical Research Laboratory (PRL), India, during 12th grade.