Visual Learning and Reasoning for Robotics


Full-day workshop at RSS 2021

Virtual Conference

July 13, 2021, Pacific Time (PT)


Welcome! This workshop includes three live events:
  • Invited Talks (25 min talk + 5 min Q&A)
  • Spotlight Talks (4 min talk + 2 min Q&A)
  • Panel Discussion (60 min)
To attend the workshop, please use the PheedLoop platform provided by RSS 2021.

For the panel discussion, you can also post questions at this link.

Schedule


Time (PT) | Invited Speaker | Title
10:15 - 10:30
Opening Remarks
| Video |

10:30 - 11:00

Andrew Davison
Imperial College London
Representations for Spatial AI
| Video |
11:00 - 11:30

Raquel Urtasun
University of Toronto / Waabi
Interpretable Neural Motion Planning
| Video |
11:30 - 12:00
Spotlight Talks + Q&A
ZePHyR: Zero-shot Pose Hypothesis Rating
Brian Okorn (Carnegie Mellon University); Qiao Gu (Carnegie Mellon University)*; Martial Hebert (Carnegie Mellon University); David Held (Carnegie Mellon University)
| PDF | Video |


ST-DETR: Spatio-Temporal Object Traces Attention Detection Transformer
Eslam Bakr (Valeo)*; Ahmad ElSallab (Valeo Deep Learning Research)
| PDF | Video |


Lifelong Interactive 3D Object Recognition for Real-Time Robotic Manipulation
Hamed Ayoobi (University of Groningen)*; S. Hamidreza Kasaei (University of Groningen); Ming Cao (University of Groningen); Rineke Verbrugge (University of Groningen); Bart Verheij (University of Groningen)
| PDF | Video |


Predicting Diverse and Plausible State Foresight For Robotic Pushing Tasks
Lingzhi Zhang (University of Pennsylvania)*; Shenghao Zhou (University of Pennsylvania); Jianbo Shi (University of Pennsylvania)
| PDF | Video |


Learning by Watching: Physical Imitation of Manipulation Skills from Human Videos
Haoyu Xiong (University of Toronto, Vector Institute)*; Quanzhou Li (University of Toronto, Vector Institute); Yun-Chun Chen (University of Toronto, Vector Institute); Homanga Bharadhwaj (University of Toronto, Vector Institute); Samarth Sinha (University of Toronto, Vector Institute); Animesh Garg (University of Toronto, Vector Institute, NVIDIA)
| PDF | Video |


12:00 - 12:30

Abhinav Gupta
CMU / Facebook AI Research
No RL, No Simulation
| Video |
12:30 - 1:00

Shuran Song
Columbia University
Unfolding the Unseen: Deformable Cloth Perception and Manipulation
| Video |
1:00 - 2:30
Break

2:30 - 3:00

Saurabh Gupta
UIUC
Learning to Move and Moving to Learn
| Video |
3:00 - 3:30

Sergey Levine
UC Berkeley / Google
Scalable Robotic Learning
| Video |
3:30 - 4:00
Spotlight Talks + Q&A
3D Neural Scene Representations for Visuomotor Control
Yunzhu Li (MIT)*; Shuang Li (MIT); Vincent Sitzmann (MIT); Pulkit Agrawal (MIT); Antonio Torralba (MIT)
| PDF | Video |


Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation
Nicklas A Hansen (UC San Diego)*; Hao Su (UC San Diego); Xiaolong Wang (UC San Diego)
| PDF | Video |


Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
Ruihan Yang (UC San Diego)*; Minghao Zhang (Tsinghua University); Nicklas A Hansen (UC San Diego); Huazhe Xu (UC Berkeley); Xiaolong Wang (UC San Diego)
| PDF | Video |


Interaction Prediction and Monte-Carlo Tree Search for Robot Manipulation in Clutter
Baichuan Huang (Rutgers University)*; Abdeslam Boularias (Rutgers University); Jingjin Yu (Rutgers University)
| PDF | Video |


A Simple Method for Complex In-Hand Manipulation
Tao Chen (MIT)*; Jie Xu (MIT); Pulkit Agrawal (MIT)
| PDF | Video |


4:00 - 5:00
Invited Speakers
Panel Discussion
| Video |

Introduction


Visual perception is essential for achieving robot autonomy in the real world. To perform complex tasks in unknown environments, a robot needs to actively acquire knowledge through physical interaction and reason about the objects it observes. This poses a series of research challenges in developing computational tools that close the perception-action loop. Given recent advances in computer vision and deep learning, we seek new solutions for performing real-world robotic tasks in an effective and computationally efficient manner.

This workshop focuses on two parallel themes:

Call for Papers


We're inviting submissions! If you're interested in (remotely) presenting a spotlight talk, please submit a short paper (or extended abstract) to CMT. We suggest extended abstracts of 2 pages in the RSS format. A maximum of 4 pages will be considered. References will not count towards the page limit. The review process is double-blind. Significant overlap with work submitted to other venues is acceptable, but it must be explicitly stated at the time of submission.

Important Dates:

Organizers



Kuan Fang
Stanford University

David Held
CMU

Yuke Zhu
UT Austin / NVIDIA

Dinesh Jayaraman
Univ. of Pennsylvania

Animesh Garg
Univ. of Toronto / NVIDIA

Lin Sun
Magic Leap

Yu Xiang
NVIDIA

Greg Dudek
McGill / Samsung

Past Workshops


Contact


For further information, please contact us at rssvlrr [AT] gmail [DOT] com.