SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation

Haokun Zhu*, Zongtai Li, Zihan Liu, Kevin Guo, Zhengzhi Lin, Yuxin Cai, Guofei Chen, Chen Lv, Wenshan Wang, Jean Oh, Ji Zhang

Carnegie Mellon University, New York University, Nanyang Technological University

[Project Page] [arXiv]

News

  • [2026-03] Paper released on arXiv.
  • [2026-03] Project page is online.
  • [2026-04] Code released for Unity simulation, wheeled robot, Unitree Go2, and Unitree G1 platforms.

Abstract

Object navigation in real-world environments remains a significant challenge in embodied AI. We present SysNav, a three-level object navigation system that decouples semantic reasoning, navigation planning, and motion control. The framework employs Vision-Language Models for high-level semantic guidance and implements a hierarchical room-based navigation strategy that treats rooms as minimal decision-making units, combined with classical exploration for in-room navigation. Through 190 real-world experiments across three robot embodiments (wheeled, quadruped, humanoid), we demonstrate 4-5x improvement in navigation efficiency over existing baselines. The system also achieves state-of-the-art results on HM3D-v1, HM3D-v2, MP3D, and HM3D-OVON simulation benchmarks.

Demo

Long-range Object Navigation

  • Find Refrigerator in Lounge. (video on YouTube)
  • Find Blue Trash Can in Classroom. (video on YouTube)
  • Find Microwave Oven near Refrigerator. (video on YouTube)

Cross-Embodiment Object Navigation

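Each embodiment is shown in a system view and a third-person view:

  • Wheeled Robot: wheeled_system_view.webm (system view), wheeled_third_person.webm (third-person view). Task: Find the microwave_oven.
  • Quadruped (Go2): go2_system_view.webm (system view), go2_third_person.webm (third-person view). Task: Find the blue trash_can.
  • Humanoid (G1): g1_system_view.webm (system view), g1_third_person.webm (third-person view). Task: Find the tv_monitor on the black desk.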

More demos on our project page.

Platforms

This repository supports three robot embodiments, each maintained on its own branch. Switch to the corresponding branch (git checkout unitree_go2 / git checkout unitree_g1) before building and running on a Unitree robot.

Wheeled Robot + Unity simulation — main (you are here)

  • Custom wheeled vehicle with Mecanum wheels (indoor carpet) or standard wheels (hard floor / outdoors)
  • Livox Mid-360 lidar + Ricoh Theta Z1 360-degree camera
  • Motor controller connected via USB serial (/dev/ttyACM0 by default)
  • Gaming laptop (RTX 4090) as the processing computer
  • PS3/Xbox-style joystick for teleoperation

Detailed hardware photos and assembly info: Real-robot Setup → Hardware.

Unitree Go2 Quadruped — unitree_go2

  • Unitree Go2 quadruped, controlled via WebRTC
  • Livox Mid-360 lidar + Ricoh Theta Z1 360-degree camera
  • Asus NUC 14 Pro (Intel Core Ultra 5) as the onboard computer
  • Desktop workstation / Laptop with NVIDIA RTX 4090 for the semantic mapping and VLM reasoning
  • Wired / WiFi network shared between robot, NUC, and desktop

Unitree G1 Humanoid — unitree_g1

  • Unitree G1 humanoid, controlled via WebRTC
  • Livox Mid-360 lidar + Ricoh Theta Z1 360-degree camera
  • Asus NUC 14 Pro (Intel Core Ultra 5) as the onboard computer
  • Desktop workstation / Laptop with NVIDIA RTX 4090 for the semantic mapping and VLM reasoning
  • Wired / WiFi network shared between robot, NUC, and desktop
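
The platform-to-branch mapping listed above can be captured in a small helper. This is only a sketch: the function name branch_for_platform and the short platform keys (wheeled, sim, go2, g1) are hypothetical and not part of the repository; only the branch names are taken from this README.

```shell
# Hypothetical helper: map a platform key to the branch named in this README.
# The keys (wheeled, sim, go2, g1) are illustrative, not defined by the repo.
branch_for_platform() {
  case "$1" in
    wheeled|sim) echo "main" ;;
    go2)         echo "unitree_go2" ;;
    g1)          echo "unitree_g1" ;;
    *)           echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Print the branch for the Go2 platform.
branch_for_platform go2
```

The printed name can be passed to git checkout, followed by git submodule update --init --recursive as described in the Installation section.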

Installation

The system has been tested on Ubuntu 24.04 with ROS2 Jazzy.

1) Dependencies

Install ROS2 Jazzy, then:

echo "source /opt/ros/jazzy/setup.bash" >> ~/.bashrc
source ~/.bashrc

Install system dependencies:

sudo apt update
sudo apt install ros-jazzy-desktop-full ros-jazzy-pcl-ros libpcl-dev git
sudo apt install -y nlohmann-json3-dev
sudo apt install ros-jazzy-backward-ros

2) Submodules and Python Packages

git submodule update --init --recursive

pip install -r requirement.txt --break-system-packages

# detectron2
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git' --break-system-packages

# pytorch3d
pip install "git+https://github.com/facebookresearch/pytorch3d.git" --no-build-isolation --break-system-packages

# sam2
cd src/semantic_mapping/semantic_mapping/external/sam2
pip install -e . --break-system-packages
cd checkpoints && ./download_ckpts.sh && cd ../..

# spacy
python -m spacy download en_core_web_sm --break-system-packages

# CLIP
pip install git+https://github.com/ultralytics/CLIP.git --break-system-packages

# YOLO models
python set_yolo_e.py
python set_yolo_world.py

3) SLAM Dependencies

Install Sophus (from src/slam/dependency/Sophus):

mkdir build && cd build
cmake .. -DBUILD_TESTS=OFF
make && sudo make install

Install Ceres Solver (from src/slam/dependency/ceres-solver):

mkdir build && cd build
cmake ..
make -j6 && sudo make install

Install GTSAM (from src/slam/dependency/gtsam):

mkdir build && cd build
cmake .. -DGTSAM_USE_SYSTEM_EIGEN=ON -DGTSAM_BUILD_WITH_MARCH_NATIVE=OFF
make -j6 && sudo make install
sudo /sbin/ldconfig -v

4) Mid-360 Lidar Driver

Install Livox-SDK2 (from src/utilities/livox_ros_driver2/Livox-SDK2):

mkdir build && cd build
cmake ..
make && sudo make install

Configure the lidar IP in src/utilities/livox_ros_driver2/config/MID360_config.json — set the IP to 192.168.1.1xx where xx are the last two digits of the lidar serial number.
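
As a sanity check on the 192.168.1.1xx rule, the address can be derived from the serial number with a few lines of shell. This is a minimal sketch: the function name lidar_ip_from_serial is hypothetical, and the serial number in the example is made up for illustration.

```shell
# Hypothetical helper: derive the Mid-360 IP from the lidar serial number,
# following the 192.168.1.1xx rule (xx = last two digits of the serial).
lidar_ip_from_serial() {
  serial="$1"
  # Everything except the final two characters.
  prefix="${serial%??}"
  # Remove that prefix, leaving only the final two characters.
  last_two="${serial#"$prefix"}"
  case "$last_two" in
    [0-9][0-9]) echo "192.168.1.1${last_two}" ;;
    *) echo "serial must end in two digits: $serial" >&2; return 1 ;;
  esac
}

# The serial below is invented for illustration; prints 192.168.1.142
lidar_ip_from_serial "47MDL9T0020142"
```

The printed address is the value to enter as the lidar IP in MID360_config.json.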

Compile the driver:

colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-select livox_ros_driver2

5) Compile

For simulation (skips SLAM and lidar driver):

colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release --packages-skip arise_slam_mid360 arise_slam_mid360_msgs livox_ros_driver2

For real robot (full build, requires steps 3-4):

colcon build --symlink-install --cmake-args -DCMAKE_BUILD_TYPE=Release

VLM API Key

The VLM node supports two providers via the OpenAI-compatible interface. Set one of the following:

Gemini (default) — get a key from Google AI Studio:

export GEMINI_API_KEY="your-api-key-here"

Qwen (DashScope) — get a key from Alibaba Cloud DashScope:

export DASHSCOPE_API_KEY="your-api-key-here"

If both keys are set, Gemini is used by default; override with export VLM_PROVIDER=qwen. Optionally override Qwen model names with QWEN_MODEL / QWEN_MODEL_LITE. Add the line(s) to ~/.bashrc so they persist across terminal sessions.
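
Before launching, it can help to confirm which provider the current environment would select. The sketch below mirrors the rules described above; it is not the VLM node's own code. The function name resolve_vlm_provider is hypothetical, and the handling of a missing key or an unrecognized VLM_PROVIDER value (including accepting gemini as an explicit value) is an assumption.

```shell
# Hypothetical helper: report the provider implied by the documented rules.
# Assumption: an explicit VLM_PROVIDER wins, and its key must also be set.
resolve_vlm_provider() {
  provider="${VLM_PROVIDER:-}"
  if [ -z "$provider" ]; then
    # No override: prefer Gemini when its key is present, else fall back to Qwen.
    if [ -n "${GEMINI_API_KEY:-}" ]; then
      provider="gemini"
    elif [ -n "${DASHSCOPE_API_KEY:-}" ]; then
      provider="qwen"
    else
      echo "neither GEMINI_API_KEY nor DASHSCOPE_API_KEY is set" >&2
      return 1
    fi
  fi
  case "$provider" in
    gemini) [ -n "${GEMINI_API_KEY:-}" ] || { echo "GEMINI_API_KEY is not set" >&2; return 1; } ;;
    qwen)   [ -n "${DASHSCOPE_API_KEY:-}" ] || { echo "DASHSCOPE_API_KEY is not set" >&2; return 1; } ;;
    *)      echo "unrecognized VLM_PROVIDER: $provider" >&2; return 1 ;;
  esac
  echo "$provider"
}
```

With both keys exported and VLM_PROVIDER unset, calling resolve_vlm_provider prints gemini; after export VLM_PROVIDER=qwen it prints qwen.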

Simulation Setup

Base Autonomy

The system is integrated with Unity environment models for simulation. Download a Unity environment model (home_building_1.zip is recommended) and unzip the files into the src/base_autonomy/vehicle_simulator/mesh/unity folder. On computers without a powerful GPU, try the without_360_camera version for a higher rendering rate.

The environment model files should look like:

mesh/
  unity/
    environment/
      Model_Data/
      Model.x86_64
      UnityPlayer.so
      Dimensions.csv
      Categories.csv
    map.ply
    object_list.txt
    traversable_area.ply
    map.jpg
    render.jpg
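
The layout above can be checked before launching with a short script. This is a sketch under stated assumptions: the function name check_unity_env is hypothetical, and it treats every entry in the listing as required, which may not hold for every environment model (for example the without_360_camera version).

```shell
# Hypothetical helper: verify that an unzipped environment matches the listing.
# Argument: path to the mesh/unity folder.
check_unity_env() {
  root="$1"
  missing=0
  for f in \
    environment/Model_Data \
    environment/Model.x86_64 \
    environment/UnityPlayer.so \
    environment/Dimensions.csv \
    environment/Categories.csv \
    map.ply object_list.txt traversable_area.ply map.jpg render.jpg
  do
    if [ ! -e "$root/$f" ]; then
      echo "missing: $root/$f" >&2
      missing=1
    fi
  done
  # Returns 0 when everything is present, 1 otherwise.
  return "$missing"
}
```

From the repository root, check_unity_env src/base_autonomy/vehicle_simulator/mesh/unity reports any missing entries on stderr and returns a non-zero status if something is absent.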

Launch the system:

./system_simulation.sh

Once data appears in RVIZ, use the 'Waypoint' button to set waypoints and navigate the vehicle around.

Figure: RVIZ view of base autonomy (smart joystick, waypoint, and manual modes)

The system supports three operating modes:

  • Smart joystick mode (default): The vehicle follows joystick commands while avoiding collisions. Use the control panel in RVIZ or the right joystick on the controller.

  • Waypoint mode: The vehicle follows waypoints while avoiding collisions. Use the 'Waypoint' button in RVIZ, or click 'Resume Navigation to Goal' to switch to this mode.

  • Manual mode: The vehicle follows joystick commands without collision avoidance. Press the 'manual-mode' button on the controller.

Figures: RVIZ Control Panel, PS3 Controller

Alternatively, users can run a ROS node to send a series of waypoints:

source install/setup.sh
ros2 launch waypoint_example waypoint_example.launch

Click the 'Resume Navigation to Goal' button in RVIZ, and the vehicle will navigate inside the boundary following the waypoints. More information about the base autonomy system is available on the Autonomous Exploration Development Environment website.

Exploration Planner

Launch the system with the exploration planner:

./system_simulation_with_exploration_planner.sh

Click the 'Resume Navigation to Goal' button in RVIZ to start the exploration. Users can adjust the navigation boundary by updating the boundary polygon in src/exploration_planner/tare_planner/data/boundary.ply.

Note: On ARM computers, download the corresponding OR-Tools binary release and replace the include and lib folders under src/exploration_planner/tare_planner/or-tools.
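
A quick way to tell whether the note above applies is to inspect the machine type. This is a minimal sketch: the function name is_arm_machine is hypothetical, and the list of uname -m values it treats as ARM is an assumption that may not cover every board.

```shell
# Hypothetical helper: succeed if a `uname -m` value names an ARM machine.
# The set of values matched here is an assumption, not an exhaustive list.
is_arm_machine() {
  case "$1" in
    aarch64|arm64|armv7l|armv8l) return 0 ;;
    *) return 1 ;;
  esac
}

if is_arm_machine "$(uname -m)"; then
  echo "ARM host: replace the OR-Tools include and lib folders before building"
else
  echo "non-ARM host: the OR-Tools replacement step does not apply"
fi
```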

Figure: Base autonomy with exploration planner, shown in RVIZ

Real-robot Setup

Hardware

The vehicle hardware is designed to support advanced AI. Space is left for users to install a Jetson AGX Orin computer or a gaming laptop. The vehicle is equipped with a 19V and a 110V inverter (both 400W) to power sensors and computers. A wireless HDMI module transmits signals to a control station.

We supply two types of wheels: Mecanum wheels for indoor carpet, and standard wheels for hard floor and outdoors.

Figures: All Items, Computer Space, Control Station, Wheel Types

System Setup

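Install Ubuntu 24.04 and ROS2 Jazzy on the processing computer, then add your user to the dialout group so it can access the motor controller's serial port (replace 'username' below with your login name):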

echo "source /opt/ros/jazzy/setup.bash" >> ~/.bashrc
source ~/.bashrc
sudo adduser 'username' dialout
sudo reboot now

Follow the Installation section to install all dependencies and compile the full repository. For the motor controller, connect it via USB and update the serial device path in src/base_autonomy/local_planner/launch/local_planner.launch and src/utilities/teleop_joy_controller/launch/teleop_joy_controller.launch if needed (default: /dev/ttyACM0).
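
If teleoperation cannot reach the motor controller, group membership and the device path are the first things to check. The helpers below are a sketch, not part of the repository: the names has_group and check_serial_access are hypothetical, and /dev/ttyACM0 is used only because it is the default mentioned above.

```shell
# Hypothetical helper: succeed if GROUP appears in a space-separated list.
# Usage: has_group "adm dialout sudo" dialout
has_group() {
  for g in $1; do
    [ "$g" = "$2" ] && return 0
  done
  return 1
}

# Hypothetical helper: check dialout membership and that the device exists.
# Argument: serial device path (defaults to /dev/ttyACM0).
check_serial_access() {
  dev="${1:-/dev/ttyACM0}"
  rc=0
  if ! has_group "$(id -nG)" dialout; then
    echo "current user is not in dialout (group changes apply after re-login)" >&2
    rc=1
  fi
  if [ ! -c "$dev" ]; then
    echo "$dev is not a character device (is the controller plugged in?)" >&2
    rc=1
  fi
  return "$rc"
}
```

Running check_serial_access with no argument checks the default device; pass another path if the launch files were updated to a different one.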

Test the teleoperation:

source install/setup.sh
ros2 launch teleop_joy_controller teleop_joy_controller.launch

360 Camera Driver

The system uses a Ricoh Theta Z1 360-degree camera. The camera driver and lidar-to-camera calibration tools are maintained in a separate repository — clone it alongside this repo and follow its README to build and configure:

https://github.com/jizhang-cmu/360_camera/tree/jazzy

System Usage

Launch the full system:

./system_real_robot.sh

Launch with the exploration planner:

./system_real_robot_with_exploration_planner.sh

Figure: Exploration

Bagfile Setup

To run the system with a recorded bagfile, open three terminals:

Terminal 1 - Launch the system:

./system_bagfile.sh
# or with exploration planner:
./system_bagfile_with_exploration_planner.sh

Terminal 2 - Republish camera images:

ros2 run image_transport republish \
  --ros-args \
  -p in_transport:=compressed \
  -p out_transport:=raw \
  --remap in/compressed:=/camera/image/compressed \
  --remap out:=/camera/image

Terminal 3 - Play the bagfile:

source install/setup.bash
ros2 bag play bagfolder_path/bagfile_name.mcap

Example bagfiles are available here.

Note: Before processing bagfiles, ensure the repository has been fully compiled following the Installation section.

Credits

The project is led by Ji Zhang's group at Carnegie Mellon University.

The base autonomy system is based on Autonomous Exploration Development Environment. The SLAM module is an upgraded implementation of LOAM.

Citation

If you find this work useful, please consider citing:

@article{zhu2026sysnav,
  title={SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation},
  author={Zhu, Haokun and Li, Zongtai and Liu, Zihan and Guo, Kevin and Lin, Zhengzhi and Cai, Yuxin and Chen, Guofei and Lv, Chen and Wang, Wenshan and Oh, Jean and Zhang, Ji},
  journal={arXiv preprint arXiv:2603.06914},
  year={2026}
}

License

This project is licensed under the BSD 3-Clause License.

Some third-party packages retain their original open-source licenses (BSD, MIT, Apache 2.0, GPLv3). See individual package.xml files for per-package license declarations.
