ParticleFormer: A 3D Point Cloud World Model for Multi-Object, Multi-Material Robotic Manipulation

Huang, Suning; Chen, Qianzhong; Zhang, Xiaohan; Sun, Jiankai; Schwager, Mac

arXiv:2506.23126 (cs)

[Submitted on 29 Jun 2025 (v1), last revised 25 Aug 2025 (this version, v4)]

Abstract: 3D world models (i.e., learning-based 3D dynamics models) offer a promising approach to generalizable robotic manipulation by capturing the underlying physics of environment evolution conditioned on robot actions. However, existing 3D world models are primarily limited to single-material dynamics using a particle-based Graph Neural Network model, and often require time-consuming 3D scene reconstruction to obtain 3D particle tracks for training. In this work, we present ParticleFormer, a Transformer-based point cloud world model trained with a hybrid point cloud reconstruction loss, supervising both global and local dynamics features in multi-material, multi-object robot interactions. ParticleFormer captures fine-grained multi-object interactions between rigid, deformable, and flexible materials, and is trained directly from real-world robot perception data without elaborate scene reconstruction. We demonstrate the model's effectiveness both in 3D scene forecasting tasks and in downstream manipulation tasks using a Model Predictive Control (MPC) policy. In addition, we extend existing dynamics learning benchmarks to include diverse multi-material, multi-object interaction scenarios. We validate our method on six simulation and three real-world experiments, where it consistently outperforms leading baselines by achieving superior dynamics prediction accuracy and lower rollout error in downstream visuomotor tasks. Experimental videos are available at this https URL.
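
The abstract does not say which terms make up the hybrid point cloud reconstruction loss. As an illustration only, the sketch below assumes one plausible form: a Chamfer-style term that averages nearest-neighbor errors over all points, plus a Hausdorff-style term that penalizes the single worst-matched point. The function names, the use of squared distances, and the `weight` parameter are hypothetical and are not taken from the paper.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between every point in a (N, 3) and b (M, 3)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff * diff, axis=-1)


def chamfer_term(pred, target):
    """Average nearest-neighbor squared distance, summed over both directions."""
    d = pairwise_sq_dists(pred, target)
    return d.min(axis=1).mean() + d.min(axis=0).mean()


def hausdorff_term(pred, target):
    """Worst-case nearest-neighbor squared distance over both directions."""
    d = pairwise_sq_dists(pred, target)
    return max(d.min(axis=1).max(), d.min(axis=0).max())


def hybrid_loss(pred, target, weight=0.5):
    """Assumed hybrid: averaged term plus a weighted worst-case term."""
    return chamfer_term(pred, target) + weight * hausdorff_term(pred, target)


# Two-point clouds that differ in one point by a unit offset along z.
target = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
pred = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]])
loss_same = hybrid_loss(target, target)
loss_shifted = hybrid_loss(pred, target)
```

Neither term needs point-to-point correspondences between the predicted and observed clouds, which is consistent with the abstract's claim of training without 3D particle tracks. A training setup would use a differentiable implementation of the same terms in an autodiff framework.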

Subjects:	Robotics (cs.RO)
Cite as:	arXiv:2506.23126 [cs.RO]
	(or arXiv:2506.23126v4 [cs.RO] for this version)
	https://doi.org/10.48550/arXiv.2506.23126

Submission history

From: Suning Huang
[v1] Sun, 29 Jun 2025 07:23:56 UTC (2,763 KB)
[v2] Tue, 1 Jul 2025 16:38:33 UTC (2,763 KB)
[v3] Fri, 4 Jul 2025 07:47:12 UTC (2,763 KB)
[v4] Mon, 25 Aug 2025 18:24:56 UTC (2,763 KB)
