Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Yicheng Feng1,3, Wanpeng Zhang1,3, Ye Wang2,3, Hao Luo1,3, Haoqi Yuan1,3,
Sipeng Zheng3, Zongqing Lu1,3†

1Peking University    2Renmin University of China    3BeingBeyond

Website arXiv License


VIPA-VLA learns 2D–to–3D visual–physical grounding from human videos with spatial-aware VLA pretraining, enabling robot policies with stronger spatial understanding and generalization.

News

  • [2025-12-15]: We release VIPA-VLA! Check out our paper here. Code is coming soon! 🔥🔥🔥

Citation

If you find our work useful, please consider citing us and giving our repository a star! 🌟🌟🌟

@article{feng2025vipa,
  title={Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos},
  author={Feng, Yicheng and Zhang, Wanpeng and Wang, Ye and Luo, Hao and Yuan, Haoqi and Zheng, Sipeng and Lu, Zongqing},
  journal={arXiv preprint arXiv:2512.13080},
  year={2025}
}
