EgoWorld: Translating Exocentric View to Egocentric View using Rich Exocentric Observations

Park, Junho; Ye, Andrew Sangwoo; Kwon, Taein


arXiv:2506.17896 (cs)

[Submitted on 22 Jun 2025 (v1), last revised 4 Mar 2026 (this version, v2)]



Abstract: Egocentric vision is essential for both human and machine visual understanding, particularly in capturing the detailed hand-object interactions needed for manipulation tasks. Translating third-person views into first-person views significantly benefits augmented reality (AR), virtual reality (VR), and robotics applications. However, current exocentric-to-egocentric translation methods are limited by their dependence on 2D cues, synchronized multi-view settings, and unrealistic assumptions such as the necessity of an initial egocentric frame and relative camera poses during inference. To overcome these challenges, we introduce EgoWorld, a novel framework that reconstructs an egocentric view from rich exocentric observations, including point clouds, 3D hand poses, and textual descriptions. Our approach reconstructs a point cloud from estimated exocentric depth maps, reprojects it into the egocentric perspective, and then applies a diffusion model to produce dense, semantically coherent egocentric images. Evaluated on four datasets (i.e., H2O, TACO, Assembly101, and Ego-Exo4D), EgoWorld achieves state-of-the-art performance and demonstrates robust generalization to new objects, actions, scenes, and subjects. Moreover, EgoWorld exhibits robustness on in-the-wild examples, underscoring its practical applicability. Project page is available at this https URL.
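The geometric step named in the abstract (lifting an exocentric depth map to a point cloud and reprojecting it into the egocentric view) can be illustrated with a minimal pinhole-camera sketch. This is not the authors' implementation: the function names, the intrinsics `fx, fy, cx, cy`, and the rigid transform `R, t` are all hypothetical, and the exocentric-to-egocentric pose is simply taken as an input here, whereas the abstract indicates EgoWorld does not assume a given relative camera pose at inference. The diffusion stage that densifies the sparse result is omitted.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def unproject_depth(depth: Sequence[Sequence[float]],
                    fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift a depth map (rows of metric depths) to 3D points in the camera frame."""
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:  # non-positive depth is treated as invalid
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def transform(points: Sequence[Point],
              R: Sequence[Sequence[float]], t: Sequence[float]) -> List[Point]:
    """Apply the rigid transform p' = R p + t (hypothetical exo-to-ego pose)."""
    out: List[Point] = []
    for p in points:
        x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        out.append((x, y, z))
    return out


def project_sparse_depth(points: Sequence[Point],
                         fx: float, fy: float, cx: float, cy: float,
                         width: int, height: int) -> List[List[Optional[float]]]:
    """Project points into the target view, keeping the nearest depth per pixel."""
    image: List[List[Optional[float]]] = [[None] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:  # behind the target camera
            continue
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < width and 0 <= v < height:
            current = image[v][u]
            if current is None or z < current:  # simple z-buffer
                image[v][u] = z
    return image
```

Pixels that no point lands on stay `None`; in a pipeline like the one described, such holes are what a generative model would be asked to fill in.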

Comments:	Accepted by ICLR 2026. Project Page: this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2506.17896 [cs.CV]
	(or arXiv:2506.17896v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2506.17896

Submission history

From: Junho Park
[v1] Sun, 22 Jun 2025 04:21:48 UTC (19,439 KB)
[v2] Wed, 4 Mar 2026 11:37:54 UTC (29,650 KB)
