Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Affiliations: (1) The Chinese University of Hong Kong, Shenzhen; (2) ByteDance; (3) Tsinghua University

Abstract

With the growing demand for high-fidelity 3D models from 2D images, existing methods still struggle to reproduce fine-grained geometric details accurately, owing to domain gaps and the inherent ambiguities of RGB images. To address these issues, we propose Hi3DGen, a novel framework for generating high-fidelity 3D geometry from images via normal bridging. Hi3DGen consists of three key components: (1) an image-to-normal estimator that decouples low- and high-frequency image patterns with noise injection and dual-stream training to achieve generalizable, stable, and sharp estimation; (2) a normal-to-geometry learning approach that uses normal-regularized latent diffusion learning to enhance the fidelity of 3D geometry generation; and (3) a 3D data synthesis pipeline that constructs a high-quality dataset to support training. Extensive experiments demonstrate that our framework generates rich geometric details and outperforms state-of-the-art methods in fidelity. Our work provides a new direction for high-fidelity 3D geometry generation from images by leveraging normal maps as an intermediate representation.
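As background for the term "normal map" in the abstract: a surface normal is the unit vector perpendicular to a surface at a point, and a normal map stores one such vector per pixel. The following is a minimal sketch of that definition for a single triangle. It is not code from Hi3DGen, and the function name `face_normal` is a hypothetical choice made for this illustration.

```python
import math

def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c), vertices given counter-clockwise."""
    # Two edge vectors sharing vertex a.
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    # The cross product u x v is perpendicular to the triangle's plane.
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    length = math.sqrt(sum(x * x for x in n))
    if length == 0.0:
        raise ValueError("degenerate triangle has no defined normal")
    return [x / length for x in n]

# A triangle lying in the xy-plane faces along +z.
print(face_normal((0, 0, 0), (1, 0, 0), (0, 1, 0)))  # [0.0, 0.0, 1.0]
```

Reversing the vertex order flips the sign of the normal, which is why a consistent winding order matters when normals are compared across meshes.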

Method

The First Stage: Image-to-Normal Estimation. Left: illustration of Noise-injected Regressive Normal Estimation (NiRNE). Right: noisy labels in high-frequency regions of real-domain data.
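To make the notion of low- and high-frequency image content concrete, here is a generic sketch that splits a grayscale image into a blurred (low-frequency) part and a residual (high-frequency) part. This is not NiRNE, whose architecture is not described on this page; the box blur, the function names `box_blur` and `split_frequencies`, and the `radius` parameter are all assumptions made for illustration.

```python
def box_blur(img, radius=1):
    """Low-pass filter: average each pixel with its neighbours (edges clamped)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels reuse the nearest valid pixel.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out

def split_frequencies(img, radius=1):
    """Return (low, high) such that low + high reconstructs img."""
    low = box_blur(img, radius)
    high = [
        [img[y][x] - low[y][x] for x in range(len(img[0]))]
        for y in range(len(img))
    ]
    return low, high

# A sharp vertical edge: the high-frequency part is non-zero only near the edge.
image = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
low, high = split_frequencies(image)
```

In this toy example the residual is zero in flat regions and concentrated around the edge, which is the sense in which fine geometric detail is "high-frequency".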

Qualitative comparison of normal estimation results.

The Second Stage: Normal-to-Geometry Generation. An illustration of Normal-Regularized Latent Diffusion (NoRLD).
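This page does not state the loss used by NoRLD. As a hedged illustration of one common way to measure agreement between predicted and reference normals, the sketch below computes a mean cosine distance. The choice of cosine distance and the names `normalize` and `cosine_normal_loss` are assumptions for this example, not the authors' formulation.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        raise ValueError("zero-length vector cannot be normalized")
    return [x / length for x in v]

def cosine_normal_loss(pred, ref):
    """Mean of (1 - cos(angle)) over paired normals; 0 when all pairs align."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("need two non-empty lists of equal length")
    total = 0.0
    for p, r in zip(pred, ref):
        p, r = normalize(p), normalize(r)
        # The dot product of unit vectors is the cosine of the angle between them.
        total += 1.0 - sum(a * b for a, b in zip(p, r))
    return total / len(pred)

# Perpendicular normals give a loss of 1; identical normals give 0.
loss = cosine_normal_loss([(0, 0, 1)], [(1, 0, 0)])
```

A term of this kind ranges from 0 (normals aligned) to 2 (normals opposite) and is insensitive to vector magnitude, so it penalizes only orientation error.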

The construction procedure of the proposed DetailVerse dataset, which contains high-quality synthesized 3D assets to support the training of Hi3DGen. The pipeline consists of three steps: text prompt collection, image generation, and 3D asset synthesis.

Image-to-3D Generation Results

Our generated results are shown as geometry-only meshes in an interactive 3D viewer.

Comparison to Other Methods

The results of Hi3DGen can be compared side by side with those of other methods.

*Geometry-only results.

Generation Gallery

Zoom in for better visualization of each 3D model.

BibTeX

@article{ye2025hi3dgen,
  title={Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging},
  author={Ye, Chongjie and Wu, Yushuang and Lu, Ziteng and Chang, Jiahao and Guo, Xiaoyang and Zhou, Jiaqing and Zhao, Hao and Han, Xiaoguang},
  journal={arXiv preprint arXiv:2503.22236}, 
  year={2025}
}