TactileDreamFusion

3D content creation with touch: TactileDreamFusion integrates high-resolution tactile sensing with diffusion-based image priors to enhance fine geometric details for text- or image-to-3D generation. The following results are rendered using Blender, with full-color rendering on the top and normal rendering at the bottom.

Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation
Ruihan Gao, Kangle Deng, Gengshan Yang, Wenzhen Yuan, Jun-Yan Zhu
Carnegie Mellon University
NeurIPS 2024

Results

The following results are rendered using blender-render-toolkit.

Same Object with Diverse Textures

We show diverse textures synthesized on the same object, which facilitates the custom design of 3D assets.

Single Texture Generation

We show 3D generation with a single texture. Our method generates realistic and coherent visual textures and geometric details.

Multi-Part Texture Generation

This grid demonstrates different render types for each object: predicted label map, albedo, normal map, zoomed-in normal patch, and full-color rendering.

Getting Started

Environment setup

Our environment has been tested on Linux with Python 3.10.13, PyTorch 2.2.1, and CUDA 12.1.

git clone https://github.com/RuihanGao/TactileDreamFusion.git
cd TactileDreamFusion
conda create -n TDF python=3.10
conda activate TDF
pip install torch==2.2.1+cu121 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt
git clone https://github.com/dunbar12138/blender-render-toolkit.git
cd blender-render-toolkit
git checkout tactileDreamfusion
cd ..

Hardware requirements

All results in the paper were produced on a single NVIDIA A6000 (48 GB), which is what the default single-texture config is tuned for (batch_size: 4, patch_batch_size: 4 in configs/text_tactile_TSDS.yaml). The multi-part config is lighter and already defaults to batch_size: 1.

On smaller GPUs, edit configs/text_tactile_TSDS.yaml and lower batch_size to trade throughput for memory. A 24 GB card (e.g. RTX 3090 / 4090) fits batch_size: 2 for single-texture training. If you still hit OOM, also reduce patch_batch_size (default 4).
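
As a rough guide, the advice above can be captured in a small lookup. This is only a sketch and not part of the repository: the function name `suggest_batch_sizes` is hypothetical, the 48 GB and 24 GB settings come from the text above, and the fallback for cards below 24 GB is an untested assumption.

```python
def suggest_batch_sizes(vram_gb: float) -> dict:
    """Suggest starting values for configs/text_tactile_TSDS.yaml (hypothetical helper)."""
    if vram_gb >= 48:
        # Default single-texture settings, tuned for an A6000 (48 GB).
        return {"batch_size": 4, "patch_batch_size": 4}
    if vram_gb >= 24:
        # Reported above to fit on a 24 GB card (e.g. RTX 3090 / 4090).
        return {"batch_size": 2, "patch_batch_size": 4}
    # Assumption: nothing is documented below 24 GB, so start from the minimum.
    return {"batch_size": 1, "patch_batch_size": 1}


if __name__ == "__main__":
    for gb in (48, 24, 12):
        print(gb, suggest_batch_sizes(gb))
```

The returned values still have to be written into the YAML file by hand; if training runs out of memory at the suggested setting, lower patch_batch_size next.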

Download dataset and TextureDreamBooth weights

Run the following script to download our tactile texture dataset and example base meshes to the data folder.

bash scripts/download_hf_dataset.sh

Run the following script to download our pretrained TextureDreamBooth weights and patch data to the TextureDreambooth folder.

bash scripts/download_hf_model.sh

Single texture generation

Training

bash scripts/single_texture_generation.sh -train

Single texture generation takes about 10 minutes per mesh on a single A6000 GPU.

All artifacts are written to logs/<save_path>/ (for the default script, logs/an_avocado_2_avocado_example/):

- Mesh & textures: <save_path>.obj + .mtl, _albedo.png, _tactile_normal.png are the final textured mesh; _initialized.* is the checkpoint written after the 150-iter init stage, before refinement begins.
- Config & loss logs: _opt.json (the resolved config used for the run), _loss_dict_all.pkl (per-iter value for every loss term), _loss_plot.png (loss curves).
- Per-iter debug videos (one frame per iter, 256×256) show how each signal evolves over training:
  - _rendered_albedos_list.mp4 / _rendered_target_albedos_list.mp4: learned vs. target albedo
  - _rendered_lambertians_list.mp4: shaded rendering fed to the ControlNet SDS loss
  - _rendered_perturb_normals_list.mp4 / _rendered_target_perturb_normals_list.mp4: learned vs. target tactile normal
  - _rendered_guidance_perturb_normals_list.mp4: TextureDreambooth-refined normal used as the tactile guidance target
  - _controlnet_refined_images_list.mp4 / _controlnet_control_images_list.mp4: ControlNet refinement output and its normal-map control
  - _rendered_*_patch_list.mp4: the same signals rendered from close-up patch views (used for tactile-scale supervision)
- Side-by-side summary: _SDS_concat_rendering.mp4 shows (albedo | lambertian | ControlNet-refined) per-iter to inspect the SDS loop.
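
To inspect the logged losses without opening the plot, a short script along the following lines may help. It is a sketch under one assumption that is not confirmed above: that _loss_dict_all.pkl unpickles to a dict mapping each loss name to a sequence of per-iteration scalars. The function name `summarize_losses` is hypothetical.

```python
import pickle


def summarize_losses(pkl_path):
    """Return {loss_name: (first_value, last_value, n_iters)} for each logged term.

    Assumes the pickle holds a dict of {name: sequence of scalars};
    adjust if the real layout differs.
    """
    with open(pkl_path, "rb") as f:
        loss_dict = pickle.load(f)
    summary = {}
    for name, values in loss_dict.items():
        values = [float(v) for v in values]
        if values:  # skip loss terms that were never logged
            summary[name] = (values[0], values[-1], len(values))
    return summary
```

For the default run, the file is expected at logs/an_avocado_2_avocado_example/an_avocado_2_avocado_example_loss_dict_all.pkl, following the naming pattern above. If the values were saved as PyTorch tensors, run the script inside the TDF environment so they can be unpickled.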

Visualization (nvdiffrast)

bash scripts/single_texture_generation.sh

For each rendering mode in {lambertian, albedo, tactile_normal, viewspace_normal, shading_normal}, the script writes into the same logs/<save_path>/ folder:

- <save_path>_<mode>.mp4: 360° camera orbit at a fixed elevation (1° azimuth step, 30 fps).
- <elevation>_<azimuth>_light_<le>_<la>_<amb>_<mode>.png: front (azimuth 0) and back (azimuth 180) frames as PNGs.

Meaning of each mode:

- albedo: learned RGB texture only (no lighting)
- lambertian: diffuse shading using surface normal + tactile perturbation
- tactile_normal: the learned high-frequency tactile normal map in tangent space
- shading_normal: combined (surface + tactile) normal in world space
- viewspace_normal: shading normal in camera space (comparable to ControlNet's BAE normals)

Example 360° orbit video (lambertian mode for an_avocado_2_avocado_example, file logs/an_avocado_2_avocado_example/an_avocado_2_avocado_example_lambertian.mp4). The other four modes follow the same filename pattern.
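
The filename pattern above can be checked with a few lines of Python. This is a sketch, not a script shipped with the repository: the helper names are hypothetical, and the duration estimate assumes the orbit does not repeat its starting frame.

```python
from pathlib import Path

# Rendering modes listed above for the single-texture visualization script.
MODES = ["lambertian", "albedo", "tactile_normal", "viewspace_normal", "shading_normal"]


def expected_orbit_videos(save_path, log_root="logs"):
    """Build logs/<save_path>/<save_path>_<mode>.mp4 for every mode."""
    out_dir = Path(log_root) / save_path
    return [out_dir / f"{save_path}_{mode}.mp4" for mode in MODES]


def orbit_duration_seconds(azimuth_step_deg=1.0, fps=30):
    """Length of one full 360-degree orbit at the given step and frame rate."""
    n_frames = round(360 / azimuth_step_deg)
    return n_frames / fps


if __name__ == "__main__":
    for path in expected_orbit_videos("an_avocado_2_avocado_example"):
        print(path, "exists" if path.exists() else "missing")
    print(orbit_duration_seconds(), "seconds per orbit")
```

With a 1° step and 30 fps, each orbit video should contain 360 frames and run for about 12 seconds, which is a quick sanity check that rendering completed.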

Visualization (blender)

Note: After training, you can visualize different output meshes in the logs directory by changing the mesh_objs list in each bash script.

cd blender-render-toolkit
bash scripts/batch_blender_albedo.sh
bash scripts/batch_blender_normal.sh
bash scripts/batch_blender.sh

Each script writes into blender-render-toolkit/output/ — per-frame PNGs under <obj>_<modality>_rotate/ (252 frames, 360° orbit) and a concatenated <obj>_<modality>_rotate.mp4. batch_blender_normal.sh also produces single-frame PNGs <obj>_normal.png (textured) and <obj>_normal_geometry.png (base geometry only).

Example full-color 360° orbit video (blender-render-toolkit/output/an_avocado_2_avocado_example_full_color_rotate.mp4). See the matching _albedo_rotate.mp4 and _normal_rotate.mp4 in the same directory for the other two modalities.
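
To pick a particular view out of the 252 per-frame PNGs, the frame index can be estimated from the azimuth. The sketch below assumes the frames are evenly spaced over 360°, that frame 0 sits at azimuth 0, and that the last frame does not repeat the first; none of these is confirmed above, and the function names are hypothetical.

```python
def blender_frame_azimuths(n_frames=252):
    """Azimuth in degrees of each frame, assuming evenly spaced frames."""
    step = 360.0 / n_frames
    return [i * step for i in range(n_frames)]


def closest_frame(azimuth_deg, n_frames=252):
    """Index of the frame whose azimuth is closest to the requested angle."""
    step = 360.0 / n_frames
    return round((azimuth_deg % 360.0) / step) % n_frames


if __name__ == "__main__":
    print(closest_frame(180))  # back view under the assumptions above
```

Under these assumptions the azimuth step is 360 / 252, roughly 1.43° per frame, so the Blender orbit is slightly coarser than the 1° step used by the nvdiffrast visualization.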

Multi-part texture generation

Training

bash scripts/multi_part_texture_generation.sh -train

Multi-part texture generation takes about 15 minutes per mesh on a single A6000 GPU.

Outputs go to logs/<save_path>/ with the same layout as single-texture training (see above), plus:

- _label_map.png: learned per-part segmentation (red = partA, green = partB), used at render time to mask the two tactile textures.
- _rendered_labels_list.mp4, _rendered_labels_patch_list.mp4: per-iter label field evolution for full and patch views.
- _rendered_target_perturb_normal2s_list.mp4 / _rendered_guidance_perturb_normal2s_list.mp4: partB's target tactile normal and TextureDreambooth-refined guidance (partA uses the _perturb_normals videos, as in the single-texture case).
- _concat_patch_masks.mp4: side-by-side (partA mask | partB mask | target albedo patch) across training iters, to inspect the DiffSeg segmentation.

Visualization (nvdiffrast)

bash scripts/multi_part_texture_generation.sh

Output layout is the same as single-texture visualization (orbit .mp4 plus front/back PNGs per mode). The multi-part script renders additional modes: tangent, normal, uv, and label_map (the per-part segmentation, red = partA, green = partB).
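
Because the label map encodes partA in red and partB in green, it can be split into two masks for downstream use. The following is a minimal sketch, not the repository's own code: it works on plain lists of RGB tuples, the function names are hypothetical, and it assumes a pixel belongs to whichever of the red or green channel is larger.

```python
def split_label_map(pixels):
    """Split an RGB label map into boolean masks for partA (red) and partB (green).

    `pixels` is a list of rows, each row a list of (r, g, b) tuples.
    Pixels where red equals green (e.g. black background) fall in neither mask.
    """
    mask_a = [[r > g for (r, g, _b) in row] for row in pixels]
    mask_b = [[g > r for (r, g, _b) in row] for row in pixels]
    return mask_a, mask_b


def part_fractions(pixels):
    """Fraction of all pixels assigned to partA and to partB."""
    mask_a, mask_b = split_label_map(pixels)
    total = sum(len(row) for row in pixels)
    n_a = sum(sum(row) for row in mask_a)
    n_b = sum(sum(row) for row in mask_b)
    return n_a / total, n_b / total


if __name__ == "__main__":
    demo = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 0), (200, 30, 0)]]
    print(part_fractions(demo))
```

If Pillow is installed (an assumption about your environment), the pixel rows for _label_map.png can be read with `Image.open(path).convert("RGB")` and `getpixel((x, y))`.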

Citation

If you find this repository useful for your research, please cite the following work.

@inproceedings{gao2024exploiting,
      title     = {Tactile DreamFusion: Exploiting Tactile Sensing for 3D Generation},
      author    = {Gao, Ruihan and Deng, Kangle and Yang, Gengshan and Yuan, Wenzhen and Zhu, Jun-Yan},
      booktitle = {Conference on Neural Information Processing Systems (NeurIPS)},
      year      = {2024},
}

Acknowledgements

We thank Sheng-Yu Wang, Nupur Kumari, Gaurav Parmar, Hung-Jui Huang, and Maxwell Jones for their helpful comments and discussion. We are also grateful to Arpit Agrawal and Sean Liu for proofreading the draft. Kangle Deng is supported by the Microsoft Research Ph.D. Fellowship. Ruihan Gao is supported by the A*STAR National Science Scholarship (Ph.D.).

Part of this codebase borrows from DreamGaussian and DiffSeg.
