This repository contains training and evaluation code for 3D human mesh estimation from mmWave data.
Overview of M4Human, the current largest multimodal dataset for high-fidelity mmWave radar-based human motion sensing. It covers diverse free-space motions (e.g., rehabilitation, exercise, and sports) beyond simple in-place actions, with high-quality marker-based motion annotations. Such diversity supports a broad range of human sensing tasks, including tracking, human mesh recovery, action recognition, and human motion generation, as well as privacy-preserving applications in elderly care, rehabilitation, robotics, and VR gaming.
M4Human is released! It is the largest-scale multimodal mmWave human mesh benchmark; code and dataset will be available after paper publication.
- Supported modalities in current code: `radar_points` (Radar Point Cloud, RPC), `rawImage_XYZ` (Radar Tensor, RT)
- Supported model names in current code: `P4Transformer` (RPC), `RT-Mesh` (RT), `RETR` (RT)
- Dataset split configuration is read through `dataset/dataset_config_clean.py`
- Distributed training and evaluation are supported via `torchrun` (click and run)
Human mesh reconstruction (HMR) provides direct insights into body-environment interaction, which enables various immersive applications. While existing large-scale HMR datasets rely heavily on line-of-sight RGB input, vision-based sensing is limited by occlusion, lighting variation, and privacy concerns. To overcome these limitations, recent efforts have explored radio-frequency (RF) mmWave radar for privacy-preserving indoor human sensing. However, current radar datasets are constrained by sparse skeleton labels, limited scale, and simple in-place actions.
To advance the HMR research community, we introduce M4Human, the current largest-scale multimodal benchmark (661K frames, nine times the previous largest), featuring high-resolution mmWave radar, RGB, and depth data. M4Human provides both raw radar tensors (RT) and processed radar point clouds (RPC) to enable research across different levels of RF signal granularity. M4Human includes high-quality motion capture (MoCap) annotations with 3D meshes and global trajectories, and spans 20 subjects and 50 diverse actions, including in-place, sit-in-place, and free-space sports or rehabilitation movements. We establish benchmarks on both RT and RPC modalities, as well as multimodal fusion with RGB-D modalities. Extensive results highlight the significance of M4Human for radar-based human modeling while revealing persistent challenges under fast, unconstrained motion. The dataset and code will be released after paper publication.
The training loop predicts SMPL-X parameters (root center, root orientation, body shape, body pose, gender) from a sequence of T=4 radar frames (RPC or RT).
- Input temporal length is controlled by `temporal_window` in `dataset/m4human_dataset.py`.
- Supervision is currently single-frame SMPL-X parameters.
- Evaluation aggregates metrics over 50 actions.
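The flat prediction can be thought of as a vector sliced into named SMPL-X groups. The sketch below is illustrative only: the group sizes follow the standard SMPL-X convention (3 translation, 3 axis-angle root orientation, 10 shape betas, 63 body-pose values for 21 joints, plus a gender logit), but the actual head layout in this repository may differ.

```python
import numpy as np

# Illustrative split of a flat prediction into SMPL-X parameter groups.
# Group sizes follow the standard SMPL-X layout; the repository's actual
# prediction head may order or size them differently.
GROUPS = {
    "root_center": 3,   # global translation (x, y, z)
    "root_orient": 3,   # axis-angle root orientation
    "betas": 10,        # body shape coefficients
    "body_pose": 63,    # 21 body joints x 3 axis-angle values
    "gender": 1,        # gender logit
}

def split_smplx(pred: np.ndarray) -> dict:
    """Slice a flat [D] prediction vector into named SMPL-X groups."""
    out, i = {}, 0
    for name, size in GROUPS.items():
        out[name] = pred[i:i + size]
        i += size
    assert i == pred.shape[0], "prediction dim mismatch"
    return out

params = split_smplx(np.zeros(sum(GROUPS.values())))
```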
Overview of the proposed RT-Mesh baseline. Given a 3D radar tensor (RT), RT-Mesh first reshapes it into a 2D BEV representation. A lightweight 2D BEV Transformer, combining 2D convolution and self-attention, performs efficient 2D human localization.
M4Human-main/
|-- config.yaml # Dataset Benchmark config *
|-- main1_multigpu_clean.py # Main code *
|-- dataset/ # Dataset Setup code
| |-- m4human_dataset.py
| |-- m4human_utils.py
| |-- dataset_config_clean.py
| |-- lmdb_utils.py
|-- mmwave_models/ # Models
| |-- Point_models/
| |-- Tensor_models/
| |-- RTmesh/
| |-- retr_models/
|-- sources/
| |-- Train_and_model_loss.py # Loss Definition
| |-- evaluation_module_pc_multigpu.py # Evaluation utilities
| |-- Train_and_model_plotting_3D_mesh.py # Plotting functions
|-- models/smplx/
|-- experiments/
We test our code in the following environment:
Ubuntu 20.04
Python 3.9
PyTorch 2.2.0
CUDA 11.8
Create the environment using:
conda env create -f environment.yml

Our point-based method requires CUDA-accelerated PointNet++. Follow the setup instructions in the P4Transformer dependency: https://github.com/erikwijmans/Pointnet2_PyTorch. RT-based models do not require this CUDA setup.
- Download SMPL-X models from the official source: https://smpl-x.is.tue.mpg.de/.
- Place them under `models/smplx/`.
M4Human-main/
models/
smplx/
SMPLX_FEMALE.npz
SMPLX_FEMALE.pkl
SMPLX_MALE.npz
SMPLX_MALE.pkl
SMPLX_NEUTRAL.npz
SMPLX_NEUTRAL.pkl
smplx_npz.zip
version.txt
Current code reads the `.npz` files from `config.yaml` -> `paths.smplx`.
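As a small illustration of the expected layout, the helper below maps a gender label to the `.npz` filename shown in the tree above. It is a sketch of the naming convention only; the repository's loader may resolve these paths differently.

```python
from pathlib import Path

# Map a gender label to the SMPL-X .npz file expected under paths.smplx.
# Filenames follow the directory layout shown above; this helper is
# illustrative and is not part of the repository's code.
def smplx_npz_path(smplx_root: str, gender: str) -> Path:
    gender = gender.upper()
    assert gender in {"MALE", "FEMALE", "NEUTRAL"}, f"unknown gender: {gender}"
    return Path(smplx_root) / f"SMPLX_{gender}.npz"

path = smplx_npz_path("models/smplx", "neutral")
```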
- Full raw dataset (around 2 TB of zip files, partially uploaded): Here
- Full processed dataset (radar modality; recommended, around 50 GB of LMDB files): Here
After downloading the processed dataset, organize the folders as follows (recommended outside the repo):
M4Human-main/ # Main repo folder
mmDataset/ # (Full processed dataset)
MR-Mesh/
rf3dpose_all/
calib.lmdb
radar_comp.lmdb # (RT)
radar_comp.lmdb-lock
radar_pc.lmdb # (RPC)
radar_pc.lmdb-lock
params.lmdb # (GT params)
indeces.pkl.gz # (dataset split configuration)
... (other .lmdb and lmdb-lock files)
Training cache path is configured by config.yaml:
paths:
  cached_root: '../mmDataset/MR-Mesh/'

Expected LMDB set in the loader:
<cached_root>/rf3dpose_all/
|-- radar_comp.lmdb
|-- radar_pc.lmdb
|-- params.lmdb
|-- calib.lmdb
|-- indicator.lmdb
|-- image.lmdb # not loaded by default due to the large image modality size; set use_image=True to load the image modality.
|-- indeces.pkl.gz
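By its extension, the split file (`indeces.pkl.gz`) is a gzip-compressed pickle. Its actual contents are defined by `dataset/dataset_config_clean.py`; the dictionary below is a hypothetical stand-in used only to show the read/write pattern.

```python
import gzip
import pickle
import tempfile
from pathlib import Path

# Hypothetical split structure for illustration only; the real file's
# schema is defined by dataset/dataset_config_clean.py.
split = {"train": [(1, 1, 0), (1, 1, 1)], "test": [(2, 5, 0)]}

path = Path(tempfile.mkdtemp()) / "indeces.pkl.gz"
with gzip.open(path, "wb") as f:
    pickle.dump(split, f)  # write: gzip stream wrapping a pickle

with gzip.open(path, "rb") as f:
    loaded = pickle.load(f)  # read back the same object

assert loaded == split
```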
Benchmark behavior is controlled by config.yaml.
| Key | Meaning | Typical Values |
|---|---|---|
| `model.name` | Model selector | `P4Transformer`, `RT-Mesh`, `RETR` |
| `model.modality` | Input modality key | `radar_points`, `rawImage_XYZ` |
| `train.batch_size` | Per-process batch size | e.g. 64 |
| `train.lr` | Adam LR | e.g. 2e-4 |
| `train.loss_weights.*` | Weighted terms in `combined_loss` | float |
| `eval.test_mode` | Evaluation-only mode | true/false |
| `eval.plot_gif` | Save GIF during eval | true/false |
| `dataset.protocol` | Ratio protocol | p1, p2, p3 |
| `dataset.split` | Split strategy | s1, s2, s3 |
Valid model-modality combinations:
- `radar_points` + `P4Transformer`
- `rawImage_XYZ` + `RT-Mesh`
- `rawImage_XYZ` + `RETR`
Per sample from dataset/m4human_dataset.py:
- `rawImage_XYZ`: temporal radar tensor sequence
- `radar_points`: temporal padded radar point cloud
- `parameter`: SMPL-X parameters in the radar frame
- `vertices`: generated mesh vertices
- `joints_root`: root and selected joints
- `bbbox`: AABB from params
- `projected_vertices`: 2D projected vertices
- `indicator`: `[sub_id, act_id, frame_id]`
- `calibration`: calibration dict
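A sketch of this per-sample contract, with hypothetical sizes (T = temporal window, N = padded point count; 10475 is the standard SMPL-X vertex count). The real shapes are set by the dataset loader, not by this sketch.

```python
import numpy as np

# Hypothetical sizes for illustration only; the loader defines the real ones.
T, N, V = 4, 256, 10475  # frames, padded points per frame, SMPL-X vertices

sample = {
    "radar_points": np.zeros((T, N, 3), dtype=np.float32),  # padded RPC
    "parameter": np.zeros(80, dtype=np.float32),            # SMPL-X params (size assumed)
    "vertices": np.zeros((V, 3), dtype=np.float32),         # mesh vertices
    "indicator": np.array([1, 3, 120]),                     # [sub_id, act_id, frame_id]
}

def check_sample(s: dict) -> None:
    """Minimal sanity checks on a loaded sample."""
    assert s["radar_points"].ndim == 3 and s["radar_points"].shape[-1] == 3
    assert s["vertices"].shape[-1] == 3
    assert s["indicator"].shape == (3,)

check_sample(sample)
```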
Notes:
- Protocol is different from (IP, SIP, NIP) in the main paper; it controls what proportion of the dataset is used. For example, choose 'p3' to use 25% of subjects for fast training and evaluation (see dataset size vs. performance in Fig. 5 of the main paper). (IP, SIP, NIP) results are reported directly in the results table.
- The point modality applies a z-offset normalization (-1.5) in the dataset loader.
- Input uses temporal context (`temporal_window` in `dataset/m4human_dataset.py`), while supervision is single-frame.
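The z-offset and padding steps can be sketched as follows. This is a minimal sketch, assuming the offset is added to the z coordinate and that short frames are zero-padded to a fixed point count; the pad size and exact sign convention are set by the dataset loader.

```python
import numpy as np

Z_OFFSET = -1.5  # z-offset normalization for the point modality (sign assumed)

def normalize_and_pad(points: np.ndarray, n: int = 256) -> np.ndarray:
    """Apply the z-offset, then zero-pad (or truncate) to a fixed n points.

    `n` is a hypothetical pad size; the loader fixes its own value.
    """
    pts = points.copy()
    pts[:, 2] += Z_OFFSET                      # shift z toward the radar frame origin
    out = np.zeros((n, pts.shape[1]), dtype=pts.dtype)
    k = min(len(pts), n)
    out[:k] = pts[:k]                          # zero-padding beyond k points
    return out

frame = np.array([[0.1, 0.2, 1.5], [0.3, 0.1, 1.7]], dtype=np.float32)
padded = normalize_and_pad(frame)
```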
To train the benchmark (modify the GPU count, currently 4):

torchrun --nproc_per_node=4 main1_multigpu_clean.py

To test a pretrained model, set in config.yaml:
eval:
  test_mode: true
  test_model_path: './experiments/exp_xxx/best_model_epoch_x.pth' # final best one

Then run again:

torchrun --nproc_per_node=4 main1_multigpu_clean.py

Outputs are saved to experiments/exp_YYYYMMDD_HHMMSS/:

- train_test.log
- results.csv
- results_best.csv
- best_model_epoch*.pth
- test_epoch_*/
Our benchmark reports:
- Mean vertex error (MVE)
- Mean joint localization error (MJE)
- Mean joint rotation error in degrees (MRE)
- Mean mesh localization error (TE)
- Mean gender accuracy (Gender Acc)
Metrics are aggregated per action.
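MVE and MJE are mean Euclidean distances over vertices and joints, respectively. Below is a minimal per-action aggregation sketch; the repository's exact metric definitions live in `sources/evaluation_module_pc_multigpu.py` and may differ in detail (e.g. alignment, units).

```python
import numpy as np
from collections import defaultdict

def mean_euclidean_error(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean per-point Euclidean distance, e.g. MVE over vertices or MJE over joints."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Collect one scalar per (action, sample), then average within each action.
per_action = defaultdict(list)
for act_id, pred, gt in [
    (0, np.zeros((22, 3)), np.zeros((22, 3))),  # perfect prediction -> error 0
    (0, np.ones((22, 3)), np.zeros((22, 3))),   # error sqrt(3) per joint
]:
    per_action[act_id].append(mean_euclidean_error(pred, gt))

summary = {a: float(np.mean(v)) for a, v in per_action.items()}
```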
If this project helps your research, please cite our paper:
@article{fan2025m4human,
title={M4Human: A Large-Scale Multimodal mmWave Radar Benchmark for Human Mesh Reconstruction},
author={Fan, Junqiao and Zhou, Yunjiao and Yang, Yizhuo and Cui, Xinyuan and Zhang, Jiarui and Xie, Lihua and Yang, Jianfei and Lu, Chris Xiaoxuan and Ding, Fangqiang},
journal={arXiv preprint arXiv:2512.12378},
year={2025}
}

- PyTorch ecosystem
- Related mmWave and human mesh estimation projects

