Code for the paper "DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training" (ICML 2024). It pre-trains neural operator transformers (from 7M to 1B parameters) on multiple PDE datasets. Pre-trained weights can be found at https://huggingface.co/hzk17/DPOT.
Our pre-trained DPOT achieves state-of-the-art performance on multiple PDE datasets and can be fine-tuned for different types of downstream PDE problems.
We provide five pre-trained checkpoints of different sizes:
| Size | Attention dim | MLP dim | Layers | Heads | Model size |
|---|---|---|---|---|---|
| Tiny | 512 | 512 | 4 | 4 | 7M |
| Small | 1024 | 1024 | 6 | 8 | 30M |
| Medium | 1024 | 4096 | 12 | 8 | 122M |
| Large | 1536 | 6144 | 24 | 16 | 509M |
| Huge | 2048 | 8092 | 27 | 8 | 1.03B |
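The checkpoints can be downloaded directly from the Hugging Face repository linked above. Below is a minimal sketch using the huggingface_hub client; the filename model_Ti.pth mirrors the loading example that follows, but the exact filenames hosted in the repository should be verified on the Hugging Face page.

```python
# Minimal sketch: download a checkpoint from https://huggingface.co/hzk17/DPOT.
# The filename 'model_Ti.pth' is assumed from the loading example below;
# check the repository page for the actual filenames.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(repo_id="hzk17/DPOT", filename="model_Ti.pth")
print(ckpt_path)  # local path to the cached checkpoint file
```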
Here is example code for loading a pre-trained model (the Tiny configuration):
import torch
from models.dpot import DPOTNet

model = DPOTNet(img_size=128, patch_size=8, mixing_type='afno', in_channels=4, in_timesteps=10, out_timesteps=1, out_channels=4, normalize=False, embed_dim=512, modes=32, depth=4, n_blocks=4, mlp_ratio=1, out_layer_dim=32, n_cls=12)
model.load_state_dict(torch.load('model_Ti.pth')['model'])

All datasets are stored in HDF5 format and contain a data field. Some datasets are stored as individual HDF5 files, while others are stored within a single HDF5 file.
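A preprocessed dataset file can be inspected quickly with h5py; the sketch below is illustrative only, and the file name is a placeholder.

```python
# Minimal sketch: inspect an HDF5 dataset file (the path below is a placeholder).
import h5py

with h5py.File("data/ns2d_fno_1e-5.hdf5", "r") as f:
    print(list(f.keys()))                    # expect a 'data' field, as described above
    print(f["data"].shape, f["data"].dtype)  # array of solution snapshots
```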
In data_generation/preprocess.py, we have the script for preprocessing the datasets from each source. Download the original files from the sources below and preprocess them into the /data folder.
| Dataset | Link |
|---|---|
| FNO data | Here |
| PDEBench data | Here |
| PDEArena data | Here |
| CFDbench data | Here |
In utils/make_master_file.py, we have all dataset configurations. When new datasets are merged, you should add a configuration dict. It stores all relative paths so that you can run the code from any location.
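For illustration, a new entry might look like the hypothetical dict below; the field names are not taken from the repository and only demonstrate the idea of storing relative paths plus basic metadata. Follow the existing entries in utils/make_master_file.py for the real schema.

```python
# Hypothetical dataset configuration entry; field names are illustrative only.
DATASET_DICT = {
    'ns2d_fno_1e-5': {
        'path': 'data/ns2d_fno_1e-5.hdf5',  # relative path, so the code runs from any location
        'train_size': 1000,                 # placeholder split sizes
        'test_size': 200,
        'resolution': 64,                   # placeholder spatial resolution
        'in_channels': 1,                   # placeholder number of physical fields
    },
}
```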
Create the data folder:

mkdir data

Now we have a single-GPU pre-training script, train_temporal.py. You can start a training run with

python train_temporal.py --model DPOT --train_paths ns2d_fno_1e-5 --test_paths ns2d_fno_1e-5 --gpu 0
Alternatively, you can write a configuration file such as configs/ns2d.yaml and start training, automatically using free GPUs, with

python trainer.py --config_file ns2d.yaml

or

python parallel_trainer.py --config_file ns2d_parallel.yaml

We use YAML as the configuration format. You can specify parameters for the args directly. If you want to run multiple tasks, move those parameters into the tasks field, for example:
model: DPOT
width: 512
tasks:
  lr: [0.001, 0.0001]
  batch_size: [256, 32]

This means that you start 2 tasks if you submit this configuration to trainer.py.
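The sketch below illustrates one way such a configuration could expand into per-run argument sets, assuming the lists under tasks are matched position-wise (which is consistent with the two tasks described above); the actual logic in trainer.py may differ.

```python
# Illustrative only: expand a config with a `tasks` field into per-run argument
# dicts, assuming list entries under `tasks` are zipped position-wise.
import yaml

cfg = yaml.safe_load("""
model: DPOT
width: 512
tasks:
  lr: [0.001, 0.0001]
  batch_size: [256, 32]
""")

base = {k: v for k, v in cfg.items() if k != 'tasks'}
tasks = cfg.get('tasks', {})
n_runs = len(next(iter(tasks.values()))) if tasks else 1

runs = [{**base, **{k: v[i] for k, v in tasks.items()}} for i in range(n_runs)]
print(runs)
# [{'model': 'DPOT', 'width': 512, 'lr': 0.001, 'batch_size': 256},
#  {'model': 'DPOT', 'width': 512, 'lr': 0.0001, 'batch_size': 32}]
```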
Install the following packages via conda:
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install matplotlib scikit-learn scipy pandas h5py -c conda-forge
conda install timm einops tensorboard -c conda-forge

The repository is organized as follows:

- README.md
- train_temporal.py: main code for single-GPU pre-training of the auto-regressive model
- trainer.py: framework for automatically scheduling training tasks for parameter tuning
- utils/
  - criterion.py: loss functions of relative error (see the sketch after this list)
  - griddataset.py: dataset for the mixture of temporal uniform-grid datasets
  - make_master_file.py: datasets config file
  - normalizer: normalization methods (#TODO: implement instance reversible norm)
  - optimizer: Adam/AdamW/Lamb optimizers supporting complex numbers
  - utilities.py: other auxiliary functions
- configs/: configuration files for pre-training or fine-tuning
- models/
  - dpot.py: DPOT model
  - fno.py: FNO with group normalization
  - mlp.py
- data_generation/: code for preprocessing data (ask hzk if you want to use them)
  - darcy/
  - ns2d/
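For reference, the relative error commonly used as a PDE training loss is the per-sample relative L2 norm; the snippet below is a generic sketch and is not guaranteed to match utils/criterion.py exactly.

```python
# Generic sketch of a relative L2 error; not necessarily identical to the
# implementation in utils/criterion.py.
import torch

def rel_l2_error(pred: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Mean over the batch of ||pred - target||_2 / ||target||_2."""
    batch = pred.shape[0]
    diff = (pred - target).reshape(batch, -1).norm(dim=1)
    denom = target.reshape(batch, -1).norm(dim=1).clamp_min(eps)
    return (diff / denom).mean()

# Usage: compare a noisy prediction against a reference field.
target = torch.randn(4, 64, 64, 1)
pred = target + 0.01 * torch.randn_like(target)
print(rel_l2_error(pred, target))
```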
If you use DPOT in your research, please use the following BibTeX entry.
@article{hao2024dpot,
title={DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training},
author={Hao, Zhongkai and Su, Chang and Liu, Songming and Berner, Julius and Ying, Chengyang and Su, Hang and Anandkumar, Anima and Song, Jian and Zhu, Jun},
journal={arXiv preprint arXiv:2403.03542},
year={2024}
}

