To train the latent action autoencoder (suppose you train on 1 node with 8 GPUs):

```shell
cd lam
bash train.sh
```

Note

The checkpoint of our latent action autoencoder can be found at Hugging Face.
- Download the pretrained Stable Video Diffusion checkpoint `svd.safetensors` from Hugging Face.
- Reset `default_ckpt` in `worldmodel/train.py` to the path of `svd.safetensors`.
- Reset `ckpt_path` in `worldmodel/configs/training/adaworld.yaml` to the path of the last checkpoint of the latent action autoencoder.
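The config edits above can also be scripted. The following is a minimal sketch of patching `ckpt_path`, assuming PyYAML is available; the inline config text and the checkpoint path are placeholders, not the repository's actual values:

```python
# Hypothetical sketch: point ckpt_path in worldmodel/configs/training/adaworld.yaml
# at the latest latent action autoencoder checkpoint.
import yaml  # PyYAML

cfg_text = "ckpt_path: null\nmax_epochs: 100\n"  # stand-in for adaworld.yaml
cfg = yaml.safe_load(cfg_text)
cfg["ckpt_path"] = "lam/logs/checkpoints/last.ckpt"  # assumed checkpoint location
patched = yaml.safe_dump(cfg)
print(patched)
```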
To pretrain the autoregressive world model (suppose you train on 1 node with 8 GPUs):

```shell
cd worldmodel
bash run_train.sh
```

After training:
- Convert the DeepSpeed checkpoints (their number depends on how many GPUs you used) to `pytorch_model.bin` using `zero_to_fp32.py`.
- Convert `pytorch_model.bin` to safetensors format using `worldmodel/bin_to_st.py`, then run inference.
Note
The pretrained AdaWorld can be found at Hugging Face.
Tip

Remember to modify `num_nodes` and `devices` in `lam/config/lam.yaml` accordingly if you have a different GPU setup.

Remember to set `max_epochs`, or stop training manually once you judge it has run long enough.
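For example, on 2 nodes with 4 GPUs each, the relevant part of `lam/config/lam.yaml` might look roughly like this (the surrounding field layout is an assumption based on a typical PyTorch Lightning trainer config; only `num_nodes`, `devices`, and `max_epochs` come from the tip above):

```yaml
trainer:
  num_nodes: 2    # number of machines
  devices: 4      # GPUs per machine
  max_epochs: 100 # or stop the run manually
```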
<= Previous: [Installation]
=> Next: [Action Transfer]