This is the official PyTorch implementation of our work: "Dual-Recommendation Disentanglement Network for View Fuzz in Action Recognition".
In this paper, we define a new problem for multi-view action recognition and present a novel approach to address it. We assess the performance of our method and previous state-of-the-art methods on the N-UCLA and NTU-RGB+D datasets, and provide additional experimental analysis on the IXMAS dataset.
This repository uses the following libraries:
- Python >= 3.6
- Numpy
- PyTorch >= 1.3
- fvcore: `pip install 'git+https://github.com/facebookresearch/fvcore'`
- torchvision that matches the PyTorch installation. You can install them together at pytorch.org to make sure of this.
- simplejson: `pip install simplejson`
- GCC >= 4.9
- PyAV: `conda install av -c conda-forge`
- ffmpeg (4.0 is preferred, will be installed along with PyAV)
- PyYaml: (will be installed along with fvcore)
- tqdm: (will be installed along with fvcore)
- iopath: `pip install -U iopath` or `conda install -c iopath iopath`
- psutil: `pip install psutil`
- OpenCV: `pip install opencv-python`
- torchvision: `pip install torchvision` or `conda install torchvision -c pytorch`
- tensorboard: `pip install tensorboard`
- moviepy: (optional, for visualizing video on tensorboard) `conda install -c conda-forge moviepy` or `pip install moviepy`
- PyTorchVideo: `pip install pytorchvideo`
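For convenience, the steps above can be collected into a single environment setup. The following is a minimal sketch assuming a fresh conda environment (the environment name `drdn` and the Python version are placeholders; any Python >= 3.6 satisfies the requirement):

```
# Create and activate a clean environment (name is arbitrary)
conda create -n drdn python=3.8 -y
conda activate drdn

# Install PyTorch and a matching torchvision together (see pytorch.org)
conda install pytorch torchvision -c pytorch -y

# PyAV (also pulls in ffmpeg) and the optional moviepy
conda install av -c conda-forge -y
conda install moviepy -c conda-forge -y

# fvcore (also installs PyYaml and tqdm), plus the remaining packages
pip install 'git+https://github.com/facebookresearch/fvcore'
pip install simplejson iopath psutil opencv-python tensorboard pytorchvideo
```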
The entry point is `tools/run_net.py`, which starts the training or testing procedure. To run it, simply use the following command:
python tools/run_net.py --cfg <cfg_path> DATA_LOADER.NUM_WORKERS 0 NUM_GPUS <GPU_NUM> TRAIN.BATCH_SIZE <BATCH_SIZE_NUM> SOLVER.BASE_LR <LR_NUM> SOLVER.MAX_EPOCH <EPOCH_NUM> SOLVER.WEIGHT_DECAY <WEIGHT_DECAY_NUM> SOLVER.WARMUP_EPOCHS 0.0 DATA.PATH_TO_DATA_DIR <DATA_PATH>
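For example, a hypothetical invocation with all placeholders filled in (the hyperparameter values below are illustrative, not the ones used in the paper) might be:

```
python tools/run_net.py --cfg configs/Kinetics/C2D_8x8_R50.yaml \
  DATA_LOADER.NUM_WORKERS 0 \
  NUM_GPUS 2 \
  TRAIN.BATCH_SIZE 16 \
  SOLVER.BASE_LR 0.01 \
  SOLVER.MAX_EPOCH 100 \
  SOLVER.WEIGHT_DECAY 1e-4 \
  SOLVER.WARMUP_EPOCHS 0.0 \
  DATA.PATH_TO_DATA_DIR path_to_your_dataset
```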
By default, a pretrained backbone is used; it is searched for in the `pretrained` folder of the project. As stated in the paper, we use the pretrained model released for Kinetics, which can be found here: link.
You can start by training a simple C2D model by running:
python tools/run_net.py \
--cfg configs/Kinetics/C2D_8x8_R50.yaml \
DATA.PATH_TO_DATA_DIR path_to_your_dataset \
NUM_GPUS 2 \
TRAIN.BATCH_SIZE 16
You may need to pass the location of your dataset on the command line by adding `DATA.PATH_TO_DATA_DIR path_to_your_dataset`, or you can simply add
DATA:
PATH_TO_DATA_DIR: path_to_your_dataset
to the YAML config file, so you do not need to pass it on the command line every time.
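With the path stored in the config, the training command above shortens to, for instance:

```
python tools/run_net.py \
  --cfg configs/Kinetics/C2D_8x8_R50.yaml \
  NUM_GPUS 2 \
  TRAIN.BATCH_SIZE 16
```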
You may also want to add:
DATA_LOADER.NUM_WORKERS 0 \
NUM_GPUS 2 \
TRAIN.BATCH_SIZE 16 \
If you want to launch a quick job for debugging on your local machine.
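Put together, a quick local debugging job (the dataset path is a placeholder) might look like:

```
python tools/run_net.py \
  --cfg configs/Kinetics/C2D_8x8_R50.yaml \
  DATA.PATH_TO_DATA_DIR path_to_your_dataset \
  DATA_LOADER.NUM_WORKERS 0 \
  NUM_GPUS 2 \
  TRAIN.BATCH_SIZE 16
```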
If your checkpoint is trained by PyTorch, you can add the following option on the command line, or add it to the YAML config:
TRAIN.CHECKPOINT_FILE_PATH path_to_your_PyTorch_checkpoint
If the checkpoint is trained by Caffe2, then you can do the following:
TRAIN.CHECKPOINT_FILE_PATH path_to_your_Caffe2_checkpoint \
TRAIN.CHECKPOINT_TYPE caffe2
If you need to perform inflation on the checkpoint, remember to set `TRAIN.CHECKPOINT_INFLATE` to True.
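For example, a hypothetical command that trains from a Caffe2 checkpoint with inflation enabled (the checkpoint path is a placeholder) could be:

```
python tools/run_net.py \
  --cfg configs/Kinetics/C2D_8x8_R50.yaml \
  TRAIN.CHECKPOINT_FILE_PATH path_to_your_Caffe2_checkpoint \
  TRAIN.CHECKPOINT_TYPE caffe2 \
  TRAIN.CHECKPOINT_INFLATE True
```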
We provide TRAIN.ENABLE and TEST.ENABLE to control whether training or testing is required for the current job. If only testing is preferred, you can set TRAIN.ENABLE to False, and do not forget to pass the path of the model you want to test to TEST.CHECKPOINT_FILE_PATH:
python tools/run_net.py \
--cfg configs/Kinetics/C2D_8x8_R50.yaml \
DATA.PATH_TO_DATA_DIR path_to_your_dataset \
TEST.CHECKPOINT_FILE_PATH path_to_your_checkpoint \
TRAIN.ENABLE False
Alternatively, you can simply point the script at the config file of a pretrained model:
python tools/run_net.py --cfg path/to/<pretrained_model_config_file>.yaml
DRDN is written and maintained by Wenxuan Liu, Xian Zhong, Zhuo Zhou, Kui Jiang, Zheng Wang, Chia-Wen Lin.
If you find DRDN useful in your research, please use the following BibTeX entry for citation.
@article{DBLP:journals/tip/LiuZZJWL23,
author = {Wenxuan Liu and
Xian Zhong and
Zhuo Zhou and
Kui Jiang and
Zheng Wang and
Chia{-}Wen Lin},
title = {Dual-Recommendation Disentanglement Network for View Fuzz in Action
Recognition},
journal = {{IEEE} Trans. Image Process.},
volume = {32},
pages = {2719--2733},
year = {2023},
}