Action Localization Models

BMN

Introduction

@inproceedings{lin2019bmn,
  title={Bmn: Boundary-matching network for temporal action proposal generation},
  author={Lin, Tianwei and Liu, Xiao and Li, Xin and Ding, Errui and Wen, Shilei},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={3889--3898},
  year={2019}
}
@article{zhao2017cuhk,
  title={Cuhk \& ethz \& siat submission to activitynet challenge 2017},
  author={Zhao, Y and Zhang, B and Wu, Z and Yang, S and Zhou, L and Yan, S and Wang, L and Xiong, Y and Lin, D and Qiao, Y and others},
  journal={arXiv preprint arXiv:1710.08011},
  volume={8},
  year={2017}
}

Model Zoo

ActivityNet feature

| config | feature | gpus | AR@100 | AUC | AP@0.5 | AP@0.75 | AP@0.95 | mAP | gpu_mem(M) | iter time(s) | ckpt | log | json |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| bmn_400x100_2x8_9e_activitynet_feature | cuhk_mean_100 | 2 | 75.28 | 67.22 | 42.47 | 31.31 | 9.92 | 30.34 | 5420 | 3.27 | ckpt | log | json |
| | mmaction_video | 2 | 75.43 | 67.22 | 42.62 | 31.56 | 10.86 | 30.77 | 5420 | 3.27 | ckpt | log | json |
| | mmaction_clip | 2 | 75.35 | 67.38 | 43.08 | 32.19 | 10.73 | 31.15 | 5420 | 3.27 | ckpt | log | json |
| BMN-official (for reference)* | cuhk_mean_100 | - | 75.27 | 67.49 | 42.22 | 30.98 | 9.22 | 30.00 | - | - | - | - | - |

Notes:

  1. The gpus column gives the number of GPUs used to obtain the checkpoint. Following the Linear Scaling Rule, set the learning rate proportional to the total batch size if you use a different number of GPUs or videos per GPU, e.g., lr=0.01 for 4 GPUs x 2 videos/GPU and lr=0.08 for 16 GPUs x 4 videos/GPU.

  2. In the feature column, cuhk_mean_100 denotes the widely used CUHK ActivityNet feature extracted by anet2016-cuhk; mmaction_video and mmaction_clip denote features extracted by MMAction with a video-level or clip-level ActivityNet-finetuned model, respectively.

  3. We evaluate the action detection performance of BMN by using the anet_cuhk_2017 submission to the ActivityNet2017 Untrimmed Video Classification Track to assign a label to each action proposal.

*We train BMN with the official repo and evaluate its proposal generation and action detection performance, using anet_cuhk_2017 for label assignment.

For details on data preparation, see the ActivityNet feature section of Data Preparation.
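
The Linear Scaling Rule from note 1 can be sketched in a few lines. This is only an illustration: the function name and its parameters are hypothetical and not part of the codebase, and the reference point (lr=0.01 at 4 GPUs x 2 videos/GPU) is taken from the example in the note.

```python
def scale_learning_rate(base_lr, base_gpus, base_videos_per_gpu,
                        gpus, videos_per_gpu):
    """Scale a learning rate linearly with the total batch size.

    Hypothetical helper: the total batch size is gpus * videos_per_gpu,
    and the learning rate grows or shrinks by the same factor.
    """
    base_batch = base_gpus * base_videos_per_gpu
    new_batch = gpus * videos_per_gpu
    return base_lr * new_batch / base_batch


# Reference setting from the note: lr=0.01 for 4 GPUs x 2 videos/GPU.
# Moving to 16 GPUs x 4 videos/GPU multiplies the batch size by 8.
lr = scale_learning_rate(0.01, 4, 2, gpus=16, videos_per_gpu=4)
print(f"{lr:.2f}")  # 0.08
```

The result would then be written into the optimizer settings of the config you train with.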

Train

You can use the following command to train a model.

python tools/train.py ${CONFIG_FILE} [optional arguments]

Example: train the BMN model on the ActivityNet feature dataset.

python tools/train.py configs/localization/bmn/bmn_400x100_2x8_9e_activitynet_feature.py

For more details and optional arguments, see the Training setting section of getting_started.

Test

You can use the following command to test a model.

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

Example: test BMN on ActivityNet feature dataset.

# Note: If you evaluate, make sure the annotation file for the test data contains ground truth.
python tools/test.py configs/localization/bmn/bmn_400x100_2x8_9e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json

You can also test the action detection performance of the model, using the anet_cuhk_2017 prediction file and the generated proposal file (results.json from the previous command).

python tools/analysis/report_map.py --proposal path/to/proposal_file
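
To make the label-assignment idea concrete, here is a minimal sketch of how class-agnostic proposals can be turned into labelled detections using video-level classification scores. It is an assumption-laden illustration, not the implementation in tools/analysis/report_map.py: the function name, the dictionary layout, the top_k parameter, and the choice to multiply proposal and class scores are all hypothetical.

```python
def assign_labels(proposals, class_scores, top_k=1):
    """Attach video-level class labels to temporal proposals.

    proposals: list of dicts with 'segment' ([start, end] in seconds)
        and 'score' (proposal confidence).
    class_scores: dict mapping class name -> video-level score.
    Assumption: every proposal inherits the top_k video-level classes,
    and the detection score is proposal score * class score.
    """
    top_classes = sorted(class_scores.items(),
                         key=lambda item: item[1], reverse=True)[:top_k]
    detections = []
    for proposal in proposals:
        for label, class_score in top_classes:
            detections.append({
                'segment': proposal['segment'],
                'label': label,
                'score': proposal['score'] * class_score,
            })
    return detections


# Made-up values for illustration only.
example_proposals = [{'segment': [0.0, 5.0], 'score': 0.9}]
example_scores = {'Long jump': 0.8, 'Triple jump': 0.2}
print(assign_labels(example_proposals, example_scores))
```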

Notes:

  1. (Optional) You can use the following command to generate a formatted proposal file, which is fed into the action classifier (currently only SSN and P-GCN are supported, not TSN, I3D, etc.) to obtain classification results for the proposals.

    python tools/data/activitynet/convert_proposal_format.py
    

For more details and optional arguments, see the Test a dataset section of getting_started.

BSN

Introduction

@inproceedings{lin2018bsn,
  title={Bsn: Boundary sensitive network for temporal action proposal generation},
  author={Lin, Tianwei and Zhao, Xu and Su, Haisheng and Wang, Chongjing and Yang, Ming},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  pages={3--19},
  year={2018}
}

Model Zoo

ActivityNet feature

| config | feature | gpus | pretrain | AR@100 | AUC | gpu_mem(M) | iter time(s) | ckpt | log | json |
|---|---|---|---|---|---|---|---|---|---|---|
| bsn_400x100_1x16_20e_activitynet_feature | cuhk_mean_100 | 1 | None | 74.66 | 66.45 | 41(TEM)+25(PEM) | 0.074(TEM)+0.036(PEM) | ckpt_tem ckpt_pem | log_tem log_pem | json_tem json_pem |
| | mmaction_video | 1 | None | 74.93 | 66.74 | 41(TEM)+25(PEM) | 0.074(TEM)+0.036(PEM) | ckpt_tem ckpt_pem | log_tem log_pem | json_tem json_pem |
| | mmaction_clip | 1 | None | 75.19 | 66.81 | 41(TEM)+25(PEM) | 0.074(TEM)+0.036(PEM) | ckpt_tem ckpt_pem | log_tem log_pem | json_tem json_pem |

Notes:

  1. The gpus column gives the number of GPUs used to obtain the checkpoint. Following the Linear Scaling Rule, set the learning rate proportional to the total batch size if you use a different number of GPUs or videos per GPU, e.g., lr=0.01 for 4 GPUs x 2 videos/GPU and lr=0.08 for 16 GPUs x 4 videos/GPU.

  2. In the feature column, cuhk_mean_100 denotes the widely used CUHK ActivityNet feature extracted by anet2016-cuhk; mmaction_video and mmaction_clip denote features extracted by MMAction with a video-level or clip-level ActivityNet-finetuned model, respectively.

For details on data preparation, see the ActivityNet feature section of Data Preparation.

Train

You can use the following commands to train a model.

python tools/train.py ${CONFIG_FILE} [optional arguments]

Examples:

  1. Train BSN (TEM) on the ActivityNet feature dataset.

    python tools/train.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py
    
  2. Train BSN (PEM) on the PGM results.

    python tools/train.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py
    

For more details and optional arguments, see the Training setting section of getting_started.

Inference

You can use the following commands to run inference with a model.

  1. For TEM Inference

    # Note: The TEM output cannot be evaluated.
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    
  2. For PGM Inference

    python tools/bsn_proposal_generation.py ${CONFIG_FILE} [--mode ${MODE}]
    
  3. For PEM Inference

    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    

Examples:

  1. Run BSN (TEM) inference with a pretrained model.

    python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth
    
  2. Run BSN (PGM) inference with a pretrained model.

    python tools/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode train
    
  3. Run BSN (PEM) inference with the evaluation metric 'AR@AN' and write the results to a file.

    # Note: If you evaluate, make sure the annotation file for the test data contains ground truth.
    python tools/test.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json
    

Test

You can use the following commands to test a model.

  1. TEM

    # Note: The TEM output cannot be evaluated.
    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    
  2. PGM

    python tools/bsn_proposal_generation.py ${CONFIG_FILE} [--mode ${MODE}]
    
  3. PEM

    python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
    

Examples:

  1. Test a TEM model on the ActivityNet dataset.

    python tools/test.py configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth
    
  2. Test a PGM model on the ActivityNet dataset.

    python tools/bsn_proposal_generation.py configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py --mode test
    
  3. Test a PEM model with the evaluation metric 'AR@AN' and write the results to a file.

    python tools/test.py configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py checkpoints/SOME_CHECKPOINT.pth --eval AR@AN --out results.json
    

Notes:

  1. (Optional) You can use the following command to generate a formatted proposal file, which is fed into the action classifier (currently only SSN and P-GCN are supported, not TSN, I3D, etc.) to obtain classification results for the proposals.

    python tools/data/activitynet/convert_proposal_format.py
    

For more details and optional arguments, see the Test a dataset section of getting_started.
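
The three BSN test stages are meant to run in sequence (TEM, then PGM, then PEM), since each stage consumes the output of the previous one. The sketch below chains the commands shown in this section. The helper names, the dry_run flag, and the two checkpoint file names are hypothetical placeholders; only the script and config paths come from the examples in this section.

```python
import subprocess

TEM_CONFIG = 'configs/localization/bsn/bsn_tem_400x100_1x16_20e_activitynet_feature.py'
PGM_CONFIG = 'configs/localization/bsn/bsn_pgm_400x100_activitynet_feature.py'
PEM_CONFIG = 'configs/localization/bsn/bsn_pem_400x100_1x16_20e_activitynet_feature.py'


def bsn_test_commands(tem_ckpt, pem_ckpt, out_file='results.json'):
    """Return the test-time commands in the order TEM -> PGM -> PEM."""
    return [
        ['python', 'tools/test.py', TEM_CONFIG, tem_ckpt],
        ['python', 'tools/bsn_proposal_generation.py', PGM_CONFIG,
         '--mode', 'test'],
        ['python', 'tools/test.py', PEM_CONFIG, pem_ckpt,
         '--eval', 'AR@AN', '--out', out_file],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for command in commands:
        print(' '.join(command))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(command, check=True)


# Placeholder checkpoint names; substitute your own files.
run_pipeline(bsn_test_commands('checkpoints/TEM_CHECKPOINT.pth',
                               'checkpoints/PEM_CHECKPOINT.pth'))
```

With dry_run=True the script only prints the commands, which is a cheap way to check paths before starting a long run from the repository root.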

SSN

Introduction

@InProceedings{Zhao_2017_ICCV,
author = {Zhao, Yue and Xiong, Yuanjun and Wang, Limin and Wu, Zhirong and Tang, Xiaoou and Lin, Dahua},
title = {Temporal Action Detection With Structured Segment Networks},
booktitle = {Proceedings of the IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}

Model Zoo

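| config | gpus | backbone | pretrain | mAP@0.3 | mAP@0.4 | mAP@0.5 | reference mAP@0.3 | reference mAP@0.4 | reference mAP@0.5 | gpu_mem(M) | ckpt | log | json | reference ckpt | reference json |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ssn_r50_450e_thumos14_rgb | 8 | ResNet50 | ImageNet | 29.37 | 22.15 | 15.69 | 27.61 | 21.28 | 14.57 | 6352 | ckpt | log | json | ckpt | json |
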
Notes:

  1. The gpus column gives the number of GPUs used to obtain the checkpoint. Following the Linear Scaling Rule, set the learning rate proportional to the total batch size if you use a different number of GPUs or videos per GPU, e.g., lr=0.01 for 4 GPUs x 2 videos/GPU and lr=0.08 for 16 GPUs x 4 videos/GPU.

  2. Since SSN uses different structured temporal pyramid pooling methods for training and testing, use ssn_r50_450e_thumos14_rgb_train for training and ssn_r50_450e_thumos14_rgb_test for testing.

  3. We evaluate the action detection performance of SSN using TAG action proposals. For details on data preparation, see the thumos14 TAG proposals section of Data Preparation.

  4. The reference SSN is evaluated with the ResNet50 backbone in MMAction, the same backbone as ours. Note that the original MMAction SSN setting uses the BNInception backbone.
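
The mAP@0.3, mAP@0.4, and mAP@0.5 columns report mean average precision at different temporal IoU (tIoU) thresholds: a detection counts as correct only if its overlap with a ground-truth segment reaches the threshold. The following sketch shows the overlap computation only; the function names and example segments are hypothetical, and the full mAP evaluation (ranking, per-class averaging) is not reproduced here.

```python
def temporal_iou(segment_a, segment_b):
    """Temporal IoU of two (start, end) segments given in seconds."""
    start_a, end_a = segment_a
    start_b, end_b = segment_b
    intersection = max(0.0, min(end_a, end_b) - max(start_a, start_b))
    union = (end_a - start_a) + (end_b - start_b) - intersection
    return intersection / union if union > 0 else 0.0


def is_match(prediction, ground_truth, threshold):
    """A prediction matches if its tIoU reaches the threshold."""
    return temporal_iou(prediction, ground_truth) >= threshold


# Made-up segments: 4 s of overlap over an 8 s union gives tIoU = 0.5.
prediction, ground_truth = (2.0, 8.0), (4.0, 10.0)
for threshold in (0.3, 0.4, 0.5):
    print(threshold, is_match(prediction, ground_truth, threshold))
```

Because a higher threshold demands tighter localization, mAP typically drops from the 0.3 column to the 0.5 column, as it does in the table above.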

Train

You can use the following command to train a model.

python tools/train.py ${CONFIG_FILE} [optional arguments]

Example: train the SSN model on the thumos14 dataset.

python tools/train.py configs/localization/ssn/ssn_r50_450e_thumos14_rgb_train.py

For more details and optional arguments, see the Training setting section of getting_started.

Test

You can use the following command to test a model.

python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]

Example: test SSN on the thumos14 dataset.

# Note: If you evaluate, make sure the annotation file for the test data contains ground truth.
python tools/test.py configs/localization/ssn/ssn_r50_450e_thumos14_rgb_test.py checkpoints/SOME_CHECKPOINT.pth --eval mAP

For more details and optional arguments, see the Test a dataset section of getting_started.