Overview

Number of checkpoints: 161
Number of configs: 127
Number of papers: 22
- ALGORITHM: 18
- BACKBONE: 1
- DATASET: 2
- OTHERS: 1

For supported datasets, see datasets overview.

Spatio Temporal Action Detection Models

Number of checkpoints: 15
Number of configs: 17
Number of papers: 4
- [ALGORITHM] Long-Term Feature Banks for Detailed Video Understanding (⇨)
- [ALGORITHM] Omni-Sourced Webly-Supervised Learning for Video Recognition (⇨)
- [ALGORITHM] SlowFast Networks for Video Recognition (⇨)
- [DATASET] AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions (⇨)

Action Localization Models

Number of checkpoints: 7
Number of configs: 3
Number of papers: 4
- [ALGORITHM] BMN: Boundary-Matching Network for Temporal Action Proposal Generation (⇨)
- [ALGORITHM] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation (⇨)
- [ALGORITHM] Temporal Action Detection With Structured Segment Networks (⇨)
- [DATASET] CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2017 (⇨)

Action Recognition Models

Number of checkpoints: 139
Number of configs: 107
Number of papers: 16
- [ALGORITHM] A Closer Look at Spatiotemporal Convolutions for Action Recognition (⇨)
- [ALGORITHM] Audiovisual SlowFast Networks for Video Recognition (⇨)
- [ALGORITHM] Learning Spatiotemporal Features With 3D Convolutional Networks (⇨)
- [ALGORITHM] Omni-Sourced Webly-Supervised Learning for Video Recognition (⇨)
- [ALGORITHM] Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset (⇨)
- [ALGORITHM] SlowFast Networks for Video Recognition (⇨ ⇨)
- [ALGORITHM] TAM: Temporal Adaptive Module for Video Recognition (⇨)
- [ALGORITHM] Temporal Interlacing Network (⇨)
- [ALGORITHM] Temporal Pyramid Network for Action Recognition (⇨)
- [ALGORITHM] Temporal Relational Reasoning in Videos (⇨)
- [ALGORITHM] Temporal Segment Networks: Towards Good Practices for Deep Action Recognition (⇨)
- [ALGORITHM] TSM: Temporal Shift Module for Efficient Video Understanding (⇨)
- [ALGORITHM] Video Classification With Channel-Separated Convolutional Networks (⇨)
- [ALGORITHM] X3D: Expanding Architectures for Efficient Video Recognition (⇨)
- [BACKBONE] Non-Local Neural Networks (⇨ ⇨)
- [OTHERS] Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition (⇨)
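
The per-task paper counts sum to 4 + 4 + 16 = 24, while the overview reports 22 papers. The difference comes from two papers that appear under both spatio-temporal action detection and action recognition. The sketch below reproduces that arithmetic by deduplicating titles across the three lists. The dictionary is typed by hand from this page, and the names `papers_by_task` and `count_unique_papers` are illustrative assumptions, not part of any library.

```python
# Paper titles per task, copied by hand from the lists on this page.
# All names here are hypothetical helpers for illustration only.
papers_by_task = {
    "spatio_temporal_action_detection": [
        "Long-Term Feature Banks for Detailed Video Understanding",
        "Omni-Sourced Webly-Supervised Learning for Video Recognition",
        "SlowFast Networks for Video Recognition",
        "AVA: A Video Dataset of Spatio-Temporally Localized Atomic Visual Actions",
    ],
    "action_localization": [
        "BMN: Boundary-Matching Network for Temporal Action Proposal Generation",
        "BSN: Boundary Sensitive Network for Temporal Action Proposal Generation",
        "Temporal Action Detection With Structured Segment Networks",
        "CUHK & ETHZ & SIAT Submission to ActivityNet Challenge 2017",
    ],
    "action_recognition": [
        "A Closer Look at Spatiotemporal Convolutions for Action Recognition",
        "Audiovisual SlowFast Networks for Video Recognition",
        "Learning Spatiotemporal Features With 3D Convolutional Networks",
        "Omni-Sourced Webly-Supervised Learning for Video Recognition",
        "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset",
        "SlowFast Networks for Video Recognition",
        "TAM: Temporal Adaptive Module for Video Recognition",
        "Temporal Interlacing Network",
        "Temporal Pyramid Network for Action Recognition",
        "Temporal Relational Reasoning in Videos",
        "Temporal Segment Networks: Towards Good Practices for Deep Action Recognition",
        "TSM: Temporal Shift Module for Efficient Video Understanding",
        "Video Classification With Channel-Separated Convolutional Networks",
        "X3D: Expanding Architectures for Efficient Video Recognition",
        "Non-Local Neural Networks",
        "Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition",
    ],
}


def count_unique_papers(by_task):
    """Count distinct titles, so a paper listed under two tasks counts once."""
    unique = set()
    for titles in by_task.values():
        unique.update(titles)
    return len(unique)


# Sum of the per-task counts (papers listed under two tasks are counted twice).
per_task_total = sum(len(titles) for titles in papers_by_task.values())
# Distinct papers across all tasks, matching the overview figure.
unique_total = count_unique_papers(papers_by_task)
print(per_task_total, unique_total)
```

Checkpoints and configs, by contrast, add up directly across the three tasks: 15 + 7 + 139 = 161 checkpoints and 17 + 3 + 107 = 127 configs.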