Data Preparation

Notes on Video Data Format

MMAction2 supports two types of data format: raw frames and video. The former is widely used in previous projects such as TSN. This is fast when SSD is available but fails to scale to the fast-growing datasets. (For example, the newest edition of Kinetics has 650K videos and the total frames will take up several TBs.) The latter saves much space but has to do the computation intensive video decoding at execution time To make video decoding faster, we support several efficient video loading libraries, such as decord, PyAV, etc.

Supported Datasets

The supported datasets are listed below. We provide shell scripts for data preparation under the path $MMACTION2/tools/data/. To ease usage, we provide tutorials of data deployment for each dataset.

Now, you can switch to getting_started.md to train and test the model.

Getting Data

The following guide is helpful when you want to experiment with custom dataset. Similar to the datasets stated above, it is recommended organizing in $MMACTION2/data/$DATASET.

Prepare videos

Please refer to the official website and/or the official script to prepare the videos. Note that the videos should be arranged in either

(1). A two-level directory organized by ${CLASS_NAME}/${VIDEO_ID}, which is recommended to be used for for action recognition datasets (such as UCF101 and Kinetics)

(2). A single-level directory, which is recommended to be used for for action detection datasets or those with multiple annotations per video (such as THUMOS14).

Extract frames

To extract both frames and optical flow, you can use the tool denseflow we wrote. Since different frame extraction tools produce different number of frames, it is beneficial to use the same tool to do both frame extraction and the flow computation, to avoid mismatching of frame counts.

python build_rawframes.py ${SRC_FOLDER} ${OUT_FOLDER} [--task ${TASK}] [--level ${LEVEL}] \
    [--num-worker ${NUM_WORKER}] [--flow-type ${FLOW_TYPE}] [--out-format ${OUT_FORMAT}] \
    [--ext ${EXT}] [--new-width ${NEW_WIDTH}] [--new-height ${NEW_HEIGHT}] [--new-short ${NEW_SHORT}]
    [--resume] [--use-opencv]
  • SRC_FOLDER: Folder of the original video.

  • OUT_FOLDER: Root folder where the extracted frames and optical flow store.

  • TASK: Extraction task indicating which kind of frames to extract. Allowed choices are rgb, flow, both.

  • LEVEL: Directory level. 1 for the single-level directory or 2 for the two-level directory.

  • NUM_WORKER: Number of workers to build rawframes.

  • FLOW_TYPE: Flow type to extract, e.g., None, tvl1, warp_tvl1, farn, brox.

  • OUT_FORMAT: Output format for extracted frames, e.g., jpg, h5, png.

  • EXT: Video file extension, e.g., avi, mp4.

  • NEW_WIDTH: Resized image width of output.

  • NEW_HEIGHT: Resized image height of output.

  • NEW_SHORT: Resized image short side length keeping ratio.

  • --resume: Whether to resume optical flow extraction instead of overwriting.

  • --use-opencv: Whether to use OpenCV to extract rgb frames.

The recommended practice is

  1. set $OUT_FOLDER to be a folder located in SSD.

  2. symlink the link $OUT_FOLDER to $MMACTION2/data/$DATASET/rawframes.

ln -s ${YOUR_FOLDER} $MMACTION2/data/$DATASET/rawframes

Alternative to denseflow

In case your device doesn’t fulfill the installation requirement of denseflow(like Nvidia driver version), or you just want to see some quick demos about flow extraction, we provide a python script tools/flow_extraction.py as an alternative to denseflow. You can use it for rgb frames and optical flow extraction from one or several videos. Note that the speed of the script is much slower than denseflow, since it runs optical flow algorithms on CPU.

python tools/flow_extraction.py --input ${INPUT} [--prefix ${PREFIX}] [--dest ${DEST}] [--rgb-tmpl ${RGB_TMPL}] \
    [--flow-tmpl ${FLOW_TMPL}] [--start-idx ${START_IDX}] [--method ${METHOD}] [--bound ${BOUND}] [--save-rgb]
  • INPUT: Videos for frame extraction, can be single video or a video list, the video list should be a txt file and just consists of filenames without directories.

  • PREFIX: The prefix of input videos, used when input is a video list.

  • DEST: The destination to save extracted frames.

  • RGB_TMPL: The template filename of rgb frames.

  • FLOW_TMPL: The template filename of flow frames.

  • START_IDX: The start index of extracted frames.

  • METHOD: The method used to generate flow.

  • BOUND: The maximum of optical flow.

  • SAVE_RGB: Also save extracted rgb frames.

Generate file list

We provide a convenient script to generate annotation file list. You can use the following command to extract frames.

cd $MMACTION2
python tools/data/build_file_list.py ${DATASET} ${SRC_FOLDER} [--rgb-prefix ${RGB_PREFIX}] \
    [--flow-x-prefix ${FLOW_X_PREFIX}] [--flow-y-prefix ${FLOW_Y_PREFIX}] [--num-split ${NUM_SPLIT}] \
    [--subset ${SUBSET}] [--level ${LEVEL}] [--format ${FORMAT}] [--out-root-path ${OUT_ROOT_PATH}] \
    [--shuffle]
  • DATASET: Dataset to be prepared, e.g., ucf101, kinetics400, thumos14, sthv1, sthv2, etc.

  • SRC_FOLDER: Folder of the corresponding data format:

    • “$MMACTION2/data/$DATASET/rawframes” if --format rawframes.

    • “$MMACTION2/data/$DATASET/videos” if --format videos.

  • RGB_PREFIX: Name prefix of rgb frames.

  • FLOW_X_PREFIX: Name prefix of x flow frames.

  • FLOW_Y_PREFIX: Name prefix of y flow frames.

  • NUM_SPLIT: Number of split to file list.

  • SUBSET: Subset to generate file list. Allowed choice are train, val, test.

  • LEVEL: Directory level. 1 for the single-level directory or 2 for the two-level directory.

  • FORMAT: Source data format to generate file list. Allowed choices are rawframes, videos.

  • OUT_ROOT_PATH: Root path for output

  • --shuffle: Whether to shuffle the file list.

Preparing Datasets

ActivityNet

For basic dataset information, please refer to the official website. Here, we use the ActivityNet rescaled feature provided in this repo. Before we start, please make sure that current working directory is $MMACTION2/tools/data/activitynet/.

Step 1. Download Annotations

First of all, you can run the following script to download annotation files.

bash download_annotations.sh

Step 2. Prepare Videos Features

Then, you can run the following script to download activitynet features.

bash download_features.sh

Step 3. Process Annotation Files

Next, you can run the following script to process the downloaded annotation files for training and testing. It first merges the two annotation files together and then seperates the annoations by train, val and test.

python process_annotations.py

Step 4. Check Directory Structure

After the whole data pipeline for ActivityNet preparation, you will get the features and annotation files.

In the context of the whole project (for ActivityNet only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── ActivityNet
│   │   ├── anet_anno_{train,val,test,full}.json
│   │   ├── anet_anno_action.json
│   │   ├── video_info_new.csv
│   │   ├── activitynet_feature_cuhk
│   │   │   ├── csv_mean_100
│   │   │   │   ├── v___c8enCfzqw.csv
│   │   │   │   ├── v___dXUJsj3yo.csv
│   │   │   |   ├── ..

For training and evaluating on ActivityNet, please refer to getting_started.md.

HMDB51

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/hmdb51/.

To run the bash scripts below, you need to install unrar. you can install it by sudo apt-get install unrar, or refer to this repo by following the usage and taking zzunrar.sh script for easy installation without sudo.

Step 1. Prepare Annotations

First of all, you can run the following script to prepare annotations.

bash download_annotations.sh

Step 2. Prepare Videos

Then, you can run the following script to prepare videos.

bash download_videos.sh

Step 3. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance.

You can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/hmdb51_extracted/
ln -s /mnt/SSD/hmdb51_extracted/ ../../../data/hmdb51/rawframes

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames using “tvl1” algorithm.

bash extract_frames.sh

Step 4. Generate File List

you can run the follow script to generate file list in the format of rawframes and videos.

bash generate_rawframes_filelist.sh
bash generate_videos_filelist.sh

Step 5. Check Directory Structure

After the whole data process for HMDB51 preparation, you will get the rawframes (RGB + Flow), videos and annotation files for HMDB51.

In the context of the whole project (for HMDB51 only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── hmdb51
│   │   ├── hmdb51_{train,val}_split_{1,2,3}_rawframes.txt
│   │   ├── hmdb51_{train,val}_split_{1,2,3}_videos.txt
│   │   ├── annotations
│   │   ├── videos
│   │   │   ├── brush_hair
│   │   │   │   ├── April_09_brush_hair_u_nm_np1_ba_goo_0.avi

│   │   │   ├── wave
│   │   │   │   ├── 20060723sfjffbartsinger_wave_f_cm_np1_ba_med_0.avi
│   │   ├── rawframes
│   │   │   ├── brush_hair
│   │   │   │   ├── April_09_brush_hair_u_nm_np1_ba_goo_0
│   │   │   │   │   ├── img_00001.jpg
│   │   │   │   │   ├── img_00002.jpg
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── flow_x_00001.jpg
│   │   │   │   │   ├── flow_x_00002.jpg
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── flow_y_00001.jpg
│   │   │   │   │   ├── flow_y_00002.jpg
│   │   │   ├── ...
│   │   │   ├── wave
│   │   │   │   ├── 20060723sfjffbartsinger_wave_f_cm_np1_ba_med_0
│   │   │   │   ├── ...
│   │   │   │   ├── winKen_wave_u_cm_np1_ri_bad_1

For training and evaluating on HMDB51, please refer to getting_started.md.

Kinetics-400

For basic dataset information, please refer to the official website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/kinetics400/.

Step 1. Prepare Annotations

First of all, you can run the following script to prepare annotations.

bash download_annotations.sh

Step 2. Prepare Videos

Then, you can run the following script to prepare videos. The codes are adapted from the official crawler. Note that this might take a long time.

bash download_videos.sh

If you have already have a backup of the kinetics-400 dataset using the download script above, you only need to replace all whitespaces in the class name for ease of processing either by detox

### sudo apt-get install detox
detox -r ../../../data/kinetics400/videos_train/
detox -r ../../../data/kinetics400/videos_val/

or running

bash rename_classnames.sh

For better decoding speed, you can resize the original videos into smaller sized, densely encoded version by:

python ../resize_videos.py ../../../data/kinetics400/videos_train/ ../../../data/kinetics400/videos_train_256p_dense_cache --dense --level 2

Step 3. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance. And you can run the following script to soft link the extracted frames.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/kinetics400_extracted_train/
ln -s /mnt/SSD/kinetics400_extracted_train/ ../../../data/kinetics400/rawframes_train/
mkdir /mnt/SSD/kinetics400_extracted_val/
ln -s /mnt/SSD/kinetics400_extracted_val/ ../../../data/kinetics400/rawframes_val/

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames.

bash extract_frames.sh

These three commands above can generate images with size 340x256, if you want to generate images with short edge 320 (320p), you can change the args --new-width 340 --new-height 256 to --new-short 320. More details can be found in data_preparation

Step 4. Generate File List

you can run the follow scripts to generate file list in the format of videos and rawframes, respectively.

bash generate_videos_filelist.sh
### execute the command below when rawframes are ready
bash generate_rawframes_filelist.sh

Step 5. Folder Structure

After the whole data pipeline for Kinetics-400 preparation. you can get the rawframes (RGB + Flow), videos and annotation files for Kinetics-400.

In the context of the whole project (for Kinetics-400 only), the minimal folder structure will look like: (minimal means that some data are not necessary: for example, you may want to evaluate kinetics-400 using the original video format.)

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── kinetics400
│   │   ├── kinetics400_train_list_videos.txt
│   │   ├── kinetics400_val_list_videos.txt
│   │   ├── annotations
│   │   ├── videos_train
│   │   ├── videos_val
│   │   │   ├── abseiling
│   │   │   │   ├── 0wR5jVB-WPk_000417_000427.mp4
│   │   │   │   ├── ...
│   │   │   ├── ...
│   │   │   ├── wrapping_present
│   │   │   ├── ...
│   │   │   ├── zumba
│   │   ├── rawframes_train
│   │   ├── rawframes_val

For training and evaluating on Kinetics-400, please refer to getting_started.

Moments in Time

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/mit/.

Step 1. Prepare Annotations and Videos

First of all, you can run the following script to download the videos along with the annotations.

bash download_data.sh

For better decoding speed, you can resize the original videos into smaller sized, densely encoded version by:

python ../resize_videos.py ../../../data/mit/videos/ ../../../data/mit/videos_256p_dense_cache --dense --level 2

Step 2. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance. And you can run the following script to soft link the extracted frames.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/mit_extracted/
ln -s /mnt/SSD/mit_extracted/ ../../../data/mit/rawframes

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames.

bash extract_frames.sh

Step 4. Generate File List

you can run the follow script to generate file list in the format of rawframes and videos.

bash generate_{rawframes, videos}_filelist.sh

Step 5. Check Directory Structure

After the whole data process for Moments in Time preparation, you will get the rawframes (RGB + Flow), videos and annotation files for Moments in Time.

In the context of the whole project (for Moments in Time only), the folder structure will look like:

mmaction2
├── data
│   └── mit
│       ├── annotations
│       │   ├── license.txt
│       │   ├── moments_categories.txt
│       │   ├── README.txt
│       │   ├── trainingSet.csv
│       │   └── validationSet.csv
│       ├── mit_train_rawframe_anno.txt
│       ├── mit_train_video_anno.txt
│       ├── mit_val_rawframe_anno.txt
│       ├── mit_val_video_anno.txt
│       ├── rawframes
│       │   ├── training
│       │   │   ├── adult+female+singing
│       │   │   │   ├── 0P3XG_vf91c_35
│       │   │   │   │   ├── flow_x_00001.jpg
│       │   │   │   │   ├── flow_x_00002.jpg
│       │   │   │   │   ├── ...
│       │   │   │   │   ├── flow_y_00001.jpg
│       │   │   │   │   ├── flow_y_00002.jpg
│       │   │   │   │   ├── ...
│       │   │   │   │   ├── img_00001.jpg
│       │   │   │   │   └── img_00002.jpg
│       │   │   │   └── yt-zxQfALnTdfc_56
│       │   │   │   │   ├── ...
│       │   │   └── yawning
│       │   │       ├── _8zmP1e-EjU_2
│       │   │       │   ├── ...
│       │   └── validation
│       │   │       ├── ...
│       └── videos
│           ├── training
│           │   ├── adult+female+singing
│           │   │   ├── 0P3XG_vf91c_35.mp4
│           │   │   ├── ...
│           │   │   └── yt-zxQfALnTdfc_56.mp4
│           │   └── yawning
│           │       ├── ...
│           └── validation
│           │   ├── ...
└── mmaction
└── ...

For training and evaluating on Moments in Time, please refer to getting_started.md.

Multi-Moments in Time

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/mmit/.

Step 1. Prepare Annotations and Videos

First of all, you can run the following script to prepare annotations.

bash download_data.sh

For better decoding speed, you can resize the original videos into smaller sized, densely encoded version by:

python ../resize_videos.py ../../../data/mmit/videos/ ../../../data/mmit/videos_256p_dense_cache --dense --level 2

Step 2. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

First, you can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/mmit_extracted/
ln -s /mnt/SSD/mmit_extracted/ ../../../data/mmit/rawframes

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames using “tvl1” algorithm.

bash extract_frames.sh

Step 3. Generate File List

you can run the follow script to generate file list in the format of rawframes or videos.

bash generate_rawframes_filelist.sh
bash generate_videos_filelist.sh

Step 4. Check Directory Structure

After the whole data process for Multi-Moments in Time preparation, you will get the rawframes (RGB + Flow), videos and annotation files for Multi-Moments in Time.

In the context of the whole project (for Multi-Moments in Time only), the folder structure will look like:

mmaction2/
└── data
    └── mmit
        ├── annotations
        │   ├── moments_categories.txt
        │   ├── trainingSet.txt
        │   └── validationSet.txt
        ├── mmit_train_rawframes.txt
        ├── mmit_train_videos.txt
        ├── mmit_val_rawframes.txt
        ├── mmit_val_videos.txt
        ├── rawframes
        │   ├── 0-3-6-2-9-1-2-6-14603629126_5
        │   │   ├── flow_x_00001.jpg
        │   │   ├── flow_x_00002.jpg
        │   │   ├── ...
        │   │   ├── flow_y_00001.jpg
        │   │   ├── flow_y_00002.jpg
        │   │   ├── ...
        │   │   ├── img_00001.jpg
        │   │   └── img_00002.jpg
        │   │   ├── ...
        │   └── yt-zxQfALnTdfc_56
        │   │   ├── ...
        │   └── ...

        └── videos
            └── adult+female+singing
                ├── 0-3-6-2-9-1-2-6-14603629126_5.mp4
                └── yt-zxQfALnTdfc_56.mp4
            └── ...

For training and evaluating on Multi-Moments in Time, please refer to getting_started.md.

Something-Something V1

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/sthv1/.

Step 1. Prepare Annotations

First of all, you have to sign in and download annotations to $MMACTION2/data/sthv1/annotations on the official website.

Step 2. Prepare RGB Frames

Since the sthv1 website doesn’t provide the original video data and only extracted RGB frames are available, you have to directly download RGB frames from sthv1 website.

You can download all RGB frame parts on sthv1 website to $MMACTION2/data/sthv1/ and use the following command to extract.

cd $MMACTION2/data/sthv1/
cat 20bn-something-something-v1-?? | tar zx
cd $MMACTION2/tools/data/sthv1/

For users who only want to use RGB frames, you can skip to step 5 to generate file lists in the format of rawframes. Since the prefix of official JPGs is “%05d.jpg” (e.g., “00001.jpg”), we add “filename_tmpl=’{:05}.jpg’” to the dict of data.train, data.val and data.test in the config files related with sthv1 like this:

data = dict(
    videos_per_gpu=16,
    workers_per_gpu=4,
    train=dict(
        type=dataset_type,
        ann_file=ann_file_train,
        data_prefix=data_root,
        filename_tmpl='{:05}.jpg',
        pipeline=train_pipeline),
    val=dict(
        type=dataset_type,
        ann_file=ann_file_val,
        data_prefix=data_root_val,
        filename_tmpl='{:05}.jpg',
        pipeline=val_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=ann_file_test,
        data_prefix=data_root_val,
        filename_tmpl='{:05}.jpg',
        pipeline=test_pipeline))

Step 3. Extract Flow

This part is optional if you only want to use RGB frames.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance.

You can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/sthv1_extracted/
ln -s /mnt/SSD/sthv1_extracted/ ../../../data/sthv1/rawframes

Then, you can run the following script to extract optical flow based on RGB frames.

cd $MMACTION2/tools/data/sthv1/
bash extract_flow.sh

Step 4. Encode Videos

This part is optional if you only want to use RGB frames.

You can run the following script to encode videos.

cd $MMACTION2/tools/data/sthv1/
bash encode_videos.sh

Step 5. Generate File List

You can run the follow script to generate file list in the format of rawframes.

cd $MMACTION2/tools/data/sthv1/
bash generate_rawframes_filelist.sh

Step 5. Check Directory Structure

After the whole data process for Something-Something V1 preparation, you will get the rawframes (RGB + Flow), and annotation files for Something-Something V1.

In the context of the whole project (for Something-Something V1 only), the folder structure will look like:

mmaction
├── mmaction
├── tools
├── configs
├── data
│   ├── sthv1
│   │   ├── sthv1_{train,val}_list_rawframes.txt
│   │   ├── annotations
│   |   ├── rawframes
│   |   |   ├── 100000
│   |   |   |   ├── img_00001.jpg
│   |   |   |   ├── img_00002.jpg
│   |   |   |   ├── ...
│   |   |   |   ├── flow_x_00001.jpg
│   |   |   |   ├── flow_x_00002.jpg
│   |   |   |   ├── ...
│   |   |   |   ├── flow_y_00001.jpg
│   |   |   |   ├── flow_y_00002.jpg
│   |   |   |   ├── ...
│   |   |   ├── 100001
│   |   |   ├── ...

For training and evaluating on Something-Something V1, please refer to getting_started.md.

Something-Something V2

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/sthv2/.

Step 1. Prepare Annotations

First of all, you have to sign in and download annotations to $MMACTION2/data/sthv2/annotations on the official website.

Step 2. Prepare Videos

Then, you can download all data parts to $MMACTION2/data/sthv2/ and use the following command to extract.

cd $MMACTION2/data/sthv2/
cat 20bn-something-something-v2-?? | tar zx

Step 3. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance.

You can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/sthv2_extracted/
ln -s /mnt/SSD/sthv2_extracted/ ../../../data/sthv2/rawframes

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

cd $MMACTION2/tools/data/sthv2/
bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

cd $MMACTION2/tools/data/sthv2/
bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames.

cd $MMACTION2/tools/data/sthv2/
bash extract_frames.sh

Step 4. Generate File List

you can run the follow script to generate file list in the format of rawframes and videos.

cd $MMACTION2/tools/data/sthv2/
bash generate_{rawframes, videos}_filelist.sh

Step 5. Check Directory Structure

After the whole data process for Something-Something V2 preparation, you will get the rawframes (RGB + Flow), videos and annotation files for Something-Something V2.

In the context of the whole project (for Something-Something V2 only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── sthv2
│   │   ├── sthv2_{train,val}_list_rawframes.txt
│   │   ├── sthv2_{train,val}_list_videos.txt
│   │   ├── annotations
│   |   ├── videos
│   |   |   ├── 100000.mp4
│   |   |   ├── 100001.mp4
│   |   |   ├──...
│   |   ├── rawframes
│   |   |   ├── 100000
│   |   |   |   ├── img_00001.jpg
│   |   |   |   ├── img_00002.jpg
│   |   |   |   ├── ...
│   |   |   |   ├── flow_x_00001.jpg
│   |   |   |   ├── flow_x_00002.jpg
│   |   |   |   ├── ...
│   |   |   |   ├── flow_y_00001.jpg
│   |   |   |   ├── flow_y_00002.jpg
│   |   |   |   ├── ...
│   |   |   ├── 100001
│   |   |   ├── ...

For training and evaluating on Something-Something V2, please refer to getting_started.md.

THUMOS’14

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/thumos14/.

Step 1. Prepare Annotations

First of all, run the following script to prepare annotations.

cd $MMACTION2/tools/data/thumos14/
bash download_annotations.sh

Step 2. Prepare Videos

Then, you can run the following script to prepare videos.

cd $MMACTION2/tools/data/thumos14/
bash download_videos.sh

Step 3. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance.

You can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/thumos14_extracted/
ln -s /mnt/SSD/thumos14_extracted/ ../data/thumos14/rawframes/

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

cd $MMACTION2/tools/data/thumos14/
bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

cd $MMACTION2/tools/data/thumos14/
bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames.

cd $MMACTION2/tools/data/thumos14/
bash extract_frames.sh tvl1

.

Step 4. Fetch File List

This part is optional if you do not use SSN model.

You can run the follow script to fetch pre-computed tag proposals.

cd $MMACTION2/tools/data/thumos14/
bash fetch_tag_proposals.sh

Step 5. Denormalize Proposal File

This part is optional if you do not use SSN model.

You can run the follow script to denormalize pre-computed tag proposals according to actual number of local rawframes.

cd $MMACTION2/tools/data/thumos14/
bash denormalize_proposal_file.sh

Step 6. Check Directory Structure

After the whole data process for THUMOS’14 preparation, you will get the rawframes (RGB + Flow), videos and annotation files for THUMOS’14.

In the context of the whole project (for THUMOS’14 only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── thumos14
│   │   ├── proposals
│   │   |   ├── thumos14_tag_val_normalized_proposal_list.txt
│   │   |   ├── thumos14_tag_test_normalized_proposal_list.txt
│   │   ├── annotations_val
│   │   ├── annotations_test
│   │   ├── videos
│   │   │   ├── val
│   │   │   |   ├── video_validation_0000001.mp4
│   │   │   |   ├── ...
│   │   |   ├── test
│   │   │   |   ├── video_test_0000001.mp4
│   │   │   |   ├── ...
│   │   ├── rawframes
│   │   │   ├── val
│   │   │   |   ├── video_validation_0000001
|   │   │   |   │   ├── img_00001.jpg
|   │   │   |   │   ├── img_00002.jpg
|   │   │   |   │   ├── ...
|   │   │   |   │   ├── flow_x_00001.jpg
|   │   │   |   │   ├── flow_x_00002.jpg
|   │   │   |   │   ├── ...
|   │   │   |   │   ├── flow_y_00001.jpg
|   │   │   |   │   ├── flow_y_00002.jpg
|   │   │   |   │   ├── ...
│   │   │   |   ├── ...
│   │   |   ├── test
│   │   │   |   ├── video_test_0000001

For training and evaluating on THUMOS’14, please refer to getting_started.md.

UCF-101

For basic dataset information, you can refer to the dataset website. Before we start, please make sure that the directory is located at $MMACTION2/tools/data/ucf101/.

Step 1. Prepare Annotations

First of all, you can run the following script to prepare annotations.

bash download_annotations.sh

Step 2. Prepare Videos

Then, you can run the following script to prepare videos.

bash download_videos.sh

For better decoding speed, you can resize the original videos into smaller sized, densely encoded version by:

python ../resize_videos.py ../../../data/ucf101/videos/ ../../../data/ucf101/videos_256p_dense_cache --dense --level 2 --ext avi

Step 3. Extract RGB and Flow

This part is optional if you only want to use the video loader.

Before extracting, please refer to install.md for installing denseflow.

If you have plenty of SSD space, then we recommend extracting frames there for better I/O performance. The extracted frames (RGB + Flow) will take up about 100GB.

You can run the following script to soft link SSD.

### execute these two line (Assume the SSD is mounted at "/mnt/SSD/")
mkdir /mnt/SSD/ucf101_extracted/
ln -s /mnt/SSD/ucf101_extracted/ ../../../data/ucf101/rawframes

If you only want to play with RGB frames (since extracting optical flow can be time-consuming), consider running the following script to extract RGB-only frames using denseflow.

bash extract_rgb_frames.sh

If you didn’t install denseflow, you can still extract RGB frames using OpenCV by the following script, but it will keep the original size of the images.

bash extract_rgb_frames_opencv.sh

If both are required, run the following script to extract frames using “tvl1” algorithm.

bash extract_frames.sh

Step 4. Generate File List

you can run the follow script to generate file list in the format of rawframes and videos.

bash generate_videos_filelist.sh
bash generate_rawframes_filelist.sh

Step 5. Check Directory Structure

After the whole data process for UCF-101 preparation, you will get the rawframes (RGB + Flow), videos and annotation files for UCF-101.

In the context of the whole project (for UCF-101 only), the folder structure will look like:

mmaction2
├── mmaction
├── tools
├── configs
├── data
│   ├── ucf101
│   │   ├── ucf101_{train,val}_split_{1,2,3}_rawframes.txt
│   │   ├── ucf101_{train,val}_split_{1,2,3}_videos.txt
│   │   ├── annotations
│   │   ├── videos
│   │   │   ├── ApplyEyeMakeup
│   │   │   │   ├── v_ApplyEyeMakeup_g01_c01.avi

│   │   │   ├── YoYo
│   │   │   │   ├── v_YoYo_g25_c05.avi
│   │   ├── rawframes
│   │   │   ├── ApplyEyeMakeup
│   │   │   │   ├── v_ApplyEyeMakeup_g01_c01
│   │   │   │   │   ├── img_00001.jpg
│   │   │   │   │   ├── img_00002.jpg
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── flow_x_00001.jpg
│   │   │   │   │   ├── flow_x_00002.jpg
│   │   │   │   │   ├── ...
│   │   │   │   │   ├── flow_y_00001.jpg
│   │   │   │   │   ├── flow_y_00002.jpg
│   │   │   ├── ...
│   │   │   ├── YoYo
│   │   │   │   ├── v_YoYo_g01_c01
│   │   │   │   ├── ...
│   │   │   │   ├── v_YoYo_g25_c05

For training and evaluating on UCF-101, please refer to getting_started.md.