Downloading the VGGSound Dataset

This folder contains the scripts for downloading the VGGSound dataset. The CSV file, vggsound.csv, comes from the original VGGSound repository.

Prerequisites

Put vggsound.csv into your data directory, e.g. data/vggsound. Then shuffle it and split it into files of 10,000 lines each as follows.

shuf data/vggsound/vggsound.csv > data/vggsound/vggsound-shuf.csv
split -l 10000 -d --additional-suffix=.csv data/vggsound/vggsound-shuf.csv data/vggsound/vggsound-shuf-
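The shuf command and the --additional-suffix option of split come from GNU coreutils and may be missing on other systems. As a rough alternative, the same shuffle-and-split step can be sketched in Python. The function name shuffle_and_split and its seed parameter are hypothetical and not part of this repository. The sketch treats every line alike, as the commands above do.

```python
import random
from pathlib import Path


def shuffle_and_split(csv_path, lines_per_chunk=10000, seed=None):
    """Shuffle the lines of csv_path and write numbered chunk files next to it.

    Hypothetical helper, not part of this repository. It mirrors the
    shuf + split commands: <name>-shuf.csv, <name>-shuf-00.csv, ...
    """
    csv_path = Path(csv_path)
    with csv_path.open(encoding="utf-8") as f:
        lines = f.readlines()
    # Make sure the last line ends with a newline so lines stay separate
    # after shuffling.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"

    # seed is an assumption added for reproducible shuffles.
    random.Random(seed).shuffle(lines)

    stem = csv_path.with_suffix("")  # e.g. data/vggsound/vggsound
    Path(f"{stem}-shuf.csv").write_text("".join(lines), encoding="utf-8")

    chunk_paths = []
    for index, start in enumerate(range(0, len(lines), lines_per_chunk)):
        # Two-digit numeric suffix, like the split -d command above.
        chunk = Path(f"{stem}-shuf-{index:02d}.csv")
        chunk.write_text("".join(lines[start:start + lines_per_chunk]),
                         encoding="utf-8")
        chunk_paths.append(chunk)
    return chunk_paths
```

Calling shuffle_and_split("data/vggsound/vggsound.csv") would write data/vggsound/vggsound-shuf.csv and the chunk files data/vggsound/vggsound-shuf-00.csv, data/vggsound/vggsound-shuf-01.csv, and so on.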

Install packages

pip install youtube_dl tqdm pafy
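To confirm the install worked before starting a long download, a quick import check can help. This is a minimal sketch using only the standard library. The helper name missing_packages is hypothetical.

```python
import importlib.util


def missing_packages(names=("youtube_dl", "tqdm", "pafy")):
    """Return the names from `names` that cannot be found in this environment.

    Hypothetical helper, not part of this repository. The defaults are the
    three packages installed with pip above.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty list from missing_packages() means all three packages can be found by the current Python interpreter.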

Download the dataset

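Run the following script once for each split CSV file (vggsound-shuf-00.csv, vggsound-shuf-01.csv, ...), changing the input file and the output directory to match.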

python download_ffmpeg.py -e -s -i data/vggsound/vggsound-shuf-00.csv -o data/vggsound/video/00/
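Running the command by hand for every split file gets tedious, so the per-file commands can be generated instead. The sketch below assumes the split files sit in the data directory under the names produced in the Prerequisites step. The function build_download_commands is hypothetical. The -e and -s flags are copied verbatim from the command above without interpreting them.

```python
import sys
from pathlib import Path


def build_download_commands(data_dir="data/vggsound"):
    """Return one download_ffmpeg.py command (an argv list) per split CSV file.

    Hypothetical helper, not part of this repository.
    """
    data_dir = Path(data_dir)
    commands = []
    # Matches vggsound-shuf-00.csv, vggsound-shuf-01.csv, ... but not
    # the unsplit vggsound-shuf.csv.
    for csv_file in sorted(data_dir.glob("vggsound-shuf-*.csv")):
        suffix = csv_file.stem.rsplit("-", 1)[-1]  # "vggsound-shuf-00" -> "00"
        out_dir = data_dir / "video" / suffix
        commands.append([
            sys.executable, "download_ffmpeg.py", "-e", "-s",
            "-i", str(csv_file), "-o", f"{out_dir}/",
        ])
    return commands
```

Each returned command can then be run from the folder containing download_ffmpeg.py, for example with subprocess.run(command, check=True).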

Extract audio from videos

python extract_audio.py -i data/vggsound/video -o data/vggsound/audio -s -e

Extract image frames from videos

python extract_frames.py -i data/vggsound/video -o data/vggsound/frames -s -e

Resize and crop images

python preprocess.py -i data/vggsound/frames -o data/vggsound/preprocessed -s -e
