
Sampled Videos

To facilitate future research and to ensure full transparency, we release all the videos we sampled and used for VBench evaluation. You can download them on Google Drive.

What Videos Do We Provide?

8 T2V Models:
- including lavie, modelscope, cogvideo, videocrafter-0.9, videocrafter-1, show-1, pika, gen-2. More details of models are provided below.
2 Suites of Videos for each Model:
- Per Dimension: The sampled videos for each ability dimension evaluated by VBench. The per-dimension prompts are available under prompts/prompts_per_dimension, and we also provide a combined list of all the dimensions' prompts at prompts/all_dimension.txt.
- Per Category: The sampled videos for each content category evaluated by VBench. The per-category prompts are available under prompts/prompts_per_category, and we also provide a combined list of all the categories' prompts at prompts/all_category.txt.

What are the potential uses of these videos?

- Further labeling of video quality
- Instruction tuning, using our videos and our human preference labels

Below is the folder structure of different models' sampled videos:

t2v_sampled_videos
├── per_dimension
│   ├── cogvideo.zip
│   ├── gen-2-all-dimension.tar.gz
│   ├── lavie.zip
│   ├── modelscope.zip
│   ├── opensora.tar
│   ├── pika-all-dimension.zip
│   ├── show-1.tar.gz
│   ├── videocrafter-1.tar.gz
│   ├── videocrafter-2.tar
│   └── videocrafter-09.zip
└── per_category
    ├── cogvideo.zip
    ├── gen-2-all-category.tar.gz
    ├── lavie.zip
    ├── modelscope.zip
    ├── pika-all-category.zip
    ├── show-1.tar.gz
    ├── videocrafter-0.9.zip
    └── videocrafter-1.zip

How to Download the Videos?

You can utilize gdown to download from Google Drive. Below is an example:

First, install gdown:

pip install gdown

Then, download an archive (zip, tar, or tar.gz) using gdown:

gdown --id <file_id> --output <output_filename>

# Example for videocrafter-1
gdown --id 1FCRj48-Yv7LM7XGgfDCvIo7Kb9EId5KX --output videocrafter-1.tar.gz
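
The archives in the folder tree above come in mixed formats (.zip, .tar, .tar.gz), so a small helper that downloads by file ID and unpacks any of them can be convenient. The following is a minimal sketch, not an official script: the function names `download_samples` and `unpack_samples` are hypothetical, and it assumes a gdown version whose `gdown.download` accepts the `id` keyword. Recent gdown releases mark the `--id` command-line flag as deprecated and accept the file ID directly, so check `gdown --help` for your installed version.

```python
import shutil
from pathlib import Path


def download_samples(file_id: str, output: str) -> str:
    """Download one Google Drive file by ID (requires `pip install gdown`)."""
    import gdown  # imported lazily so unpack_samples works without gdown

    return gdown.download(id=file_id, output=output, quiet=False)


def unpack_samples(archive: str, dest: str) -> Path:
    """Extract a downloaded .zip, .tar, or .tar.gz archive into dest."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    # shutil picks the archive format from the file extension.
    shutil.unpack_archive(archive, dest_path)
    return dest_path
```

For the videocrafter-1 example above, this would be called as `unpack_samples(download_samples("1FCRj48-Yv7LM7XGgfDCvIo7Kb9EId5KX", "videocrafter-1.tar.gz"), "t2v_sampled_videos/per_dimension")`; the destination folder name is our own choice here.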

What are the Details of the Video Generation Models?

We list the setting for sampling videos from these models.

Model	Evaluation Party	Release Time	Resolution	FPS	Frame Count	Video Length	Checkpoint	Code Commit ID	Video Format	Sampled Videos (Dimension)	Sampled Videos (Category)	Other Settings
`LaVie`	VBench Team	2023-09-26	512x512	8	16	2.0s	-	-	MP4	Google Drive	Google Drive
`LaVie-Interpolation`	VBench Team	2023-09-26	512x512	24	61	2.5s	link	-	MP4	Google Drive	-
`ModelScope`	VBench Team	2023-08-12	256x256	8	16	2.0s	link	-	MP4	Google Drive	Google Drive
`CogVideo`	VBench Team	2022-05-29	480x480	10	33	3.3s	link	-	GIF	Google Drive	Google Drive
`VideoCrafter-0.9`	VBench Team	2023-04-05	256x256	8	16	2.0s	link	Commit ID	MP4	Google Drive	Google Drive
`VideoCrafter-1.0`	VBench Team	2023-10-30	1024x576	10	16	1.6s	link	Commit ID	MP4	Google Drive	Google Drive
`Show-1`	VBench Team	2023-09-27	576x320	8	29	3.6s	link	Commit ID	MP4	Google Drive	Google Drive
`Gen-2`	VBench Team	2023-06-07	1408x768	24	96	4.0s	-	-	MP4	Google Drive	Google Drive
`Pika`	VBench Team	2023-06-29	1088x640	24	72	3.0s	-	-	MP4	Google Drive	Google Drive
`Open-Sora`	VBench Team	2024-03-18	512x512	8	16	2.0s	link	Commit ID	MP4	Google Drive	-
`VideoCrafter-2.0`	VBench Team	2024-01-18	320x512	10	16	1.6s	link	Commit ID	MP4	Google Drive	-
`T2V-Turbo (VC2)`	T2V-Turbo Team	2024-05-29	320x512	16	16	1.0s	link	Commit ID	MP4	-	-	`unet_lora.pt` is used to turn VideoCrafter-2.0 to `T2V-Turbo (VC2)`
`AnimateDiff-V1`	VBench Team	2023-07-18	512x512	8	16	2.0s	T2I backbone SD1.5, Motion Module, LoRA(Realistic Vision 2.0)	Commit ID	MP4	Google Drive	-	Negative prompt (the same one is applied when sampling all videos): `semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck`
`AnimateDiff-V2`	VBench Team	2023-09-10	512x512	8	16	2.0s	T2I backbone SD1.5, Motion Module, LoRA	Commit ID	MP4	Google Drive	-	Negative prompt (the same one is applied when sampling all videos): `semi-realistic, cgi, 3d, render, sketch, cartoon, drawing, anime, text, close up, cropped, out of frame, worst quality, low quality, jpeg artifacts, ugly, duplicate, morbid, mutilated, extra fingers, mutated hands, poorly drawn hands, poorly drawn face, mutation, deformed, blurry, dehydrated, bad anatomy, bad proportions, extra limbs, cloned face, disfigured, gross proportions, malformed limbs, missing arms, missing legs, extra arms, extra legs, fused fingers, too many fingers, long neck`
`Latte-1`	VBench Team	2024-05-23	512x512	8	16	2.0s	link	Commit ID	MP4	Google Drive	-
`OpenSora V1.2 (2s)`	OpenSora Team	2024-06-28	854x480	24	51	2s	link	-	MP4	link	-	eval results & info provided by OpenSora Team
`HiGen`	VBench Team	2024-03-08	448x256	8	32	4.0s	link	Commit ID	MP4	Google Drive	-
`TF-T2V`	VBench Team	2024-04-03	448x256	8	32	4.0s	link	Commit ID	MP4	Google Drive	-
`AnimateLCM`	VBench Team	2024-02-26	512x512	8	16	2.0s	link	Commit ID	MP4	Google Drive	-	Negative prompt (the same one is applied when sampling all videos): `bad quality, worse quality, low resolution`
`InstructVideo(ModelScope)`	VBench Team	2024-06-17	256x256	8	16	2.0s	link	Commit ID	MP4	Google Drive	-
`OpenSora V1.1`	VBench Team	2024-04-25	424x240	8	64	8.0s	link	Commit ID	MP4	Google Drive	-
`OpenSoraPlan V1.1`	VBench Team	2024-05-27	512x512	24	221	9.2s	link	Commit ID	MP4	Google Drive	-
`Mira`	VBench Team	2024-04-01	384x240	6	60	10.0s	link	Commit ID	MP4	Google Drive	-
`Pika 1.0`	VBench Team	2023-12-28	1280x720	24	72	3.0s	-	-	MP4	Google Drive	Google Drive
`Gen-3`	VBench Team	2024-06-17	1280x768	24	256	10.7s	-	-	MP4	Google Drive	Google Drive
`Kling`	VBench Team	2024-06-06	1280x720	30	153	5.1s	-	-	MP4	Google Drive	-	high-performance mode (lower sampling cost), not high-quality mode (better quality)
`Data-Juicer (T2V-Turbo)`	Data-Juicer Team	2024-07-23	320x512	8	16	2.0s	-	-	MP4	-	-	from Data-Juicer Team: based on T2V-Turbo, with Data-Juicer's data and loss enhancement
`LaVie-2`	LaVie-2 Team	-	512x512	8	16	2.0s	-	-	MP4	-	-	info provided by LaVie-2 Team
`CogVideoX-2B (SAT, prompt-optimized)`	VBench Team	2024-08-06	720x480	8	49	6.1s	link	Commit ID	MP4	Google Drive	Google Drive	applied augmented prompts
`OpenSora V1.2 (8s)`	VBench Team	2024-06-17	1280x720	24	204	8.5s	link	Commit ID	MP4	Google Drive	Google Drive
`CogVideoX-5B (SAT, prompt-optimized)`	VBench Team	2024-08-27	720x480	8	49	6.1s	link	Commit ID	MP4	Google Drive	-	applied augmented prompts
`Vchitect-2.0-2B`	VBench Team	2024-09-14	768x432	8	40	5.0s	link	Commit ID	MP4	Google Drive	-	-
`Vchitect-2.0 (VEnhancer)`	VBench Team	2024-09-14	1920x1080	16	79	4.9s	-	Commit ID	MP4	Google Drive	-	-
`JT-CV-9B`	JiuTianCV Team	2024-09-24	2158x1214	24	51	2.1s	-	-	MP4	-	-	-
`Data-Juicer (2024-09-23, T2V-Turbo)`	Data-Juicer Team	2024-09-23	512x320	8	16	2.0s	link	-	MP4	-	-	from Data-Juicer Team: based on T2V-Turbo, with Data-Juicer's data and loss enhancement
`MiniMax-Video-01`	VBench Team	2024-10-01	1280x720	25	141	5.6s	-	-	MP4	Google Drive	-	-
`T2V-Turbo-v2`	T2V-Turbo Team	2024-10-02	320x512	16	8	2.0s	-	-	-	-	-	-
`OpenSoraPlan V1.2`	VBench Team	2024-07-24	1280x720	24	93	3.9s	link	Commit ID	MP4	Google Drive	-
`OpenSoraPlan V1.3`	VBench Team	2024-10-16	640x352	18	93	5.2s	link	Commit ID	MP4	Google Drive	-	Prompt refiner provided by OpenSoraPlanv1.3 is used. First, download the weights and set path in config. Then, use the original VBench prompts as input. The code will automatically process them and feed the refined prompts into the model.
`Mochi-1`	VBench Team	2024-10-22	848x480	30	163	5.4s	link	Commit ID	MP4	Google Drive	-	Default settings from Mochi demo are used
`CogVideoX1.5-5B (5s SAT prompt-optimized)`	VBench Team	2024-11-08	1360x768	16	84	5.3s	link	Commit ID	MP4	Google Drive	-	applied augmented prompts
`Vidu`	VBench Team	2024-07-30	688x384	16	124	7.8s	-	-	MP4	Google Drive	Google Drive	-
`TeleAI-VAST`	TeleAI	2024-12-02	480x720	5	25	5.0s	-	-	MP4	-	-	info provided by TeleAI
`HunyuanVideo (Open-Source Version)`	VBench Team	2024-12-03	1280x720	24	129	5.4s	link	Commit ID	MP4	Google Drive	-	applied Prompt Rewrite provided by HunyuanVideo, prompt list
`Jimeng`	VBench Team	2024-05-09	1280x720	8	96	12.0s	-	-	MP4	Google Drive	Google Drive	-
`LTX-video (5s 768×512)`	VBench Team	2024-11-22	768x512	25	121	4.8s	link	Commit ID	MP4	Google Drive	-	applied augmented prompts
`CausVid`	VBench Team	2024-12-07	640x352	12	120	10.0s	-	-	MP4	Google Drive	-	-
`STIV (Apple)`	VBench Team	2024-12-19	512x512	60	60	1.0s	-	-	MP4	Google Drive	-	-
`CausVid (2025-01-02 5s)`	VBench Team	2025-01-02	640x352	24	120	5.0s	-	-	MP4	Google Drive	-	-
`Wan2.1`	VBench Team	2025-01-08	1280x720	16	80	5.0s	-	-	MP4	Google Drive	-	-
`Luma`	VBench Team	2024-06-13	1360x752	24	121	5.0s	-	-	MP4	Google Drive	Google Drive	-
`RepVideo`	VBench Team	2025-01-16	720x480	8	49	6.1s	-	Code	MP4	Google Drive	-	-
`MiracleVision V5`	VBench Team	2025-01-21	720x480	24	120	5.0s	-	-	MP4	Google Drive	-	-
`Sora`	VBench Team	2025-01-14	854x480	30	150	5.0s	-	-	MP4	Google Drive	-	-
`EasyAnimateV5.1`	VBench Team	2025-01-22	672x384	8	49	6.0s	-	Code	MP4	Google Drive	-	-
`Wan2.1(2025-02-24)`	VBench Team	2025-02-24	1280x720	16	80	5.0s	-	-	MP4	Google Drive	-	-
`IPOC`	VBench Team	2025-02-28	1360x768	16	81	5.0s	-	-	MP4	Google Drive	-	-
`CogVideoX-2B (Diffusers)`	VBench Team	2025-03-03	720x480	8	49	6.1s	-	-	MP4	Google Drive	-	applied augmented prompts
`CogVideoX-5B (Diffusers)`	VBench Team	2025-03-04	720x480	8	49	6.1s	-	-	MP4	Google Drive	-	applied augmented prompts
`Step-Video-T2V`	VBench Team	2025-03-13	992x544	25	200	8s	-	-	MP4	Google Drive	-
`Open-Sora-2.0`	VBench Team	2025-03-14	1024x576	24	120	5s	-	-	MP4	Google Drive	-
`Wan2.1-T2V-1.3B`	VBench Team	2025-03-20	832x480	16	81	5s	-	-	MP4	Google Drive	-	applied augmented prompts
`Wan2.1-T2V-1.3B`	VBench Team	2025-03-20	832x480	16	81	5s	-	-	MP4	Google Drive	-
`Open-Sora 2.0 (2025-03-18)`	VBench Team	2025-03-31	1024x576	24	120	5s	-	-	MP4	Google Drive	-
`AccVideo`	VBench Team	2025-03-31	960x544	24	72	3s	-	-	MP4	Google Drive	-
`IPOC (2025-04-14)`	VBench Team	2025-04-14	1360x768	16	81	5s	-	-	MP4	Google Drive	-
`Vidu Q1 (2025-04-17)`	VBench Team	2025-04-21	1280x720	24	125	5.2s	-	-	MP4	Google Drive	-
`CogVideoX1.5-5B`	VBench Team	2025-04-23	1360x768	16	161	10s	-	-	MP4	Google Drive	-	applied augmented prompts
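`Wan2.1-T2V-1.3B (2025-05-03)`	VBench Team	2025-05-03	832x480	16	81	5s	-	-	MP4	Google Drive	-	applied augmented prompts; guidance_scale=6.0, flow_shift=3.0, num_inference_steps=50, sampler=unipc, negative prompt='色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走' (in English: vivid tones, overexposed, static, blurry details, subtitles, style, artwork, painting, picture, still, overall grayish, worst quality, low quality, JPEG compression artifacts, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn face, deformed, disfigured, malformed limbs, fused fingers, motionless frame, cluttered background, three legs, many people in the background, walking backwards)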
`Kling-1.6`	VBench Team	2025-05-08	1280x720	24	216	9s	-	-	MP4	Google Drive	-	applied augmented prompts
`Hunyuan Video (2025-05-22)`	VBench Team	2025-05-22	1280x720	24	129	5.4s	-	-	MP4	Google Drive	-	applied Prompt Rewrite provided by HunyuanVideo, prompt list
`MAGI-T2V-4.5B-distill`	VBench Team	2025-06-11	720x720	24	96	4s	-	-	MP4	Google Drive	-
`Wan2.1-T2V-14B`	VBench Team	2025-07-25	1280x720	16	81	5s	-	-	MP4	Google Drive	-	applied augmented prompts
`JT-CV-9B`	VBench Team	2025-07-30	2158x1214	24	51	2.1s	-	-	MP4	Google Drive	-	applied augmented prompts
`LanDiff`	VBench Team	2025-08-06	720x480	8	49	6s	-	-	MP4	Google Drive	-
`IPOW`	VBench Team	2025-08-06	832x480	16	81	5s	-	-	MP4	Google Drive	-
`Veo 3`	VBench Team	2025-08-06	1280x720	24	192	8s	-	-	MP4	Google Drive	-
`MAGI-T2V-24B-distill`	VBench Team	2025-08-06	1280x720	24	96	4s	-	-	MP4	Google Drive	-
`LTX-2 (Diffusers) (w/o prompt extend)`	VBench Team	2026-02-04	768x512	24	121	5.0s	LTX-2	Commit ID	MP4	Google Drive	-	Videos sampled using TI2VidTwoStagesPipeline and ltx-2-19b-dev.safetensors checkpoint
`Wan2.2-T2V-A14B (w/o prompt extend)`	VBench Team	2026-02-18	1280x720	16	81	5.1s	Wan2.2-T2V-A14B	Commit ID	MP4	Google Drive	-
`Wan2.2-T2V-A14B (Qwen prompt extend)`	VBench Team	2026-02-18	1280x720	16	81	5.1s	Wan2.2-T2V-A14B	Commit ID	MP4	Google Drive	-
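
In most rows of the table above, the Video Length column appears to equal Frame Count divided by FPS, rounded to one decimal. The sketch below checks one row under that assumption. The helper names `parse_row` and `video_length_seconds` are hypothetical, the column positions are taken from the header row, and the 0.1 s tolerance is our own choice; a few rows differ by more than that.

```python
def video_length_seconds(frame_count: int, fps: float) -> float:
    """Clip duration implied by a frame count and a frame rate."""
    return frame_count / fps


def parse_row(row: str) -> dict:
    """Pick the numeric columns out of one tab-separated table row."""
    cells = row.split("\t")
    return {
        "model": cells[0].strip("`"),
        "fps": float(cells[4]),
        "frame_count": int(cells[5]),
        "listed_length": float(cells[6].rstrip("s")),
    }


# The CogVideo row from the table, up to the Video Format column.
row = "`CogVideo`\tVBench Team\t2022-05-29\t480x480\t10\t33\t3.3s\tlink\t-\tGIF"
info = parse_row(row)
implied = video_length_seconds(info["frame_count"], info["fps"])

# Allow for rounding in the listed value (tolerance is an assumption).
consistent = abs(implied - info["listed_length"]) <= 0.1
print(info["model"], implied, consistent)
```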

How are Files Structured in Google Drive?

1. Sub-Folder Organization

For these models,

(1) The per_dimension archive contains 11 subfolders corresponding to videos sampled for evaluating different dimensions.
(2) The per_category archive contains 8 subfolders corresponding to videos sampled for evaluating different content categories.

1.1. Single-Stage Outputs

For LaVie, ModelScope, CogVideo, VideoCrafter-0.9, Open-Sora, VideoCrafter-2.0, AnimateDiff-V2, we provide their single-stage outputs.

We take LaVie as an example:

- per_dimension
    - lavie
        - appearance_style   
            - The bund Shanghai, Van Gogh style-0.mp4
            - The bund Shanghai, Van Gogh style-1.mp4
            - ...
        - human_action
            - A person is finger snapping-0.mp4
            - A person is finger snapping-1.mp4
            - ...
        - object_class
            - a dining table-0.mp4
            - a dining table-1.mp4
            - ...
        - scene
            - restaurant-0.mp4
            - restaurant-1.mp4
            - ...
        - subject_consistency
            - a giraffe taking a peaceful walk-0.mp4
            - a giraffe taking a peaceful walk-1.mp4
            - ...
        - temporal_style
            - The bund Shanghai, zoom in-0.mp4
            - The bund Shanghai, zoom in-1.mp4
            - ...
        - color
            - a blue clock-0.mp4
            - a blue clock-1.mp4
            - ...
        - multiple_objects
            - a fire hydrant and a stop sign-0.mp4
            - a fire hydrant and a stop sign-1.mp4
            - ...
        - overall_consistency
            - Yellow flowers swing in the wind-0.mp4
            - Yellow flowers swing in the wind-1.mp4
            - ...
        - spatial_relationship
            - a frisbee on the left of a sports ball, front view-0.mp4
            - a frisbee on the left of a sports ball, front view-1.mp4
            - ...
        - temporal_flickering
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-0.mp4
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-1.mp4
            - ...
- per_category
    - lavie # or modelscope, cogvideo, videocrafter-0.9
        - animal  
            - wild rabbit in a green meadow-0.mp4
            - wild rabbit in a green meadow-1.mp4
            - ...
        - architecture
            - water tower on the desert-0.mp4
            - water tower on the desert-1.mp4
            - ...
        - food
            - waffles with whipped cream and fruit-0.mp4
            - waffles with whipped cream and fruit-1.mp4
            - ...
        - human
            - young dancer practicing at home-0.mp4
            - young dancer practicing at home-1.mp4
            - ...
        - lifestyle
            - the interior design of a shopping mall-0.mp4
            - the interior design of a shopping mall-1.mp4
            - ...
        - plant
            - coconut tree near sea under blue sky-0.mp4
            - coconut tree near sea under blue sky-1.mp4
            - ...
        - scenery
            - waterfalls in between mountain-0.mp4
            - waterfalls in between mountain-1.mp4
            - ...
        - vehicles
            - video of yacht sailing in the ocean-0.mp4
            - video of yacht sailing in the ocean-1.mp4
            - ...
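
The listing above suggests that each video is named `<prompt>-<index>.mp4` and sits in a folder named after its dimension or category. Under that assumption, the following sketch groups an extracted model folder by dimension and prompt. The function names are hypothetical, and the `*/*.mp4` pattern would need changing for models whose videos use another format (the table lists GIF for CogVideo).

```python
from collections import defaultdict
from pathlib import Path


def split_video_name(filename: str) -> tuple[str, int]:
    """Split '<prompt>-<index>.<ext>' into (prompt, index)."""
    stem = Path(filename).stem
    # Split on the last hyphen only, so hyphens inside a prompt survive.
    prompt, index = stem.rsplit("-", 1)
    return prompt, int(index)


def group_by_prompt(model_dir: str, pattern: str = "*/*.mp4") -> dict:
    """Map folder name -> prompt -> list of video paths."""
    groups = defaultdict(lambda: defaultdict(list))
    for video in sorted(Path(model_dir).glob(pattern)):
        prompt, _ = split_video_name(video.name)
        groups[video.parent.name][prompt].append(video)
    return groups
```

For example, `group_by_prompt("per_dimension/lavie")["color"]["a blue clock"]` would list every sample generated for that prompt.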

1.2. Multi-Stage Outputs (Show-1)

For show-1, there are two folders corresponding to the last two stages of show-1 generated videos, namely super1 and super2. The leaderboard results correspond to evaluation on the final stage, namely super2.

- per_dimension
    - show-1
        - appearance_style/{super1/super2}       # subfolder super1 or super2
            - The bund Shanghai, Van Gogh style-0.mp4
            - The bund Shanghai, Van Gogh style-1.mp4
            - ...
        - human_action/{super1/super2}
            - A person is finger snapping-0.mp4
            - A person is finger snapping-1.mp4
            - ...
        - object_class/{super1/super2}
            - a dining table-0.mp4
            - a dining table-1.mp4
            - ...
        - scene/{super1/super2}
            - restaurant-0.mp4
            - restaurant-1.mp4
            - ...
        - subject_consistency/{super1/super2}
            - a giraffe taking a peaceful walk-0.mp4
            - a giraffe taking a peaceful walk-1.mp4
            - ...
        - temporal_style/{super1/super2}
            - The bund Shanghai, zoom in-0.mp4
            - The bund Shanghai, zoom in-1.mp4
            - ...
        - color/{super1/super2}
            - a blue clock-0.mp4
            - a blue clock-1.mp4
            - ...
        - multiple_objects/{super1/super2}
            - a fire hydrant and a stop sign-0.mp4
            - a fire hydrant and a stop sign-1.mp4
            - ...
        - overall_consistency/{super1/super2}
            - Yellow flowers swing in the wind-0.mp4
            - Yellow flowers swing in the wind-1.mp4
            - ...
        - spatial_relationship/{super1/super2}
            - a frisbee on the left of a sports ball, front view-0.mp4
            - a frisbee on the left of a sports ball, front view-1.mp4
            - ...
        - temporal_flickering/{super1/super2}
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-0.mp4
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-1.mp4
            - ...
- per_category
    - show-1
        - animal/{super1/super2}
            - wild rabbit in a green meadow-0.mp4
            - wild rabbit in a green meadow-1.mp4
            - ...
        - architecture/{super1/super2}
            - water tower on the desert-0.mp4
            - water tower on the desert-1.mp4
            - ...
        - food/{super1/super2}
            - waffles with whipped cream and fruit-0.mp4
            - waffles with whipped cream and fruit-1.mp4
            - ...
        - human/{super1/super2}
            - young dancer practicing at home-0.mp4
            - young dancer practicing at home-1.mp4
            - ...
        - lifestyle/{super1/super2}
            - the interior design of a shopping mall-0.mp4
            - the interior design of a shopping mall-1.mp4
            - ...
        - plant/{super1/super2}
            - coconut tree near sea under blue sky-0.mp4
            - coconut tree near sea under blue sky-1.mp4
            - ...
        - scenery/{super1/super2}
            - waterfalls in between mountain-0.mp4
            - waterfalls in between mountain-1.mp4
            - ...
        - vehicles/{super1/super2}
            - video of yacht sailing in the ocean-0.mp4
            - video of yacht sailing in the ocean-1.mp4
            - ...

1.3. Multi-Resolution Outputs (VideoCrafter-1)

Under each dimension or category in videocrafter-1, there are two folders corresponding to the two resolution options for videocrafter-1 generated videos, namely 1024x576 and 512x320. The leaderboard currently contains the evaluation results for the 1024x576 resolution.

- per_dimension
    - videocrafter-1
        - appearance_style/{1024x576/512x320}       # subfolder 1024x576 or 512x320
            - The bund Shanghai, Van Gogh style-0.mp4
            - The bund Shanghai, Van Gogh style-1.mp4
            - ...
        - human_action/{1024x576/512x320}
            - A person is finger snapping-0.mp4
            - A person is finger snapping-1.mp4
            - ...
        - object_class/{1024x576/512x320}
            - a dining table-0.mp4
            - a dining table-1.mp4
            - ...
        - scene/{1024x576/512x320}
            - restaurant-0.mp4
            - restaurant-1.mp4
            - ...
        - subject_consistency/{1024x576/512x320}
            - a giraffe taking a peaceful walk-0.mp4
            - a giraffe taking a peaceful walk-1.mp4
            - ...
        - temporal_style/{1024x576/512x320}
            - The bund Shanghai, zoom in-0.mp4
            - The bund Shanghai, zoom in-1.mp4
            - ...
        - color/{1024x576/512x320}
            - a blue clock-0.mp4
            - a blue clock-1.mp4
            - ...
        - multiple_objects/{1024x576/512x320}
            - a fire hydrant and a stop sign-0.mp4
            - a fire hydrant and a stop sign-1.mp4
            - ...
        - overall_consistency/{1024x576/512x320}
            - Yellow flowers swing in the wind-0.mp4
            - Yellow flowers swing in the wind-1.mp4
            - ...
        - spatial_relationship/{1024x576/512x320}
            - a frisbee on the left of a sports ball, front view-0.mp4
            - a frisbee on the left of a sports ball, front view-1.mp4
            - ...
        - temporal_flickering/{1024x576/512x320}
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-0.mp4
            - static view on a desert scene with an oasis, palm trees, and a clear, calm pool of water-1.mp4
            - ...
- per_category
    - videocrafter-1
        - animal/{1024x576/512x320}
            - wild rabbit in a green meadow-0.mp4
            - wild rabbit in a green meadow-1.mp4
            - ...
        - architecture/{1024x576/512x320}
            - water tower on the desert-0.mp4
            - water tower on the desert-1.mp4
            - ...
        - food/{1024x576/512x320}
            - waffles with whipped cream and fruit-0.mp4
            - waffles with whipped cream and fruit-1.mp4
            - ...
        - human/{1024x576/512x320}
            - young dancer practicing at home-0.mp4
            - young dancer practicing at home-1.mp4
            - ...
        - lifestyle/{1024x576/512x320}
            - the interior design of a shopping mall-0.mp4
            - the interior design of a shopping mall-1.mp4
            - ...
        - plant/{1024x576/512x320}
            - coconut tree near sea under blue sky-0.mp4
            - coconut tree near sea under blue sky-1.mp4
            - ...
        - scenery/{1024x576/512x320}
            - waterfalls in between mountain-0.mp4
            - waterfalls in between mountain-1.mp4
            - ...
        - vehicles/{1024x576/512x320}
            - video of yacht sailing in the ocean-0.mp4
            - video of yacht sailing in the ocean-1.mp4
            - ...
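
Both the Show-1 layout in 1.2 and the VideoCrafter-1 layout in 1.3 add one extra folder level (a stage or a resolution) under each dimension or category. A minimal sketch for collecting only the variant that the leaderboard uses is shown below. The function name `variant_videos` is hypothetical, and the code assumes the archives were extracted with the folder names shown in the listings.

```python
from pathlib import Path


def variant_videos(model_dir: str, variant: str) -> dict:
    """Map each dimension/category folder to the videos of one variant."""
    result = {}
    for group_dir in sorted(Path(model_dir).iterdir()):
        variant_dir = group_dir / variant
        # Skip folders that do not contain the requested variant.
        if variant_dir.is_dir():
            result[group_dir.name] = sorted(variant_dir.glob("*.mp4"))
    return result
```

Here `variant_videos("per_dimension/show-1", "super2")` would select the final Show-1 stage, and `variant_videos("per_dimension/videocrafter-1", "1024x576")` the resolution reported on the leaderboard.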

2. Single-Folder Organization (Gen-2, Pika)

Gen-2 and Pika also include videos for "all_dimension" and "all_category", but we have not yet divided the videos into subfolders according to specific dimensions or categories.

- per_dimension
    - gen-2
        - all_dimension
            - Yellow flowers swing in the wind-0.mp4
            - Yellow flowers swing in the wind-1.mp4
            - ...
    - pika
        - all_dimension
            - Yellow flowers swing in the wind-0.mp4
            - Yellow flowers swing in the wind-1.mp4
            - ...
- per_category
    - gen-2
        - all_category
            - young people celebrating new year at the office-0.mp4
            - young people celebrating new year at the office-1.mp4
            - ...
    - pika
        - all_category
            - young people celebrating new year at the office-0.mp4
            - young people celebrating new year at the office-1.mp4
            - ...
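
Because these videos are not yet split by dimension, one way to recover the split is to match each file name against the per-dimension prompt lists under prompts/prompts_per_dimension. The sketch below is written under assumptions we have not verified: that each prompt file is named `<dimension>.txt`, holds one prompt per line, and that the file names follow the `<prompt>-<index>.mp4` pattern shown above. The function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path


def load_prompt_index(prompt_dir: str) -> dict:
    """Map each prompt to the dimensions whose prompt list contains it."""
    index = defaultdict(set)
    for prompt_file in Path(prompt_dir).glob("*.txt"):
        for line in prompt_file.read_text(encoding="utf-8").splitlines():
            if line.strip():
                index[line.strip()].add(prompt_file.stem)
    return index


def assign_dimensions(flat_dir: str, prompt_index: dict) -> dict:
    """Map dimension -> videos in flat_dir whose prompt belongs to it."""
    by_dimension = defaultdict(list)
    for video in sorted(Path(flat_dir).glob("*.mp4")):
        # Drop the trailing '-<index>' to recover the prompt text.
        prompt = video.stem.rsplit("-", 1)[0]
        for dimension in prompt_index.get(prompt, ()):
            by_dimension[dimension].append(video)
    return by_dimension
```

A prompt that appears in several prompt lists is assigned to every matching dimension, which is why the index stores a set of dimensions per prompt.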

Human Preference Labels

Available for download at Google Drive.

Each dimension has one annotation file. Each file contains a list of human preference annotations for videos generated from different prompts. The evaluation process involves comparing videos from different models and, based on human annotations, determining which video best matches the prompt for the corresponding dimension.

Data Structure

JSON data is composed of multiple objects, each representing an evaluation instance. Each instance contains the following key-value pairs:

prompt_en: The text prompt for generating the desired video content.

style_en / color_en / object_en / ...: Dimension-related information (for example, the style, color, or object named in the prompt).

question_en: The question asked to the human annotators / VLM.

videos: This section contains the URLs of the videos from different models.

human_anno: This section represents human annotation, which is composed of a nested dictionary. The outer keys represent the model names (e.g., "modelscope", "lavie"), and the inner keys represent the other model names. The corresponding values within these nested dictionaries represent the human-assigned scores for the relative quality of each model's video compared to the other model's video.

For example, human_anno["modelscope"]["lavie"] = 0 indicates that humans judged the LaVie video to be better than the ModelScope video for the given prompt and style.

human_anno["modelscope"]["videocraft"] = 1 indicates that humans judged the ModelScope video to be better than the VideoCrafter video.

human_anno["cogvideo"]["videocraft"] = 0.5 indicates that humans judged the CogVideo video and the VideoCrafter video to be of equal quality.
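
To make the scoring convention concrete, the sketch below turns the pairwise scores into a per-model win rate, crediting the outer model with the score and the inner model with one minus the score. The sample instance is hypothetical (its prompt, style, and scores are made up for illustration and are not taken from the released labels), and the function name `win_rates` is our own. It also assumes each pair is stored in one direction only; if a file stores both directions, every pair is simply counted twice.

```python
import json
from collections import defaultdict


def win_rates(instances: list) -> dict:
    """Average pairwise score per model (1 = win, 0.5 = tie, 0 = loss)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for instance in instances:
        for model_a, opponents in instance["human_anno"].items():
            for model_b, score in opponents.items():
                totals[model_a] += score        # score from model_a's side
                totals[model_b] += 1 - score    # complement for model_b
                counts[model_a] += 1
                counts[model_b] += 1
    return {model: totals[model] / counts[model] for model in counts}


# Hypothetical instance that mirrors the three examples above.
sample = json.loads("""
[
  {
    "prompt_en": "The bund Shanghai, Van Gogh style",
    "style_en": "Van Gogh style",
    "human_anno": {
      "modelscope": {"lavie": 0, "videocraft": 1},
      "cogvideo": {"videocraft": 0.5}
    }
  }
]
""")

rates = win_rates(sample)
print(rates)
```

With a real annotation file, `sample` would be replaced by `json.load(open(path))`, where `path` points at one of the downloaded per-dimension files.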
