Vship : Fast Metric Computation on GPU
An easy-to-use, high-performance library for GPU-accelerated visual fidelity metrics: SSIMULACRA2, Butteraugli, and CVVDP.
This project has been 100% made by a human (no AI involved).
Overview
vship provides hardware-accelerated implementations of:
- SSIMULACRA2: A perceptual image quality metric from Cloudinary's SSIMULACRA2
- Butteraugli: Google's psychovisual image difference metric from libjxl
- CVVDP: University of Cambridge's psychovisual video quality metric
vship uses HIP/CUDA or Vulkan for GPU acceleration, providing significant performance improvements over CPU implementations. It can be used as a standalone binary (FFVship), as a VapourSynth plugin, or through its C API.
Precompiled binaries are available in the release section.
/!\ VulkanVship is very new and experimental. It tends to be slower and may be less reliable.
Projects Featuring Vship
If you want to use Vship with a pre-defined workflow, here are some projects featuring Vship:
- NVEnc, QSVEnc : Frameworks to test HW encoding quality from Rigaya
- Av1an: A Cross-platform command-line AV1 / VP9 / HEVC / H264 encoding framework with per scene quality encoding
- xav: A simpler alternative to Av1an dedicated to Target Quality Encoding trying to be as fast as possible.
- Auto-boost-essential : A python script improving efficiency of video encoding by varying parameters on scenes automatically.
- SSIMULACRApy: A Python script to compare videos and output their SSIMU2 scores using various metrics (by Kosaka)
- metrics: A perceptual video metrics toolkit for video encoder developers (by the Psychovisual Experts Group)
- vs_align: A vapoursynth plugin measuring temporal offset between two videos
- chunknorris: A python script to adjust the quality encoding parameters at each scene of a video based on objective metrics
Installation
The steps to build vship from source are provided below.
See Compilation and Usage for more details.
For compiling on Windows, see FFVship Windows Compilation.
Dependencies
For all build options the following are required:
- make
To build the HIP/CUDA version:
- hipcc (AMD HIP SDK) or nvcc (NVIDIA CUDA SDK)
To build the Vulkan version:
- Vulkan SDK (Vulkan headers + library), ideally from LunarG
- slangc, if you intend to rebuild the SPIR-V shader code
Additionally, to build the FFVship CLI tool:
- ffms2 (and libavutil header to compile)
- pkg-config
Build Instructions
- Use the appropriate target for your gpu or use case.
# libvship build
make build BACKEND=Cuda # Build for the current system's NVIDIA GPU
make build BACKEND=Cuda ARCH=ALL # Build for all supported NVIDIA GPUs
make build BACKEND=HIP # Build for the current system's AMD GPU
make build BACKEND=HIP ARCH=ALL # Build for all supported AMD GPUs
make build MITIGATE_MALLOC_ASYNC=on # Build for AMD GPUs with a workaround for current AMD driver issues, if you are affected (slower than a normal build)
make shaderBuild # Force a shader rebuild; requires slangc (optional)
make build BACKEND=Vulkan # Build the release Vulkan version of vship
make build BACKEND=Vulkan DEBUG=1 # Build the Vulkan version with debugging symbols and validation layers
# FFVship CLI tool build for Linux (libvship must be built first)
make buildFFVSHIP
- Install libvship and, optionally, the FFVship executable.
The install target automatically detects and installs only the components that were built.
If libvship is built, it also creates a pkg-config file and installs it.
make install
# On Arch/Fedora, you need to use another prefix:
make install PREFIX=/usr
If pkg-config is installed, you should be able to run pkg-config --modversion libvship to get the vship version.
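The same check can be scripted, for example before launching a batch job. The following is a minimal Python sketch, assuming only the `pkg-config --modversion libvship` command mentioned above; the helper name `pkg_config_version` is hypothetical and not part of vship.

```python
import shutil
import subprocess


def pkg_config_version(package="libvship"):
    """Return the version reported by pkg-config, or None if it is unavailable.

    Hypothetical helper for illustration; it only wraps
    `pkg-config --modversion <package>`.
    """
    # pkg-config itself may be missing from PATH.
    if shutil.which("pkg-config") is None:
        return None
    result = subprocess.run(
        ["pkg-config", "--modversion", package],
        capture_output=True,
        text=True,
        check=False,
    )
    # A non-zero exit code means pkg-config could not find the package.
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = pkg_config_version()
    print(version if version else "libvship was not found by pkg-config")
```

If the function returns None after `make install`, the install prefix may not be on the pkg-config search path, which is the situation the `PREFIX=/usr` note above addresses.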
Library Usage
FFVship
The -g argument sets the number of GPU threads, which controls the performance-to-VRAM trade-off. The -t argument sets the number of decoder threads.
I recommend 4 GPU threads as a good compromise between performance and VRAM usage.
The usage below shows only some of the options offered by FFVship; check the doc or run with -h for the full list.
usage: ./FFVship [-h] [--source SOURCE] [--encoded ENCODED]
[-m {SSIMULACRA2, Butteraugli, CVVDP}]
[--start start] [--end end] [-e --every every]
[-t THREADS] [-g gpuThreads] [--gpu-id gpu_id]
[--json OUTPUT]
[--list-gpu]
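To show how these options combine, here is a small Python sketch that assembles an FFVship argument list. The helper `build_ffvship_command` and its parameter names are hypothetical; it uses only flags that appear in the usage text above and assumes the executable is reachable as `./FFVship`.

```python
def build_ffvship_command(source, encoded, metric="SSIMULACRA2",
                          gpu_threads=4, decoder_threads=None,
                          json_path=None, executable="./FFVship"):
    """Assemble an FFVship argument list (hypothetical helper, not part of vship)."""
    metrics = ("SSIMULACRA2", "Butteraugli", "CVVDP")
    if metric not in metrics:
        raise ValueError(f"metric must be one of {metrics}, got {metric!r}")
    if gpu_threads < 1:
        raise ValueError("gpu_threads must be at least 1")

    # Mandatory inputs, metric selection and number of GPU threads (-g).
    command = [executable, "--source", str(source), "--encoded", str(encoded),
               "-m", metric, "-g", str(gpu_threads)]
    # Optional decoder thread count (-t) and JSON output path (--json).
    if decoder_threads is not None:
        command += ["-t", str(decoder_threads)]
    if json_path is not None:
        command += ["--json", str(json_path)]
    return command


if __name__ == "__main__":
    print(" ".join(build_ffvship_command("reference.mp4", "distorted.mp4",
                                         metric="CVVDP", json_path="scores.json")))
```

The returned list can be passed to `subprocess.run(command, check=True)`; building a list rather than a single string avoids shell quoting problems with file names that contain spaces.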
Vapoursynth
The numStream argument sets how many GPU threads to use, which controls the performance-to-VRAM trade-off.
I recommend 4 as a good compromise between the two.
SSIMULACRA2
See SSIMULACRA2 for details like calculating VRAM usage.
import vapoursynth as vs
core = vs.core
# Load reference and distorted clips
ref = core.bs.VideoSource("reference.mp4")
dist = core.bs.VideoSource("distorted.mp4")
# Calculate SSIMULACRA2 scores
#numStream controls the performance-to-VRAM trade-off
result = ref.vship.SSIMULACRA2(dist, numStream = 4)
# Extract scores from frame properties
scores = [frame.props["_SSIMULACRA2"] for frame in result.frames()]
# Print average score
print(f"Average SSIMULACRA2 score: {sum(scores) / len(scores)}")
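An average alone can hide a handful of badly degraded frames. As a sketch of one possible way to look beyond it, the hypothetical helper below (not part of vship) also reports the median, the minimum, and a nearest-rank 5th percentile of a list of per-frame scores such as the `scores` list above.

```python
import statistics


def summarize_scores(scores):
    """Return mean, median, minimum and 5th percentile of per-frame scores.

    Hypothetical helper for illustration; the percentile uses the
    nearest-rank method.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    # Nearest-rank 5th percentile: rank = ceil(5 * n / 100), at least 1.
    rank = max(1, (5 * len(ordered) + 99) // 100)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],
        "p5": ordered[rank - 1],
    }


if __name__ == "__main__":
    # Placeholder values only, not measured scores.
    print(summarize_scores([80.0, 70.0, 90.0, 60.0]))
```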
Butteraugli
See BUTTERAUGLI for more details like calculating VRAM usage.
import vapoursynth as vs
core = vs.core
# Load reference and distorted clips
ref = core.bs.VideoSource("reference.mp4")
dist = core.bs.VideoSource("distorted.mp4")
# Calculate Butteraugli scores
# distmap controls whether to return a visual distortion map or the reference clip
# intensity_multiplier controls sensitivity
# qnorm controls which Norm to calculate and return in _BUTTERAUGLI_QNorm
result = ref.vship.BUTTERAUGLI(dist, distmap = 0, numStream = 4, qnorm = 5)
# Extract scores from frame properties (three different norms available)
scores_3norm = [frame.props["_BUTTERAUGLI_3Norm"] for frame in result.frames()]
scores_infnorm = [frame.props["_BUTTERAUGLI_INFNorm"] for frame in result.frames()]
scores_5norm = [frame.props["_BUTTERAUGLI_QNorm"] for frame in result.frames()]
# Alternatively get all scores in one pass
all_scores = [[frame.props["_BUTTERAUGLI_3Norm"],
               frame.props["_BUTTERAUGLI_INFNorm"],
               frame.props["_BUTTERAUGLI_QNorm"]]
              for frame in result.frames()]
# Print average scores
print(f"Average Butteraugli 3Norm distance: {sum(scores_3norm) / len(scores_3norm)}")
print(f"Average Butteraugli MaxNorm distance: {sum(scores_infnorm) / len(scores_infnorm)}")
print(f"Average Butteraugli 5Norm distance: {sum(scores_5norm) / len(scores_5norm)}")
# Output grayscale visualization of distortion for visual analysis
ref.vship.BUTTERAUGLI(dist, distmap = 1, numStream = 4).set_output()
CVVDP
See CVVDP for more details like calculating VRAM usage.
import vapoursynth as vs
core = vs.core
# Load reference and distorted clips
ref = core.bs.VideoSource("reference.mp4")
dist = core.bs.VideoSource("distorted.mp4")
# Calculate CVVDP scores
# distmap controls whether to return a visual distortion map or the reference clip
# model_name controls which Display Model to use
# resizeToDisplay controls whether or not to resize the reference and distorted inputs to the Display Model resolution
result = ref.vship.CVVDP(dist, distmap = 0, model_name = "standard_fhd", resizeToDisplay = 0)
# Extract scores from frame properties
# The list comprehension below is wrong if temporal behavior is enabled, since frames() may not process frames in order
#scores = [frame.props["_CVVDP"] for frame in result.frames()]
# Correct: request frames one by one, in order
scores = []
for i in range(len(result)):
    scores.append(result.get_frame(i).props["_CVVDP"])
# Only use the last CVVDP score: it takes into account every frame seen so far.
# Unlike the other metrics, CVVDP is a truly temporal metric.
print(f"CVVDP Video Score: {scores[-1]}")
# Output grayscale visualization of distortion for visual analysis
ref.vship.CVVDP(dist, distmap = 1).set_output()
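The in-order access pattern used above can be wrapped in a small reusable function. This is a sketch under the assumption that the clip supports `len()` and `get_frame(i)` and that each frame exposes a `props` mapping, as in the example above; the names `scores_in_order` and `final_video_score` are hypothetical.

```python
def scores_in_order(clip, prop="_CVVDP"):
    """Fetch frames strictly in index order and return one property per frame.

    Hypothetical helper: works with any object offering len() and
    get_frame(i), where each frame has a props mapping.
    """
    scores = []
    for index in range(len(clip)):
        # Each get_frame call returns before the next one starts,
        # so frames are requested sequentially.
        scores.append(clip.get_frame(index).props[prop])
    return scores


def final_video_score(clip, prop="_CVVDP"):
    """Return the score attached to the last frame after visiting every frame in order."""
    scores = scores_in_order(clip, prop)
    if not scores:
        raise ValueError("clip has no frames")
    return scores[-1]
```

With the `result` clip from the example above, `final_video_score(result)` would give the same value as `scores[-1]`.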
Performance
Testing Hardware: Ryzen 7940HS + RTX 4050 Mobile (strong CPU - weak GPU configuration)
Testing Clip: 1080p 1339 frames
| SSIMU2 Implementation | HW Type | Time |
|---|---|---|
| JXL | CPU | 115s |
| VSZIP | CPU | 58.120s |
| fssimu2 | CPU | 60.222s |
| ssimulacra2_rs | CPU | X s |
| --- CUDA --- | --- | --- |
| Vapoursynth Vship | GPU | 10.351s |
| FFVship | GPU | 7.146s |
| --- Vulkan --- | --- | --- |
| Vapoursynth Vship | GPU | 16.046s |
| FFVship | GPU | 10.085s |
| Butteraugli Implementation | HW Type | Time |
|---|---|---|
| JXL | CPU | 239s |
| --- CUDA --- | --- | --- |
| Vapoursynth Vship | GPU | 39.528s |
| FFVship | GPU | 38.710s |
| --- Vulkan --- | --- | --- |
| Vapoursynth Vship | GPU | 47.961s |
| FFVship | GPU | 41.484s |
| CVVDP Implementation | HW Type | Time |
|---|---|---|
| Original Repo | CPU | 1632s |
| FFVship (Vulkan llvmPipe) | CPU | 499s |
| fcvvdp | CPU | 360.893s |
| Original Repo | GPU | 162s |
| --- CUDA --- | --- | --- |
| Vapoursynth Vship | GPU | 28.558s |
| FFVship | GPU | 23.197s |
| --- Vulkan --- | --- | --- |
| Vapoursynth Vship | GPU | 53.750s |
| FFVship | GPU | 41.297s |
In these tests, every GPU run of vship is at least 3x faster than the fastest other implementation listed for the same metric, and the CUDA build of FFVship is roughly 6x to 8x faster,
while preserving a high degree of accuracy.
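The speedups implied by the tables can be worked out with a few lines of Python. The timings below are copied from the tables above (CUDA build of FFVship against the fastest other implementation listed for each metric); the `speedup` helper and the dictionary layout are only illustrative.

```python
# Timings in seconds, copied from the tables above.
# Each entry: (fastest other implementation listed, its time, FFVship CUDA time)
timings = {
    "SSIMULACRA2": ("VSZIP (CPU)", 58.120, 7.146),
    "Butteraugli": ("JXL (CPU)", 239.0, 38.710),
    "CVVDP": ("Original Repo (GPU)", 162.0, 23.197),
}


def speedup(baseline_seconds, candidate_seconds):
    """Return how many times faster the candidate is than the baseline."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds


if __name__ == "__main__":
    for metric, (name, baseline, ffvship) in timings.items():
        print(f"{metric}: FFVship (CUDA) is {speedup(baseline, ffvship):.2f}x faster than {name}")
```

These ratios apply only to the test hardware and clip described above.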
References
- Butteraugli Source Code: libjxl/libjxl
- SSIMULACRA2 Source Code: cloudinary/ssimulacra2
- CVVDP Source Code: gfxdisp/ColorVideoVDP
Credits
Special thanks to dnjulek for the Zig-based SSIMULACRA2 implementation in vszip.
License
This project is licensed under the MIT license. See the LICENSE file for details.