Releases: OpenNMT/CTranslate2
v4.7.1
Fixes and improvements
- Fix Windows build (#2007) by @sssshhhhhh
v4.7.0
New features
- Introduce AMD GPU support with ROCm HIP (#1989) by @sssshhhhhh
- Compatibility with Transformers v5 (#1999) by @jordimas
Fixes and improvements
- Assume less about whisper vocab (#2000) by @sssshhhhhh
- Use LLVM ThreadSanitizer instead of Google (#1993) by @3manifold
- Optimize all builds with parallel execution (#1992) by @3manifold
- Remove unnecessary zero init from conv1d (#1990) by @sssshhhhhh
- Integrate Clang AddressSanitizer in tests (#1903) by @3manifold
- Enable multiple of 16 padding for INT8 Tensor Cores (#1982) by @Purfview
- Add activation and dilation to conv1d (#1979) by @sssshhhhhh
- Minor refactor to CMakeLists.txt (#1980) by @sssshhhhhh
- Remove unnecessary check from wav2vec2 (#1977) by @plan9better
- Add optional residual add to gemm op (#1975) by @sssshhhhhh
- Implement cuda layernorm axis (#1971) by @sssshhhhhh
- Fix Eole conversion (#1998) by @vince62s
- Gemma 3 conversion improvements (#1991) by @sssshhhhhh
- Add causal flag to fa2 (#1976) by @sssshhhhhh
- Fix cross-attention tests and refactor code (#1974) by @jordimas
- Fix CUDA bf16 median filter (#1972) by @sssshhhhhh
- Fix various compiler warnings (#1970) by @sssshhhhhh
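
The "multiple of 16 padding" item above refers to rounding a dimension up so that it is divisible by 16. The following is a minimal sketch of that arithmetic only. `pad_to_multiple` is a hypothetical helper written for illustration, not a CTranslate2 function, and the example size is an arbitrary assumption.

```python
def pad_to_multiple(size: int, multiple: int = 16) -> int:
    """Round `size` up to the nearest multiple of `multiple`.

    Hypothetical helper for illustration; not part of the CTranslate2 API.
    """
    if size < 0 or multiple <= 0:
        raise ValueError("size must be >= 0 and multiple must be > 0")
    # Integer ceiling division, then scale back up.
    return ((size + multiple - 1) // multiple) * multiple


# Example with an arbitrary dimension that is not divisible by 16.
original_size = 51865
padded_size = pad_to_multiple(original_size)
extra_rows = padded_size - original_size  # rows of padding that would be added
```

A size that is already a multiple of 16 is returned unchanged, so the padding costs nothing in that case.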
v4.6.3 (2026-01-06)
New features
- T5Gemma model conversion and inference (#1962) by @jordimas
- Support for CUDA 12.8 (#1937, #1940) by @Purfview
- Pure CUDA implementation of conv1d, which makes cuDNN an optional dependency (#1949) by @jordimas
- Add CUDA implementation for median filter (#1917) by @ja2d8a4v
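
For readers unfamiliar with the median filter mentioned above, here is a minimal CPU-only sketch of a 1-D sliding-window median. It is not the CUDA kernel from the pull request. The function name and the choice of reflect padding at the edges are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence


def median_filter_1d(values: Sequence[float], width: int) -> List[float]:
    """Sliding-window median with reflect padding (illustrative sketch)."""
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    pad = width // 2
    if pad > 0 and pad >= len(values):
        raise ValueError("input is too short for this window width")
    # Reflect padding mirrors the signal around each edge
    # without repeating the edge sample itself.
    left = [values[i] for i in range(pad, 0, -1)]
    right = [values[-i - 1] for i in range(1, pad + 1)]
    padded = left + list(values) + right
    # One output per input position: the median of the window centered on it.
    return [median(padded[i:i + width]) for i in range(len(values))]


smoothed = median_filter_1d([1, 5, 2, 8, 3], width=3)
```

With an odd window width the median is always one of the input samples, so the filter removes isolated spikes without introducing new values.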
Fixes and improvements
CTranslate2 4.6.2
New features
Fixes and improvements
CTranslate2 4.6.1
CTranslate2 4.6.0
Note: The CTranslate2 Python package now supports Python 3.13 and drops support for Python 3.8.
New features
- Python 3.13 support (#1858)
- Support returning hidden vector in Wav2Vec2 and Wav2Vec2Bert Models (#1867)
- Add noexecstack linker flags (#1852 + #1861)
- Support Qwen2 (#1820)
- Add Eole converter (#1832)
- Add support for RobertaModel (#1864)
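
The Python version note for this release can be expressed as a simple interpreter check. This is a sketch under one assumption: the supported range is taken to be 3.9 through 3.13, inferred from the note that 3.13 was added and 3.8 was dropped. `is_supported` is a hypothetical helper, not something the package provides.

```python
import sys
from typing import Optional, Sequence

# Assumed range for the 4.6.0 Python package, inferred from the release note.
MIN_SUPPORTED = (3, 9)
MAX_SUPPORTED = (3, 13)


def is_supported(version: Optional[Sequence[int]] = None) -> bool:
    """Return True if (major, minor) falls inside the assumed supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


# Check the interpreter that is running this script.
running_interpreter_ok = is_supported()
```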
Fixes and improvements
CTranslate2 4.5.0
Note: The CTranslate2 Python package now supports cuDNN 9 and is no longer compatible with cuDNN 8.
New features
Fixes and improvements
CTranslate2 4.4.0
Removed: Flash Attention support in the Python package due to significant package size increase with minimal performance gain.
Note: Flash Attention remains supported in the C++ package with the WITH_FLASH_ATTN option.
Flash Attention may be re-added in the future if substantial improvements are made.
New features
- Support Llama3 (#1751)
- Support Gemma2 (#1772)
- Add log probs for all tokens in vocab (#1755)
- Grouped conv1d (#1749 + #1758)
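
As background for the "log probs for all tokens in vocab" item above, log probabilities over a vocabulary are conventionally obtained by applying a log-softmax to the model's output logits. The following is a minimal pure-Python sketch of that computation. It does not use the CTranslate2 API, and the logit values are made up for illustration.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one vector of logits."""
    if not logits:
        raise ValueError("logits must not be empty")
    # Subtracting the maximum avoids overflow in exp() for large logits.
    max_logit = max(logits)
    log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return [x - log_sum for x in logits]


# Hypothetical logits for a three-token vocabulary.
log_probs = log_softmax([2.0, 1.0, 0.1])
best_token_id = max(range(len(log_probs)), key=log_probs.__getitem__)
```

Exponentiating the returned values gives probabilities that sum to 1, which is a quick sanity check when inspecting per-token scores.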