parakeet-server

A self-hosted, OpenAI-compatible speech-to-text API powered by NVIDIA's Parakeet TDT 0.6B v3 model.

Point Whisper-compatible clients at this server and keep everything local.

Why this project

OpenAI Whisper API compatible endpoint: POST /v1/audio/transcriptions
Automatic model download and local caching on first use
Multiple response formats: json, text, srt, vtt, verbose_json
Built-in browser UI at http://localhost:8000
Docker-friendly deployment and Rust-native runtime

Quick start

Option 1: Run from GHCR image

docker pull ghcr.io/chand1012/parakeet-server:latest
docker run --rm -it \
  -p 8000:8000 \
  -v "$(pwd)/models:/home/parakeet/models" \
  ghcr.io/chand1012/parakeet-server:latest

Option 2: Build and run locally

Prerequisites: Rust 1.93+, cmake, protobuf-compiler, pkg-config, OpenSSL dev libs.

cargo build --release
./target/release/parakeet-server

Server starts on http://localhost:8000.

Docker Compose (modern)

name: parakeet

services:
  app:
    image: ghcr.io/chand1012/parakeet-server:latest
    pull_policy: always
    init: true
    container_name: parakeet-server
    restart: unless-stopped
    ports:
      - "8000:8000"
    environment:
      ROCKET_PORT: "8000"
      RUST_LOG: info
    volumes:
      - ./models:/home/parakeet/models

Then run:

docker compose up -d

API usage

Basic transcription

curl http://localhost:8000/v1/audio/transcriptions \
  -F file=@audio.mp3 \
  -F model=whisper-1

Optional form fields

Field	Type	Required	Notes
`file`	file	Yes	Audio upload (max 100 MB)
`model`	string	No	Default: `whisper-1`
`response_format`	string	No	`json` (default), `text`, `srt`, `vtt`, `verbose_json`
`language`	string	No	Default: `en`

Response examples

json:

{ "text": "Hello, world." }

verbose_json:

{
  "task": "transcribe",
  "language": "en",
  "duration": 1.5,
  "text": "Hello, world.",
  "segments": [
    { "id": 0, "seek": 0, "start": 0.0, "end": 1.5, "text": "Hello, world." }
  ]
}

srt:

1
00:00:00,000 --> 00:00:01,500
Hello, world.

OpenAI client compatibility

Works with standard OpenAI SDKs by overriding base_url:

from openai import OpenAI

client = OpenAI(
    api_key="not-needed",
    base_url="http://localhost:8000/v1",
)

with open("audio.mp3", "rb") as f:
    result = client.audio.transcriptions.create(
        model="whisper-1",
        file=f,
    )

print(result.text)

Browser UI

Open http://localhost:8000 and upload an audio file to test transcription in the built-in web interface.

Model and processing details

Model: NVIDIA Parakeet TDT 0.6B v3 (INT8)
Runtime: ONNX Runtime via transcribe-rs
Audio decoding: Symphonia and ffmpeg fallback path
Target sample rate: 16 kHz mono
Model cache path: models/parakeet-tdt-0.6b-v3-int8/

On first use, the model archive is fetched automatically from:

https://blob.handy.computer/parakeet-v3-int8.tar.gz

Development

# build
cargo build
cargo build --release

# test
cargo test
cargo test <test_function_name>

# lint / format
cargo fmt --check
cargo fmt
cargo clippy

There is also a helper script for manual endpoint testing:

bash test_transcribe.sh path/to/audio.mp3

Benchmarking transcription performance

Benchmarks throughput using the OpenSLR SLR81 test set with configurable parallel workers:

# Download test audio samples (~2.5GB)
./download_samples.sh

# Run benchmark (default: 4 parallel workers)
parakeet-bench

# Custom parallelism
parakeet-bench -j 8

Reports files processed, average duration, and throughput (files/second).

License

MIT. See LICENSE.

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
.github/workflows		.github/workflows
.vscode		.vscode
src		src
static		static
.dockerignore		.dockerignore
.gitignore		.gitignore
AGENTS.md		AGENTS.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
Rocket.toml		Rocket.toml
download_samples.sh		download_samples.sh
run.sh		run.sh
test_transcribe.sh		test_transcribe.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

parakeet-server

Why this project

Quick start

Option 1: Run from GHCR image

Option 2: Build and run locally

Docker Compose (modern)

API usage

Basic transcription

Optional form fields

Response examples

OpenAI client compatibility

Browser UI

Model and processing details

Development

Benchmarking transcription performance

License

About

Uh oh!

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

parakeet-server

Why this project

Quick start

Option 1: Run from GHCR image

Option 2: Build and run locally

Docker Compose (modern)

API usage

Basic transcription

Optional form fields

Response examples

OpenAI client compatibility

Browser UI

Model and processing details

Development

Benchmarking transcription performance

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages