

turian / insanely-fast-whisper-with-video
whisper-large-v3, incredibly fast, with video transcription
27.1M runs


prunaai / p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, check out p-image here: https://replicate.com/prunaai/p-image
10.1M runs


jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
76.5M runs


andreasjansson / clip-features
Return CLIP features for the clip-vit-large-patch14 model
136M runs

google / gemini-3-flash
Google's most intelligent model built for speed with frontier intelligence, superior search, and grounding
80.8K runs

qwen / qwen3-tts
A unified Text-to-Speech demo featuring three powerful modes: Voice, Clone and Design
7.3K runs

Very fast image generation and editing model. 4 steps distilled, sub-second inference for production and near real-time applications.
1.4M runs
bytedance / seedance-1.5-pro
A joint audio-video model that accurately follows complex instructions.
232.2K runs

An enhanced version of Qwen-Image-Edit-2509, featuring multiple improvements including notably better consistency
483.4K runs

openai / gpt-image-1.5
OpenAI's latest image generation model with better instruction following and adherence to prompts
1.8M runs
philz1337x / crystal-video-upscaler
High-precision video upscaler optimized for portraits, faces and products. One of the upscale modes powered by Clarity AI. X: https://x.com/philz1337x
831 runs

openai / gpt-5.2
The best model for coding and agentic tasks across industries
179.7K runs

bytedance / seedream-4.5
Seedream 4.5: Upgraded Bytedance image model with stronger spatial understanding and world knowledge
2.3M runs

prunaai / z-image-turbo
Z-Image Turbo is a super fast text-to-image model of 6B parameters developed by Tongyi-MAI.
15.2M runs

Google's state of the art image generation and editing model 🍌🍌
10.9M runs
New and improved version of Veo 3, with higher-fidelity video, context-aware audio, reference image and last frame support
314K runs
Official models are always on, maintained, and have predictable pricing.

4-step distilled version of FLUX.2 [klein]. A foundation model for maximum flexibility and control

Un-distilled version of FLUX.2 [klein]. A foundation model for maximum flexibility and control

Un-distilled version of FLUX.2 [klein]. Optimized for fine-tuning, customization, and post-training workflows

Most powerful iteration of Riverflow model from Sourceful, ideal for brand asset generation

Main version of Riverflow Image Model from Sourceful, ideal for brand design

Fast version of Sourceful Riverflow image generation model, ideal for brand assets
The first open source audio-video model
Kling 2.6 Pro: Top-tier image-to-video with cinematic visuals, fluid motion, and native audio generation

Qwen Image 2512 is an improved version of Qwen Image with more realistic human generation, finer textures, and stronger text rendering
Enables precise control of character actions and expressions from a reference image.

Use AI to generate images & photos with an API
Use AI to caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale images with super resolution with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text via API
Use AI for optical character recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models: advanced image generation & editing via API
Use AI to restore images via API
Use AI to enhance videos via API
Detect NSFW content in images and text
Classify text by sentiment, topic, intent, or safety
Identify speakers from audio and video inputs
Replace faces across images with natural-looking results.
Transform rough sketches into polished visuals
Generate custom emojis from text or images
Create anime-style characters, scenes, and animations
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Use AI to generate videos from images with an API
Use AI to generate lipsync videos with an API
Use AI to create 3D content with an API
Chat with images for understanding, captioning & detection via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI to edit your videos with an API
Use AI object detection and segmentation models to distinguish objects in images & videos
Official AI models: Always available, stable, and predictably priced
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.
WAN family of models: powerful image-to-video & text-to-video models
Use AI to caption images with an API


vandalwraith / aunt01
15 runs

black-forest-labs / flux-2-klein-9b-base-lora
A version of FLUX.2 [klein] 9B-base that supports fast fine-tuned lora inference
32 runs

black-forest-labs / flux-2-klein-4b-base-lora
A version of FLUX.2 [klein] 4B-base that supports fast fine-tuned lora inference
26 runs

lightricks / audio-to-video
Use audio input with an image or prompt to generate videos
101 runs

moonshotai / kimi-k2.5
Moonshot AI's latest open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution into one model
171 runs


prunaai / z-image
An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer
1.5K runs


ton731 / floorplan-recognition
Segments the walls, doors, windows, and kitchen in a floor plan image
2 runs

google / gemini-3-flash
Google's most intelligent model built for speed with frontier intelligence, superior search, and grounding
80.8K runs


vicckuo / generate-vic
6 runs


sundai-club / musicgen-eulogy
MusicGen fine-tuned on Eulogy from Stranger Things
38 runs

qwen / qwen3-tts
A unified Text-to-Speech demo featuring three powerful modes: Voice, Clone and Design
7.3K runs


waynedev20 / didi
41 runs