

andreasjansson / clip-features
Return CLIP features for the clip-vit-large-patch14 model
143.6M runs


prunaai / p-image-edit
A sub-1-second, $0.01 multi-image editing model built for production use cases. For image generation, see p-image: https://replicate.com/prunaai/p-image
19.1M runs


vaibhavs10 / incredibly-fast-whisper
whisper-large-v3, incredibly fast, powered by Hugging Face Transformers! 🤗
28.1M runs


jaaari / kokoro-82m
Kokoro v1.0 - text-to-speech (82M params, based on StyleTTS2)
82.5M runs

inworld/tts-1.5-max
Highest-quality text-to-speech with <200ms latency, emotion control, and 15-language support
537 runs

bytedance/seedream-5-lite
Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge
114.3K runs
runwayml/gen-4.5
State-of-the-art video motion quality, prompt adherence and visual fidelity
26.4K runs

recraft-ai/recraft-v4
Recraft's latest image generation model, built around design taste. Strong prompt accuracy, art-directed composition, and integrated text rendering. Fast and cost-efficient at standard resolution.
25.2K runs
Generate videos using xAI's Grok Imagine Video model
155.4K runs

Moonshot AI's latest open model. It unifies vision and text, thinking and non-thinking modes, and single-agent and multi-agent execution into one model
20.6K runs

Google's most intelligent model built for speed with frontier intelligence, superior search, and grounding
504.4K runs
prunaai/p-video
Fast video generation with built-in draft mode for rapid creative iteration. Text-to-video, image-to-video, and audio-to-video in a single endpoint.
258.6K runs

Very fast image generation and editing model. 4 steps distilled, sub-second inference for production and near real-time applications.
6.5M runs

openai/gpt-image-1.5
OpenAI's latest image generation model with better instruction following and adherence to prompts
5.1M runs

Google's fast image generation model with conversational editing, multi-image fusion, and character consistency
3.1M runs

Compose a song from a prompt or a composition plan
10.7K runs
Official models are always on, maintained, and have predictable pricing.

Ultra-fast, cost-efficient text-to-speech with ~120ms latency and 15-language support

Highest-quality text-to-speech with <200ms latency, emotion control, and 15-language support
High-fidelity video generation with portrait support, audio-to-video, retake, and extend. Text, image, and audio-driven creation up to 4K at 50 FPS.

OpenAI's most capable frontier model for complex professional work, coding, and multi-step reasoning.
Lightning-fast video generation with portrait support, camera controls, and synchronized audio. Up to 20 seconds at 1080p, 4K at 50 FPS.
Kling 3.0 motion control: transfer motion from a reference video to any character image with improved consistency and quality.
Fast video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.

The pro version of Qwen Image 2 from Alibaba's Qwen team. Enhanced text rendering, realism, and semantic adherence for high-quality image generation and editing.

A next-generation image generation and editing model from Alibaba's Qwen team. Supports text-to-image and image editing with strong text rendering, especially for Chinese.
Create realistic talking avatar videos from text with HeyGen's Avatar IV engine

Generate full-length songs with vocals, lyrics, and rich instrumentation from a text prompt
High-fidelity video generation with text-to-video, image-to-video, and start-end-to-video modes. Up to 16 seconds at 1080p with synchronized audio.
Turn a text prompt into a complete, polished video with AI-generated script, avatar presenter, voiceover, visuals, and editing.

Seedream 5.0 lite: image generation with built-in reasoning, example-based editing, and deep domain knowledge

Google's most intelligent model, with improved reasoning and a new medium thinking level
State-of-the-art video motion quality, prompt adherence and visual fidelity
Generate detailed SVG vector graphics from text prompts. Recraft V4 Pro's design taste with more geometric detail and finer paths — clean layers, editable output, and scalable to any size.
Generate production-ready SVG vector images from text prompts. Recraft V4's design taste applied to vector output — clean geometry, structured layers, and editable paths.

Recraft's latest image generation model at ~2048px resolution. Same design taste and prompt accuracy as V4, with higher resolution for print-ready and large-scale work.

Recraft's latest image generation model, built around design taste. Strong prompt accuracy, art-directed composition, and integrated text rendering. Fast and cost-efficient at standard resolution.
Use AI to generate images & photos with an API
Use AI to caption videos with an API
Use AI for text-to-speech or to clone your voice via API
Use AI to generate images from a face with an API
Use AI to generate videos with an API
Use AI to upscale images with super resolution with an API
Use AI to generate music with an API
Use AI to edit any image via API
Use AI to transcribe speech to text via API
Use AI for optical character recognition (OCR) to extract text from images via API
Use AI to remove backgrounds from images and videos with an API
FLUX AI models: advanced image generation & editing via API
Use AI to restore images via API
Use AI to enhance videos via API
Detect NSFW content in images and text
Classify text by sentiment, topic, intent, or safety
Identify speakers from audio and video inputs
Replace faces across images with natural-looking results.
Transform rough sketches into polished visuals
Generate custom emojis from text or images
Create anime-style characters, scenes, and animations
Official models are always on, predictably priced, and have a stable API.
Explore Large Language Models (LLMs) for chat, generation & NLP tasks via API
Try AI Models for free: video generation, image generation, upscaling, and photo restoration
Use AI to generate videos from images with an API
Use AI to generate lipsync videos with an API
Use AI to create 3D content with an API
Chat with images for understanding, captioning & detection via API
Use AI to control image generation with an API
Embedding models for AI search and analysis
Use AI to edit your videos with an API
Use AI object detection and segmentation models to distinguish objects in images & videos
Flux fine-tunes: build and run custom AI image models via API
Kontext fine-tunes: Build custom AI image models with an API
Create songs with voice cloning models via API
AI media utilities: auto-caption, watermark, frame extraction & more via API
Browse the diverse range of qwen-image fine-tunes the community has custom-trained on Replicate.
WAN family of models: powerful image-to-video & text-to-video models
Use AI to caption images with an API


deborahwon / pisces-rising-style
Arise V2
53 runs


esadorleans / moon_tvey
51 runs


jide / nano-banana-2-bg-remove
Remove backgrounds with real alpha transparency using Nano Banana 2. Triangulation matting produces clean edges, proper semi-transparency, and accurate colors — superior to traditional background removal. Optionally specify what to isolate.
24 runs


jide / nano-banana-2-transparent
Nano Banana 2 with alpha transparency. Generates images with real RGBA transparency using triangulation matting — clean edges, proper semi-transparency, and accurate colors. Powered by Google Gemini 3.1 Flash Image.
32 runs

inworld / tts-1.5-mini
Ultra-fast, cost-efficient text-to-speech with ~120ms latency and 15-language support
267 runs

inworld / tts-1.5-max
Highest-quality text-to-speech with <200ms latency, emotion control, and 15-language support
537 runs


tarasotherstuff-arch / windows-trees
14 runs


lundymatthew-wq / logo-overlay
Generate 6 logo variants at 4 sizes, plus single overlay and watermark compositing
14 runs


ultracoderru / nova-anime-xl-17
Nova Anime XL (Illustrious) v17.0
1.7K runs

resilientcoders / 90s-photographs-bjork
'90s photographs in the style of Bjork's promo material
7 runs

lightricks / ltx-2.3-pro
High-fidelity video generation with portrait support, audio-to-video, retake, and extend. Text, image, and audio-driven creation up to 4K at 50 FPS.
1.5K runs

resilientcoders / pokemon-trainers
Creates Pokemon trainers
11 runs