Multi-model

The headline feature: send a single prompt to N models and get all results in one structured response. No more juggling Replicate, OpenAI, Runway, ElevenLabs, and a dozen other dashboards to find which model handles your use case best. Rimp spans 8 modalities behind one API key and one credit balance:

Modality	Endpoint	Mode	What it does
Image	`POST /v1/images`	sync `200`	Text-to-image and image-to-image
Video	`POST /v1/videos`	async `202`	Text-to-video and image-to-video
Voice	`POST /v1/voice`	sync `200`	Text-to-speech, voice cloning, transcription
Music	`POST /v1/music`	async `202`	Full songs, with or without vocals
Avatar	`POST /v1/avatars`	async `202`	Talking avatars and lipsync
Chat	`POST /v1/chat`	sync `200`	LLM chat completions
3D	`POST /v1/three-d`	async `202`	Text-to-3D and image-to-3D (GLB meshes)
Upscale	`POST /v1/upscale`	async `202`	Image upscaling and face restoration

Costs below are in credits, Rimp’s universal currency (≈ 1,000 credits per $1 of underlying provider cost). The figures are the base cost per unit, exactly as shown in Studio; the amount debited from your wallet is the base cost multiplied by your plan’s margin multiplier (Free ×1, Pro and Studio ×1.6, Team ×1.3). Call GET /v1/models for the live catalog. In the tables, cr = credits.

Models tagged Studio+ require a Studio or Team plan. Calling them on Free or Pro returns 402 plan_upgrade_required.
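As a rough illustration of the two rules above, the sketch below estimates a wallet debit from a base cost and checks whether a plan may call a Studio+ model. The lowercase plan keys, the function names, and the absence of rounding are assumptions made for this example; the multipliers and the Studio+ rule come from the text.

```typescript
// Plan keys are hypothetical identifiers; the multipliers are the ones quoted above.
type Plan = 'free' | 'pro' | 'studio' | 'team';

const MARGIN_MULTIPLIER: Record<Plan, number> = {
  free: 1,
  pro: 1.6,
  studio: 1.6,
  team: 1.3,
};

// Assumption: debit = base cost × multiplier, with no rounding applied.
function estimateWalletDebit(baseCredits: number, plan: Plan): number {
  return baseCredits * MARGIN_MULTIPLIER[plan];
}

// Studio+ models need a Studio or Team plan; Free and Pro get 402 plan_upgrade_required.
function canCallModel(plan: Plan, studioPlus: boolean): boolean {
  return !studioPlus || plan === 'studio' || plan === 'team';
}

// flux-pro is listed at 40 cr / image, so on Pro the estimate is 40 × 1.6 credits.
console.log(estimateWalletDebit(40, 'pro'));
// imagen-4-ultra is tagged Studio+, so a Pro caller would be rejected.
console.log(canCallModel('pro', true)); // false
```

Running a check like this before a request avoids a round trip that would end in a 402, but the API remains the source of truth for both pricing and plan gating.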

Image models

Slug	Provider	Tier	Credits	Capabilities
`flux-2-klein-9b`	Black Forest Labs	fast	3 cr / image	text-to-image
`seedream-5-lite`	ByteDance	fast	20 cr / image	text-to-image, image-to-image
`openai-image`	OpenAI	standard	40 cr / image	text-to-image, image-to-image
`nano-banana-pro`	Google (Gemini)	standard	40 cr / image	text-to-image, image-to-image
`qwen-image-2-pro`	Qwen	pro	40 cr / image	text-to-image
`flux-pro`	Black Forest Labs	pro	40 cr / image	text-to-image
`flux-pro-i2i`	Black Forest Labs	pro	50 cr / image	image-to-image
`imagen-4`	Google	pro	50 cr / image	text-to-image
`recraft-v4-pro`	Recraft	pro	50 cr / image	text-to-image (design + SVG)
`flux-2-pro`	Black Forest Labs	pro	60 cr / image	text-to-image, image-to-image
`ideogram-v2-turbo`	Ideogram	fast	50 cr / image	text-to-image (typography)
`imagen-4-ultra`	Google	pro	80 cr / image	text-to-image — Studio+
`ideogram-v2`	Ideogram	standard	80 cr / image	text-to-image (typography)
`flux-2-max`	Black Forest Labs	pro	100 cr / image	text-to-image — Studio+
`openai-gpt-image-2`	OpenAI	pro	210 cr / image	text-to-image, image-to-image — Studio+

Video models

Billed per second unless noted. Image-to-video models accept an image_url.

Slug	Provider	Tier	Credits	Capabilities
`runway-gen4-turbo`	Runway	fast	50 cr / sec	text-to-video, image-to-video
`kling-v3`	Kling	standard	100 cr / sec	text-to-video, image-to-video
`seedance-2-0`	ByteDance	standard	120 cr / sec	text-to-video (audio sync)
`kling-v3-omni`	Kling	pro	280 cr / sec	text-to-video, image-to-video (audio + 4K) — Studio+
`runway-gen4-5`	Runway	pro	300 cr / sec	text-to-video, image-to-video — Studio+
`openai-sora-2`	OpenAI	standard	300 cr / sec	text-to-video (synced audio) — Studio+
`luma-ray2`	Luma	standard	320 cr / Mpx	text-to-video, image-to-video — Studio+
`veo-3-1-fast`	Google Veo	fast	400 cr / sec	text-to-video, image-to-video
`openai-sora-2-pro`	OpenAI	pro	500 cr / sec	text-to-video, image-to-video (synced audio) — Studio+
`veo-3-1-standard`	Google Veo	standard	750 cr / sec	text-to-video, image-to-video — Studio+
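Because most video models are billed per second, the base cost of a clip can be estimated from its duration. The sketch below uses three rates copied from the table; the function name and the validation are illustrative assumptions, and models billed in other units (such as luma-ray2, priced per Mpx) are deliberately left out.

```typescript
// Per-second base rates (credits per second) copied from the table above.
const VIDEO_RATE_PER_SEC = new Map<string, number>([
  ['runway-gen4-turbo', 50],
  ['kling-v3', 100],
  ['seedance-2-0', 120],
]);

// Hypothetical helper: base credits for a clip, before the plan margin multiplier.
function estimateVideoBaseCredits(slug: string, seconds: number): number {
  const rate = VIDEO_RATE_PER_SEC.get(slug);
  if (rate === undefined) {
    throw new Error(`no per-second rate known for ${slug}`);
  }
  if (seconds <= 0) {
    throw new Error('seconds must be positive');
  }
  return rate * seconds;
}

// A 5-second kling-v3 clip: 100 cr / sec × 5 sec.
console.log(estimateVideoBaseCredits('kling-v3', 5)); // 500
```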

Voice models

Slug	Provider	Tier	Credits	Capabilities
`inworld-realtime-tts-2`	Inworld	fast	20 cr / 1k chars	text-to-voice
`minimax-speech-2-8-hd`	MiniMax	pro	30 cr / 1k chars	text-to-voice, voice cloning
`elevenlabs-scribe-v2`	ElevenLabs	pro	12 cr / min	transcription

Music models

Slug	Provider	Tier	Credits	Capabilities
`minimax-music-2-8`	MiniMax	standard	100 cr / song	text-to-music (vocals)
`lyria-3-pro`	Google	pro	120 cr / song	text-to-music (up to 3 min)

Avatar models

Billed per second of output. Two request shapes: text-to-avatar (text + optional avatar_id/voice_id) or lipsync (video_url + audio_url).

Slug	Provider	Tier	Credits	Capabilities
`kling-lip-sync`	Kling	standard	40 cr / sec	lipsync
`sync-lipsync-2-pro`	Sync Labs	pro	50 cr / sec	lipsync
`heygen-avatar-v`	HeyGen	standard	67 cr / sec	lipsync
`heygen-avatar-iv`	HeyGen	standard	67 cr / sec	lipsync — Studio+
`kling-avatar-v2`	Kling	pro	120 cr / sec	lipsync (realistic + cartoon) — Studio+

3D models

Output is a GLB mesh URL. Billed per mesh.

Slug	Provider	Tier	Credits	Capabilities
`trellis-3d`	TRELLIS	fast	33 cr / model	image-to-3D
`hunyuan-3d-3-1`	Tencent	standard	400 cr / model	text-to-3D, image-to-3D
`hyper3d-rodin`	Rodin	pro	400 cr / model	text-to-3D, image-to-3D, rigged characters — Studio+
`trellis-2-pro`	TRELLIS	pro	820 cr / model	image-to-3D (PBR materials) — Studio+

Upscale models

Slug	Provider	Tier	Credits	Capabilities
`real-esrgan`	Real-ESRGAN	fast	2 cr / image	image upscale, face restore
`gfpgan-face-upscale`	GFPGAN	standard	4 cr / image	face restore
`recraft-crisp-upscale`	Recraft	standard	4 cr / image	image upscale (commercial-safe)
`topaz-image-upscale`	Topaz	pro	3 cr / Mpx	image upscale, face restore — Studio+
`clarity-upscaler`	Clarity	pro	17 cr / image	image upscale (Magnific-like)

Chat models

Billed per output token. The figures below are credits per 1M output tokens; a single turn typically uses far fewer (the default cap is 1,024 tokens). Credits are charged up front based on max_tokens, then reconciled with actual usage on completion.

Slug	Provider	Tier	Credits / 1M tokens	Capabilities
`deepseek-v3`	DeepSeek	fast	300	text
`meta-llama-4-scout`	Meta	fast	350	text, multimodal (10M context)
`meta-llama-4-maverick`	Meta	standard	500	text, multimodal
`openai-gpt-4o-mini`	OpenAI	fast	600	text, tool use
`deepseek-v4-pro`	DeepSeek	pro	800	text, tool use
`qwen-3-max`	Qwen	pro	1,200	text, tool use
`anthropic-claude-haiku-4-5`	Anthropic	fast	1,250	text, tool use
`openai-gpt-5-mini`	OpenAI	fast	2,000	text, tool use, multimodal
`mistral-large-3`	Mistral	standard	3,000	text, tool use — Studio+
`gemini-3-pro`	Google	pro	7,500	text, tool use, multimodal — Studio+
`openai-gpt-4o`	OpenAI	pro	10,000	text, tool use, multimodal — Studio+
`anthropic-claude-sonnet-4-5`	Anthropic	standard	15,000	text, tool use, multimodal — Studio+
`xai-grok-4`	xAI	pro	15,000	text, tool use — Studio+
`openai-gpt-5`	OpenAI	pro	30,000	text, tool use, multimodal — Studio+
`anthropic-claude-opus-4-7`	Anthropic	pro	75,000	text, tool use, multimodal — Studio+
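To make the per-1M-token pricing concrete, here is a small worked example of the charge-then-reconcile flow described for chat models. The function names and the returned shape are assumptions for illustration; only the rate (openai-gpt-4o at 10,000 credits per 1M output tokens) and the 1,024-token default cap come from this page.

```typescript
const TOKENS_PER_PRICING_UNIT = 1000000;

// Base credits for a number of output tokens at a given per-1M rate.
function creditsForTokens(creditsPerMillion: number, tokens: number): number {
  return (creditsPerMillion * tokens) / TOKENS_PER_PRICING_UNIT;
}

// Hypothetical model of the flow: charge on max_tokens, then settle on actual usage.
function reconcileChatCharge(creditsPerMillion: number, maxTokens: number, usedTokens: number) {
  const reserved = creditsForTokens(creditsPerMillion, maxTokens);
  const charged = creditsForTokens(creditsPerMillion, Math.min(usedTokens, maxTokens));
  return { reserved, charged, refunded: reserved - charged };
}

// openai-gpt-4o with the default 1,024-token cap and a 300-token reply:
// roughly 10.24 credits charged up front, 3 kept, the rest returned.
console.log(reconcileChatCharge(10000, 1024, 300));
```

These are base credits; the plan margin multiplier applies on top.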

Comparison API

Send one prompt to several models at once. Works across any models that share a modality.

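// `client` is an already-initialized Rimp SDK client.
const cmp = await client.comparisons.create({
  prompt: 'cinematic portrait of a fisherman at golden hour, 35mm film',
  // All models in one comparison must share a modality (image here).
  models: ['flux-pro', 'imagen-4', 'ideogram-v2'],
  // Parameters applied to every model in the comparison.
  params_shared: { aspect_ratio: '1:1' },
});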

Atomic credit reservation

Rimp sums the estimated cost of all N models and reserves it in one transaction. If your wallet can’t cover the sum, you get 402 Insufficient credits and nothing is charged — not even the models that would have fit.
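The all-or-nothing behaviour can be modelled locally as below. This is a minimal sketch of the rule as described, not Rimp's implementation: the types, the function name, and the result shape are hypothetical, and the per-model figures are the base costs from the image table.

```typescript
interface CostEstimate {
  model: string;
  credits: number;
}

// Hypothetical result shape; only the 402 status and message come from the docs.
type ReservationResult =
  | { ok: true; reserved: number; balanceAfter: number }
  | { ok: false; status: 402; error: string; balanceAfter: number };

// Reserve the sum of all estimates in one step, or reserve nothing at all.
function reserveAllOrNothing(balance: number, estimates: CostEstimate[]): ReservationResult {
  const total = estimates.reduce((sum, e) => sum + e.credits, 0);
  if (total > balance) {
    // No partial reservation: the balance is left untouched.
    return { ok: false, status: 402, error: 'Insufficient credits', balanceAfter: balance };
  }
  return { ok: true, reserved: total, balanceAfter: balance - total };
}

const imageEstimates: CostEstimate[] = [
  { model: 'flux-pro', credits: 40 },
  { model: 'imagen-4', credits: 50 },
  { model: 'ideogram-v2', credits: 80 },
];

// 150 credits would cover flux-pro and imagen-4, but the sum is 170,
// so the whole comparison is rejected and nothing is reserved.
console.log(reserveAllOrNothing(150, imageEstimates));
```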

Plan limits

The number of models per comparison is capped by your plan:

Plan	Max models per comparison
Free	2
Pro	3
Studio	6
Team	12
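A client can check the cap before sending a request. The sketch below encodes the table above; the lowercase plan keys and the helper name are assumptions for illustration, and how the API responds to an over-limit request is not covered here.

```typescript
// Caps copied from the table above; plan keys are hypothetical identifiers.
const MAX_MODELS_PER_COMPARISON = { free: 2, pro: 3, studio: 6, team: 12 } as const;

type PlanName = keyof typeof MAX_MODELS_PER_COMPARISON;

// True when the requested model list fits within the plan's cap.
function fitsPlanLimit(plan: PlanName, models: string[]): boolean {
  return models.length <= MAX_MODELS_PER_COMPARISON[plan];
}

const requested = ['flux-pro', 'imagen-4', 'ideogram-v2'];
console.log(fitsPlanLimit('free', requested)); // false
console.log(fitsPlanLimit('pro', requested)); // true
```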

Polling vs webhooks

A comparison returns 202 with a parent comparison object. There are two ways to wait for results: polling, shown here with the SDK’s waitFor helper, or webhooks.

const result = await client.comparisons.waitFor(cmp.id);
// resolves when all child generations reach a terminal state
result.generations.forEach((g) => console.log(g.model, g.outputs[0]?.url));

Fetch the full result any time with GET /v1/comparisons/{id} — it returns the parent plus every child generation (status, charged credits, and signed output URLs).
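If you poll GET /v1/comparisons/{id} yourself instead of using the SDK helper, the loop might look like the sketch below. The status strings, the snapshot shape, and the interval and attempt defaults are assumptions, not documented values; the fetcher is passed in so the example makes no claim about the base URL or authentication.

```typescript
// Hypothetical shapes: the real response may name fields and statuses differently.
interface ChildGeneration {
  model: string;
  status: string;
}

interface ComparisonSnapshot {
  id: string;
  generations: ChildGeneration[];
}

// Assumed terminal statuses; check the API reference for the real values.
const TERMINAL_STATUSES = new Set(['succeeded', 'failed', 'canceled']);

// True once every child generation has reached a terminal state.
function allTerminal(snapshot: ComparisonSnapshot): boolean {
  return (
    snapshot.generations.length > 0 &&
    snapshot.generations.every((g) => TERMINAL_STATUSES.has(g.status))
  );
}

// Calls fetchSnapshot (e.g. a wrapper around GET /v1/comparisons/{id})
// at a fixed interval until all children are terminal or attempts run out.
async function pollComparison(
  fetchSnapshot: () => Promise<ComparisonSnapshot>,
  intervalMs: number = 2000,
  maxAttempts: number = 60,
): Promise<ComparisonSnapshot> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const snapshot = await fetchSnapshot();
    if (allTerminal(snapshot)) {
      return snapshot;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('comparison did not reach a terminal state in time');
}
```

A fixed interval keeps the sketch short; a production client would more likely back off between attempts.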

Capability matrix

Not every model supports every operation. Check the capabilities array on each model (visible in the tables above and in GET /v1/models):

{
  "slug": "flux-pro-i2i",
  "modality": "image",
  "capabilities": ["image_to_image"]
}

Common capabilities: text_to_image, image_to_image, text_to_video, image_to_video, text_to_voice, voice_clone, transcription, text_to_music, lipsync, text_to_text, tool_use, multimodal_input, text_to_3d, image_to_3d, rigged_3d, image_upscale, face_restore.
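One way to use the capabilities array is to filter a catalog before building a comparison. The sketch below assumes each catalog entry has the same shape as the JSON example above; the sample entries are derived from the image table, and the helper name is hypothetical.

```typescript
// Assumed entry shape, mirroring the JSON example above.
interface ModelEntry {
  slug: string;
  modality: string;
  capabilities: string[];
}

// Slugs of models in one modality that support a given capability.
function modelsWithCapability(
  catalog: ModelEntry[],
  modality: string,
  capability: string,
): string[] {
  return catalog
    .filter((m) => m.modality === modality && m.capabilities.indexOf(capability) !== -1)
    .map((m) => m.slug);
}

// Sample entries based on the image table; a real catalog comes from GET /v1/models.
const sampleCatalog: ModelEntry[] = [
  { slug: 'flux-pro', modality: 'image', capabilities: ['text_to_image'] },
  { slug: 'flux-pro-i2i', modality: 'image', capabilities: ['image_to_image'] },
  { slug: 'flux-2-pro', modality: 'image', capabilities: ['text_to_image', 'image_to_image'] },
];

console.log(modelsWithCapability(sampleCatalog, 'image', 'image_to_image'));
// [ 'flux-pro-i2i', 'flux-2-pro' ]
```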
