Rate limits

1. Understanding Rate Limits

Rate limits are the rules that govern how often your API requests may reach our platform services within a given time window. They serve four goals:

Prevent API abuse and misuse: By setting an upper bound on request frequency, we limit abnormal or unintended traffic.
Ensure fair resource allocation: We prevent a small number of users from monopolizing system resources, so everyone can access the service on equal terms.
Maintain API performance and reliability: By smoothing out traffic spikes, we stabilize response times and reduce failure rates due to overload, improving overall service quality.
Protect system stability: Rate limits help absorb sudden bursts of traffic, preventing high-concurrency spikes from overwhelming our infrastructure.

2. Rate Limiting Metrics

For our image and video models, we use two primary rate-limiting metrics:

IPM (Images Per Minute): The number of images an image model is allowed to generate per minute.
RPM (Requests Per Minute): The number of API requests a video model is allowed to handle per minute.

3. Default Rate Limits

Default limits differ from model to model and reflect how much compute each model consumes.

IPM

The IPM (Images Per Minute) limit controls the number of images that can be generated per minute. The default IPM values for each model are listed in the table below.

Resource/Service	Model API	Default Limit (IPM)
Text to Image	`txt2img_v3`	20
Image to Image	`img2img_v3`	10
Remove Background	`remove_background`	10
Replace Background	`replace_background`	10
Remove Text	`remove_text`	10
Inpainting	`inpainting`	10
Cleanup	`cleanup`	10
Merge Face	`merge_face`	10
FLUX.1 [schnell] Text to Image	`flux-1-schnell`	10
Upscale	`upscale_v3`	20

RPM

The RPM (Requests Per Minute) limit controls the number of API requests allowed per minute. The default RPM values for each model are provided in the table below.

Resource/Service	Model API	Default Limit (RPM)
Video Merge Face	`video_merge_face`	10
Text to Video	`txt2video`	2
Image to Video	`img2video`	2
Wan 2.1 Text to Video	`wan_txt_to_video`	20
Wan 2.1 Image to Video	`wan i2v`	20
Hunyuan Video Fast	`hunyuan_video_fast`	20
KLING V1.6 Image to Video	`Kling i2v`	20
KLING V1.6 Text to Video	`Kling t2v`	20
Minimax Video-01	`Minimax`	20

4. Handling Rate Limits

How to Monitor Rate Limits?

If your API requests exceed the allowed rate, the API will return:

HTTP Status Code: 429 Too Many Requests
Response Body: A message indicating that the rate limit has been exceeded.
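As a minimal sketch of detecting this response in client code, the helper below uses only the Python standard library and reports a 429 as a normal return value instead of an exception. The function name `post_json`, the injectable `send` parameter, and any URL you pass in are illustrative assumptions, not part of the platform's API; the exact response body format is also not specified here, so the body is returned as raw bytes.

```python
import urllib.error
import urllib.request


def post_json(url, payload, headers, send=urllib.request.urlopen):
    """POST `payload` (bytes) and return (status_code, body_bytes).

    A 429 response is returned to the caller rather than raised, so the
    caller can decide whether to wait and retry. Other HTTP errors propagate.
    `send` is injectable (hypothetical design choice) to allow testing
    without network access.
    """
    request = urllib.request.Request(
        url, data=payload, headers=headers, method="POST"
    )
    try:
        with send(request) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        if err.code == 429:
            # Rate limit exceeded: surface the status and message body.
            return 429, err.read()
        raise
```

A caller can then branch on `status == 429` and log the body, which carries the rate-limit message.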

Best Practices

To avoid triggering rate limits, you can:

Implement client-side request throttling: Respect the platform’s rate limits by controlling your application’s request rate, ensuring you don’t send too many calls in a short period.
Use exponential backoff on retries: When you receive a rate-limit response (e.g., HTTP 429), wait progressively longer between retry attempts instead of retrying immediately, reducing load on the service.
Monitor your API usage: Continuously track and log your request counts, frequencies, and any error responses so you can adjust your usage patterns proactively.
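Client-side throttling can be sketched with a sliding-window limiter that allows at most N calls per 60 seconds and blocks until the next call fits. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and the sketch assumes a single-threaded client; how the platform itself measures the window is not documented here, so treat this as a conservative approximation rather than a mirror of the server's logic.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `period`-second window."""

    def __init__(self, max_calls, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # timestamps of recent calls, oldest first

    def acquire(self):
        """Block until one more call fits in the window, then record it."""
        while True:
            now = self._clock()
            # Discard timestamps that have aged out of the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return
            # Window is full: wait until the oldest call expires.
            self._sleep(self.period - (now - self._stamps[0]))
```

For a model with a default of 10 per minute, you might create `SlidingWindowLimiter(10)` and call `acquire()` before each request. For IPM-limited models, calling `acquire()` once per requested image is one plausible way to match the metric, though that mapping is an assumption.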

When You Hit Rate Limits

If you receive an HTTP 429 (“Too Many Requests”) response, you can:

Retry later: Wait a short period before sending your request again.
Optimize your requests: Reduce the frequency of calls to stay within the platform’s rate limits.
Request a higher rate limit: If you need a higher limit, fill out the contact form on the Quotas & Limits page to get in touch with us.
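The "retry later" option is commonly implemented as exponential backoff with jitter. The sketch below is one possible shape, not a prescribed client: `RateLimited` is a hypothetical exception your own request function would raise on HTTP 429, and the default delays (1 second base, 60 second cap, 5 retries) are illustrative values, not platform recommendations.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical: raised by your request function on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call `fn`, retrying on RateLimited with exponential backoff.

    The wait before retry k is drawn uniformly from
    [0, min(max_delay, base_delay * 2**k)] ("full jitter"), which spreads
    out retries from many clients. Re-raises after `max_retries` retries.
    """
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimited:
            if attempt == max_retries:
                raise
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Wrap the function that sends your request, for example `call_with_backoff(lambda: submit_job(payload))`, where `submit_job` is your own code that raises `RateLimited` when it sees a 429.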
