If you don’t have a Novita account, sign up first. For details, see the Quickstart guide. This article uses the ComfyUI worker image novitalabs/comfyui-worker:v0.0.1 as an example to show how to create and call an Async Serverless Endpoint.
1. Prepare Container Image
Package your runtime environment into a Docker image and upload it to an image registry in advance. Both public and private image registries are supported. Private registries require image pull credentials.
- You can upload your image to Docker Hub. The platform currently provides an image warm-up service for Docker Hub images.
This example uses the image novitalabs/comfyui-worker:v0.0.1. The image includes ComfyUI and the Novita worker SDK. The task input is a ComfyUI workflow JSON, and the worker handler returns the generated image results. We recommend configuring object storage environment variables such as BUCKET_ENDPOINT_URL, so that generated images and videos are uploaded to your bucket and returned as URLs in the job output.
2. Select Instance Specification
Async Serverless Endpoint currently supports the following GPU instance types:
- RTX 4090 24GB
- H100 SXM 80GB
For the comfyui-worker example, we recommend RTX 4090 24GB.
For additional requirements, contact us.
3. Create Cloud Storage (Optional)
If you need shared or persistent storage, create cloud storage on the storage management page, then mount the storage when creating the endpoint. For details, see Manage Cloud Storage.
4. Create Endpoint
- Go to the Async Serverless GPUs page, select an instance type, and click “Create Endpoint”.
- Complete the Endpoint parameter configuration.
- Endpoint Name: Used to uniquely identify the Endpoint. It is part of the URL when creating jobs. The system generates a random default name. You can customize it, but using the default name is recommended.
- Worker Configuration
| Configuration Item | Description |
|---|---|
| Min Worker Count | The minimum number of worker instances to keep for the Endpoint. Setting a higher minimum helps reduce cold start time. If set to 0, there will be no idle workers when there are no requests, which may increase response latency for new requests. Use 0 with caution for latency-sensitive scenarios. |
| Max Worker Count | The maximum number of worker instances that the Endpoint can scale up to. When request volume increases, the platform automatically increases workers up to this maximum. This limit helps control costs. |
| Idle Timeout (seconds) | When a worker is about to be released due to scale-down, the platform keeps it for the configured idle timeout so it can respond quickly to new requests. You are charged for the worker during this period. |
| Max Concurrent Requests | The maximum number of concurrent requests handled by one worker. If this is exceeded, requests are routed to other workers. If all workers are fully occupied, excess requests are queued until execution is possible. |
| GPUs / Worker | Number of GPU cards allocated to each worker. |
| CUDA Version | CUDA version used by the worker. |
For this example, set GPUs / Worker to 1.
- Type: Select Async.
- Elastic Policy: Select Queue request policy and set Single worker target concurrency to 1. The ComfyUI worker in this example processes one job at a time. When queued requests exceed current worker capacity, the platform scales workers based on the queue request count until reaching the maximum worker count.
- Image Configuration:
- Image address: novitalabs/comfyui-worker:v0.0.1.
- Image repository credentials: If the image is private, provide image pull credentials. You can create credentials on the security credentials management page.
- HTTP Port: Worker HTTP port.
- Container start command: Command executed when the container starts.
- Storage Configuration:
- System disk: System disk size per worker instance.
- Cloud storage: Select cloud storage if you need to mount it. For details, see Manage Cloud Storage.
- Other:
- Health check path: This parameter is currently not enabled.
- Environment variables: Set environment variables required by the service, such as S3 object storage settings (for example BUCKET_ENDPOINT_URL). For comfyui-worker, we strongly recommend configuring object storage so output images are uploaded to a bucket and returned as URLs.
- Review pricing and click “Deploy with One Click”.
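The interaction between the queue request policy, the single worker target concurrency, and the min/max worker counts can be sketched as a small calculation. This is only an assumed model for building intuition: the platform's actual scaling algorithm is not documented here, and the function name `desired_workers` and the ceiling-division rule are illustrative assumptions.

```python
import math


def desired_workers(queued_requests: int, target_concurrency: int,
                    min_workers: int, max_workers: int) -> int:
    """Estimate the worker count a queue-based policy might aim for.

    Assumed model (not the platform's documented algorithm): enough workers
    to give each one `target_concurrency` requests, clamped to the
    configured minimum and maximum worker counts.
    """
    if target_concurrency < 1:
        raise ValueError("target_concurrency must be at least 1")
    if min_workers < 0 or max_workers < min_workers:
        raise ValueError("require 0 <= min_workers <= max_workers")
    needed = math.ceil(queued_requests / target_concurrency)
    return max(min_workers, min(needed, max_workers))


# Five queued jobs, one job per worker, at most three workers.
print(desired_workers(5, 1, 0, 3))  # 3
```

Under this model, a target concurrency of 1 means each queued ComfyUI job asks for its own worker until the maximum worker count caps the fleet, and any remaining jobs wait in the queue.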
5. Access the Service
- On the Async Serverless GPUs page, find the newly created Endpoint and ensure its status is “Running”.
- Ensure that at least one Worker in the Endpoint is running.
- Ensure you have an API Key for authentication. The Endpoint creator and the API Key owner must belong to the same team.
| Parameter | Description |
|---|---|
| Public base URL | https://async-public.serverless.novita.ai/v1 |
| Endpoint Name | The name generated after creating the Endpoint, for example 0f43a6867e05fddd. This name is part of the job URL. |
| API Key | Create or copy an API Key from the API Key / Key Management page. Pass it in the Authorization: Bearer <API_KEY> request header. |
- Log in to the Novita console.
- Go to the API Key / Key Management page.
- Create an API Key and copy the generated sk_... value.
- Ensure the API Key owner and Endpoint owner are in the same team.
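As a minimal sketch of how the API Key is passed, the helper below builds the request headers from a key. The `Authorization: Bearer <API_KEY>` header and the `sk_` prefix come from the text above; the helper name `auth_headers` and the `Content-Type: application/json` header are assumptions made for illustration (the job input is JSON, but the exact headers the service requires are not stated here).

```python
def auth_headers(api_key: str) -> dict:
    """Build HTTP headers for calling the endpoint with a Novita API Key.

    The Content-Type header is an assumption; only the Authorization
    header is described in this guide.
    """
    key = api_key.strip()
    if not key.startswith("sk_"):
        # Generated keys are shown as sk_... values in the console.
        raise ValueError("API Key is expected to start with 'sk_'")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",  # assumed for JSON job input
    }


headers = auth_headers("sk_xxxx")  # replace sk_xxxx with your real API Key
print(headers["Authorization"])  # Bearer sk_xxxx
```

Checking the prefix before sending a request catches the common mistake of pasting a key from a different service or including stray whitespace.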
5.1 Create a Job and Retrieve Output via Curl
The following request is an executable comfyui-worker example and matches the tested case. Replace 0f43a6867e05fddd in the URL with your real Endpoint name, and replace sk_xxxx with your real API Key.
The maximum job size accepted by Async Serverless Endpoint is 4 MiB.
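Because oversized jobs are rejected, it can help to check the payload size on the client before submitting. The sketch below serializes a payload to JSON and compares it against the 4 MiB limit. It assumes the limit applies to the serialized request body, and the `{"input": ...}` payload shape and the name `encode_job` are hypothetical, used only for illustration.

```python
import json

MAX_JOB_BYTES = 4 * 1024 * 1024  # 4 MiB, the documented maximum job size


def encode_job(payload: dict) -> bytes:
    """Serialize a job payload and reject it if it exceeds 4 MiB.

    Assumption: the limit is measured on the UTF-8 encoded JSON body.
    """
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if len(body) > MAX_JOB_BYTES:
        raise ValueError(
            f"job payload is {len(body)} bytes; the limit is {MAX_JOB_BYTES}"
        )
    return body


# Hypothetical payload shape; substitute your real ComfyUI workflow JSON.
body = encode_job({"input": {"workflow": {}}})
print(len(body))  # 25
```

Large inputs such as base64-encoded images are the usual reason a workflow exceeds the limit; referencing them by URL keeps the job body small.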
The returned id is the job_id.
The maximum output size returned by the Async Serverless Endpoint status API is 4 MiB. To avoid this limitation, configure object storage environment variables and return uploaded file URLs in the output.
Job results are kept in the Async Serverless Endpoint for up to 6 hours after completion.
5.2 Create Job and Get Results via Novita SDK
Install the SDK. The novita-gpus SDK's default request URL is https://async-public.serverless.novita.ai/v1.
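Whichever client you use, an async job is created once and its status is then read until the job finishes. The sketch below shows that polling pattern in a transport-agnostic way: `fetch_status` is a caller-supplied function (for example, a wrapper around the status API or an SDK call), and the `status` field name and the status values are hypothetical placeholders, not the service's documented schema.

```python
import time
from typing import Callable, Iterable


def wait_for_job(fetch_status: Callable[[], dict],
                 terminal_statuses: Iterable[str],
                 poll_interval: float = 2.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict:
    """Poll fetch_status() until the job reaches a terminal status.

    The "status" key and the terminal values are assumptions supplied by
    the caller; adapt them to the real status API response.
    """
    terminal = set(terminal_statuses)
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") in terminal:
            return job
        if clock() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(poll_interval)


# Demonstration with a fake status source instead of a network call.
fake = iter([{"status": "PENDING_EXAMPLE"},
             {"status": "DONE_EXAMPLE", "output": {"url": "..."}}])
job = wait_for_job(lambda: next(fake), {"DONE_EXAMPLE"}, poll_interval=0.0)
print(job["status"])  # DONE_EXAMPLE
```

Since results are kept for up to 6 hours after completion, a poller should read and persist the output well within that window rather than relying on the endpoint as long-term storage.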