Create Endpoint
curl --request POST \
  --url https://api.novita.ai/gpu-instance/openapi/v1/endpoint/create \
  --header 'Authorization: <authorization>' \
  --header 'Content-Type: <content-type>' \
  --data '
{
  "endpoint": {
    "name": "<string>",
    "appName": "<string>",
    "workerConfig": {
      "minNum": 123,
      "maxNum": 123,
      "freeTimeout": 123,
      "maxConcurrent": 123,
      "gpuNum": 123,
      "requestTimeout": 123
    },
    "ports": [
      {
        "port": "<string>"
      }
    ],
    "policy": {
      "type": "<string>",
      "value": 123
    },
    "image": {
      "image": "<string>",
      "authId": "<string>",
      "command": "<string>"
    },
    "products": [
      {
        "id": "<string>"
      }
    ],
    "rootfsSize": 123,
    "volumeMounts": [
      {
        "type": "<string>",
        "size": 123,
        "id": "<string>",
        "mountPath": "<string>"
      }
    ],
    "clusterID": "<string>",
    "envs": [
      {
        "key": "<string>",
        "value": "<string>"
      }
    ],
    "healthy": {
      "path": "<string>"
    }
  }
}
'

{
  "id": "<string>"
}
POST /gpu-instance/openapi/v1/endpoint/create
Request Headers
Content-Type: Enum: application/json
Authorization: Bearer authentication format, for example: Bearer {{API Key}}.
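As a minimal sketch of how these two headers might be assembled from Python, the snippet below builds the request with the standard library but does not send it. The `build_request` helper and the `NOVITA_API_KEY` environment variable are assumptions made for illustration; neither is defined by the API.

```python
import json
import os
import urllib.request

CREATE_URL = "https://api.novita.ai/gpu-instance/openapi/v1/endpoint/create"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build a POST request carrying the two documented headers (hypothetical helper)."""
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Assumption: the API key is kept in an environment variable with this name.
api_key = os.environ.get("NOVITA_API_KEY", "<API Key>")
request = build_request({"endpoint": {"name": "demo"}}, api_key)
# urllib.request.urlopen(request) would send it; that call is left out on purpose.
```

Keeping the key out of the source file and reading it at run time is a design choice of this sketch, not a requirement stated on this page.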
Request Body
endpoint: Endpoint configuration. The properties described below are nested under endpoint.
name: Endpoint name. String of 0-220 characters.
appName: Application name. It is part of the Endpoint URL, can be customized, and defaults to the Endpoint ID.
workerConfig: Worker configuration. The valid ranges are retrieved dynamically through the parameter limits API.
- minNum: Minimum number of workers.
- maxNum: Maximum number of workers.
- freeTimeout: Idle timeout in seconds.
- maxConcurrent: Maximum concurrency.
- gpuNum: Number of GPUs per worker.
- requestTimeout: Request timeout in seconds.
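Because the valid ranges come from the parameter limits API, a client may want to check a worker configuration against them before submitting. The sketch below assumes the limits have already been fetched and reshaped into a `{field: (low, high)}` dictionary; that shape, the `check_worker_config` helper, and all numbers shown are placeholders, not real limits.

```python
def check_worker_config(config: dict, limits: dict) -> list:
    """Return a list of problems; an empty list means every checked field is in range.

    Only fields present in both `config` and `limits` are checked.
    """
    problems = []
    for field, value in config.items():
        if field not in limits:
            continue
        low, high = limits[field]
        if not low <= value <= high:
            problems.append(f"{field}={value} is outside {low}-{high}")
    return problems


# Placeholder limits for illustration only; real values come from the parameter limits API.
example_limits = {"minNum": (0, 10), "maxNum": (1, 10), "gpuNum": (1, 8)}

worker_config = {
    "minNum": 0,
    "maxNum": 2,
    "freeTimeout": 300,
    "maxConcurrent": 4,
    "gpuNum": 1,
    "requestTimeout": 600,
}
problems = check_worker_config(worker_config, example_limits)
```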
ports: HTTP ports. Only one port is supported. Supported range: 1-65535, excluding the internal ports 2222, 2223, and 2224.
- port: HTTP port.
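The port rules above can be sketched as a small client-side check. `build_ports` is a hypothetical helper; it returns the port as a string because the request sample shows `port` as a string value.

```python
INTERNAL_PORTS = {2222, 2223, 2224}


def build_ports(port: int) -> list:
    """Return the `ports` array for a single HTTP port, or raise ValueError."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port {port} is outside 1-65535")
    if port in INTERNAL_PORTS:
        raise ValueError(f"port {port} is reserved for internal use")
    # Only one port is supported, so the array always has exactly one entry.
    return [{"port": str(port)}]


ports = build_ports(8000)
```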
policy: Scaling policy. The valid range is retrieved dynamically through the parameter limits API.
- type: Scaling policy type. Available values:
  - queue: Queue latency policy; scales workers based on request wait time in the queue.
  - concurrency: Queue request policy; scales workers based on the number of requests in the queue.
- value: The meaning of value depends on type:
  - When type = queue, value is the queue wait time in seconds.
  - When type = concurrency, value is the maximum number of requests in the queue.
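To make the two policy types concrete, here is a minimal sketch that builds the `policy` object for each. `build_policy` is a hypothetical helper, and the numeric values are placeholders; the accepted range has to come from the parameter limits API.

```python
POLICY_TYPES = ("queue", "concurrency")


def build_policy(policy_type: str, value: int) -> dict:
    """Return a `policy` object, rejecting types other than the two documented ones."""
    if policy_type not in POLICY_TYPES:
        raise ValueError(f"unknown policy type: {policy_type!r}")
    return {"type": policy_type, "value": value}


# Scale on queue latency: value is the queue wait time in seconds (placeholder number).
latency_policy = build_policy("queue", 30)

# Scale on queue depth: value is the maximum number of queued requests (placeholder number).
backlog_policy = build_policy("concurrency", 5)
```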
image: Container image configuration.
- image: Image address. String of 0-511 characters.
- authId: Private image credential ID (not required for public images or platform user images). String of 0-255 characters.
- command: Container startup command. String of 0-2047 characters.
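A sketch of assembling the `image` object while enforcing the documented length limits follows. The `build_image` helper, the image addresses, and the credential ID are made up for illustration, and the sketch assumes the limits count characters the way Python's `len` does.

```python
from typing import Optional

IMAGE_FIELD_LIMITS = {"image": 511, "authId": 255, "command": 2047}


def build_image(image: str, command: str, auth_id: Optional[str] = None) -> dict:
    """Return an `image` object; `authId` is left out when no credential is given."""
    config = {"image": image, "command": command}
    if auth_id is not None:
        config["authId"] = auth_id
    for field, value in config.items():
        limit = IMAGE_FIELD_LIMITS[field]
        if len(value) > limit:
            raise ValueError(f"{field} is {len(value)} characters; the limit is {limit}")
    return config


# Public image: no credential ID needed.
public_image = build_image("docker.io/library/nginx:latest", "nginx -g 'daemon off;'")

# Private image: pass the credential ID (hypothetical value).
private_image = build_image(
    "registry.example.com/team/app:1.0", "python serve.py", auth_id="cred-123"
)
```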
rootfsSize: Root filesystem size in GB. Currently fixed at 100.
volumeMounts: Storage configuration. Sizes are in GB.
- type: Storage type. Available values:
  - local: Local storage.
  - network: Network storage.
- size: Local storage size, currently fixed at 30. Not required for network storage.
- id: Network storage ID. Not required for local storage.
- mountPath: Storage mount path. String of 0-255 characters.
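The difference between the two storage types can be sketched as two small builders: local storage carries the fixed size and no ID, while network storage carries an ID and no size. The helper names, mount paths, and storage ID below are hypothetical.

```python
LOCAL_STORAGE_SIZE_GB = 30  # currently fixed, per the description above
MAX_MOUNT_PATH_LENGTH = 255


def _check_mount_path(mount_path: str) -> None:
    if len(mount_path) > MAX_MOUNT_PATH_LENGTH:
        raise ValueError(f"mountPath exceeds {MAX_MOUNT_PATH_LENGTH} characters")


def local_mount(mount_path: str) -> dict:
    """Local storage entry: fixed size, no storage ID."""
    _check_mount_path(mount_path)
    return {"type": "local", "size": LOCAL_STORAGE_SIZE_GB, "mountPath": mount_path}


def network_mount(storage_id: str, mount_path: str) -> dict:
    """Network storage entry: storage ID, no size."""
    _check_mount_path(mount_path)
    return {"type": "network", "id": storage_id, "mountPath": mount_path}


volume_mounts = [
    local_mount("/workspace"),
    network_mount("storage-123", "/models"),  # made-up storage ID
]
```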
clusterID: Cluster information. Required when mounting cloud storage, and must match the ID of the cluster where the cloud storage is located. String of 0-255 characters.
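A client-side check for this requirement might look like the sketch below. It assumes that "cloud storage" corresponds to a `volumeMounts` entry with `type` set to `network`; that mapping is an interpretation, and `check_cluster_id` is a hypothetical helper. Whether the cluster ID actually matches the storage's cluster can only be verified by the service.

```python
MAX_CLUSTER_ID_LENGTH = 255


def check_cluster_id(endpoint: dict) -> None:
    """Raise ValueError if clusterID is missing for network storage or is too long."""
    cluster_id = endpoint.get("clusterID", "")
    mounts = endpoint.get("volumeMounts", [])
    # Assumption: a mount of type "network" is what the text calls cloud storage.
    uses_network_storage = any(mount.get("type") == "network" for mount in mounts)
    if uses_network_storage and not cluster_id:
        raise ValueError("clusterID is required when network storage is mounted")
    if len(cluster_id) > MAX_CLUSTER_ID_LENGTH:
        raise ValueError(f"clusterID exceeds {MAX_CLUSTER_ID_LENGTH} characters")


endpoint = {
    "clusterID": "cluster-abc",  # made-up ID
    "volumeMounts": [{"type": "network", "id": "storage-123", "mountPath": "/models"}],
}
check_cluster_id(endpoint)  # passes silently
```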
Response
id: The created Endpoint ID.
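Reading the ID out of the response body can be sketched as follows. `parse_endpoint_id` is a hypothetical helper and the sample ID is invented; the shape of error responses is not described on this page, so the sketch simply rejects any body without a non-empty string `id`.

```python
import json


def parse_endpoint_id(body: str) -> str:
    """Extract the created Endpoint ID from a JSON response body."""
    data = json.loads(body)
    endpoint_id = data.get("id") if isinstance(data, dict) else None
    if not isinstance(endpoint_id, str) or not endpoint_id:
        raise ValueError("response does not contain an Endpoint ID")
    return endpoint_id


endpoint_id = parse_endpoint_id('{"id": "ep-example"}')  # made-up ID
```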
Last modified on June 10, 2026