Endpoint name. String, length limit: 0-220 characters.
Application name (appears in the URL). This is the customizable part of the Endpoint URL; it defaults to the Endpoint ID if not specified.
Worker configuration. The valid range can be obtained dynamically via the parameter range API.
Minimum number of workers.
Maximum number of workers.
Idle timeout, in seconds.
Number of GPUs per worker.
HTTP ports. Only one is supported. Supported port range: 1-65535, except for 2222, 2223, and 2224, which are reserved for internal use.
Auto-scaling policy. The valid range can be obtained dynamically via the parameter range API.
Policy type. Options:
queue: Queue latency policy. Adjusts the number of workers based on the waiting time of requests in the queue.
concurrency: Queue request policy. Adjusts the number of workers based on the number of requests in the queue.
The meaning of value depends on the type:
If type = queue, value is the queue waiting time in seconds.
If type = concurrency, value is the maximum number of requests in the queue.
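The two readings of value can be sketched as a small lookup. This is only an illustration: the helper name describe_policy and the output format are hypothetical and not part of the API; only the type names queue and concurrency and their meanings come from the text above.

```python
# Hypothetical helper: maps each policy type to the documented meaning of `value`.
POLICY_VALUE_MEANING = {
    "queue": "queue waiting time in seconds",
    "concurrency": "maximum number of requests in the queue",
}


def describe_policy(policy_type, value):
    """Return a one-line description of how `value` is interpreted."""
    if policy_type not in POLICY_VALUE_MEANING:
        raise ValueError(f"unknown policy type: {policy_type!r}")
    return f"{policy_type}: {POLICY_VALUE_MEANING[policy_type]} = {value}"


if __name__ == "__main__":
    print(describe_policy("queue", 30))
    print(describe_policy("concurrency", 10))
```

The valid range for value is not hard-coded here, since the text says it should be fetched from the parameter range API.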
Image information.
Image URL. String, length limit: 0-511 characters.
Private image credential ID (not required for public images or platform user images). String, length limit: 0-255 characters.
Container startup command. String, length limit: 0-2047 characters.
System disk size (GB). Currently fixed at 100.
Storage information.
Storage type. Options:
local: Local storage.
network: Network storage.
Local storage size (GB). Currently fixed at 30. Not required for network storage.
Network storage ID. Not required for local storage.
Mount path. String, length limit: 0-255 characters.
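As a rough sketch of how the storage fields depend on the storage type, the helper below builds a storage entry for either case. The function name and the dictionary keys (type, size, id, mount_path) are hypothetical placeholders, since the actual field names are not given here; the fixed size of 30 and the 255-character mount path limit come from the descriptions above.

```python
# Fixed local storage size stated in the field description.
LOCAL_STORAGE_SIZE = 30
MAX_MOUNT_PATH_LENGTH = 255


def build_storage(storage_type, mount_path, network_storage_id=None):
    """Build a storage entry; all keys are illustrative, not the real field names."""
    if len(mount_path) > MAX_MOUNT_PATH_LENGTH:
        raise ValueError("mount path exceeds 255 characters")
    if storage_type == "local":
        # Local storage carries a size and no network storage ID.
        return {"type": "local", "size": LOCAL_STORAGE_SIZE, "mount_path": mount_path}
    if storage_type == "network":
        # Network storage carries an ID and no local size.
        if not network_storage_id:
            raise ValueError("network storage requires a network storage ID")
        return {"type": "network", "id": network_storage_id, "mount_path": mount_path}
    raise ValueError(f"unknown storage type: {storage_type!r}")


if __name__ == "__main__":
    print(build_storage("local", "/data"))
    print(build_storage("network", "/models", network_storage_id="example-id"))
```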
Cluster information. Required when mounting cloud storage, and must match the cluster ID where the cloud storage resides. String, length limit: 0-255 characters.
Environment variables.
Environment variable name.
Environment variable value.
Health check endpoint.
Path to be checked via HTTP request for health monitoring.
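The fixed limits listed in this section (string lengths, the port range, and the reserved ports) can be checked on the client before a request is sent. The sketch below is one way to do that. The keys name, image_url, command, and mount_path and both function names are hypothetical, chosen only for illustration. Ranges that the text says come from the parameter range API are deliberately left out.

```python
# Ports reserved for internal use, per the HTTP ports description.
RESERVED_PORTS = {2222, 2223, 2224}

# Maximum string lengths from this section; the keys are illustrative names.
MAX_LENGTHS = {"name": 220, "image_url": 511, "command": 2047, "mount_path": 255}


def check_port(port):
    """Return an error message for an invalid port, or None if it is acceptable."""
    if not 1 <= port <= 65535:
        return f"port {port} is outside 1-65535"
    if port in RESERVED_PORTS:
        return f"port {port} is reserved for internal use"
    return None


def check_limits(fields, ports):
    """Collect every violation of the documented limits as a list of messages."""
    errors = []
    for key, limit in MAX_LENGTHS.items():
        if len(fields.get(key, "")) > limit:
            errors.append(f"{key} exceeds {limit} characters")
    if len(ports) > 1:
        errors.append("only one HTTP port is supported")
    for port in ports:
        message = check_port(port)
        if message:
            errors.append(message)
    return errors


if __name__ == "__main__":
    print(check_limits({"name": "demo", "mount_path": "/data"}, [8080]))
    print(check_limits({"name": "x" * 221}, [2222]))
```

Returning a list of messages rather than raising on the first problem lets a caller report all violations at once.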