FAQ of GPU Instance
1. How to check the price of GPU instances?
You can check the price of GPU instances and their configurations (container disk, volume disk, network volume, etc.) on the .
2. When does the billing for GPU instance start?
Billing starts when the instance status changes to "Pulling".
3. Introduction to the container disk, volume disk, and network volume.
Container Disk
- Does not support dynamic expansion; capacity can only be specified when creating an instance;
- Mount directory: / (cannot be customized);
- Data on the container disk is saved when saving the image;
- Includes a 60GB free quota; charges apply for the excess part, for details refer to: .
Volume Disk
- Supports dynamic expansion;
- Default mount directory: /workspace (customizable);
- Data on the volume disk is not saved when saving the image;
- Read and write speed is the same as the container disk;
- Includes a 30GB free quota; charges apply for the excess part, for details refer to: .
Network Volume
- Supports dynamic expansion;
- Default mount directory: /network (customizable);
- Has an independent lifecycle, unrelated to the instance: even if the instance is deleted, the network volume data still exists;
- Overall read and write speed is slower than the container disk or volume disk (depending on specific usage);
- Network volume capacity incurs additional charges, for details refer to: .
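To see how close an instance is to the free quotas described above, you can check the usage of each mount point from inside the instance. Below is a minimal sketch using only the Python standard library; the mount paths are the defaults from this FAQ, so adjust them if you customized the mounts:

```python
import shutil

def disk_usage_gb(path):
    """Return (total, used, free) for a mount point, in GB."""
    usage = shutil.disk_usage(path)
    gb = 1024 ** 3
    return (usage.total / gb, usage.used / gb, usage.free / gb)

# Default mount points from this FAQ; adjust if you customized them.
for mount in ["/", "/workspace", "/network"]:
    try:
        total, used, free = disk_usage_gb(mount)
        print(f"{mount}: {used:.1f} GB used of {total:.1f} GB")
    except FileNotFoundError:
        print(f"{mount}: not mounted")
```

Note that this reports the usage of whatever filesystem backs each path, which is what the quota is measured against.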
4. Why can't the instance be restarted after it stops?
After the instance stops, the resources belonging to the instance may have been preempted. In this case, it is recommended to first save an image of the target instance, and then create a new instance from that saved image.
Note
After saving the instance image, the data on the container disk will be saved with the image, but the data on the volume disk will not. It is recommended to use a network volume for data with high persistence requirements.
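Since volume-disk data is not captured in the image, one simple pattern is to copy important directories onto the network volume before stopping the instance. A minimal sketch using the standard library (the example paths are hypothetical, substitute your own):

```python
import shutil
from pathlib import Path

def back_up_to_network_volume(src, dst):
    """Copy a directory tree to the network volume, replacing any previous copy."""
    src, dst = Path(src), Path(dst)
    if dst.exists():
        shutil.rmtree(dst)  # drop the stale copy so the backup is a clean snapshot
    shutil.copytree(src, dst)
    return dst

# Hypothetical example paths; substitute your own directories:
# back_up_to_network_volume("/workspace/checkpoints", "/network/checkpoints")
```

For large or frequently changing data, an incremental tool such as rsync would avoid re-copying unchanged files, but the full-copy sketch above keeps the idea simple.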
5. How to handle abnormal instance status?
First, try to troubleshoot the problem through the instance's "System Logs" and "Instance Logs". If the problem cannot be resolved, you can contact support.
6. What if there is no instance specification with the required CUDA version?
CUDA versions are backward compatible. For example, if your service relies on CUDA version 12.1, you can choose an instance specification with a CUDA version greater than or equal to 12.1.
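The rule above is a plain version comparison: any instance CUDA version at or above the one your service needs will work. A small illustrative check (the function name is ours, not a platform API):

```python
def cuda_satisfies(required, available):
    """Return True if the instance's CUDA version can run a service built
    against the required version. CUDA is backward compatible, so any
    version >= the required one works."""
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(available) >= to_tuple(required)

print(cuda_satisfies("12.1", "12.4"))  # True: a 12.4 instance can run a 12.1 service
print(cuda_satisfies("12.1", "11.8"))  # False: 11.8 is older than the requirement
```

Comparing tuples of integers rather than raw strings matters once minor versions reach two digits (e.g. "12.10" is newer than "12.2", even though it sorts earlier as a string).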
7. What is the maximum CUDA version supported by the platform?
You can check the allowed CUDA versions in the "Filter" module at the bottom right corner of the .
8. How to diagnose the "Save Image" failure?
First, try to troubleshoot the problem through the logs of the task. If you are saving the image to a private repository address, please check whether your registry credentials are correct. If the problem cannot be resolved, you can contact support.
9. Can dedicated IP be supported?
Yes, but this capability is currently not open to the public. If you have such requirements, please contact support.
10. How to check the GPU usage of the instance?
Due to the PID isolation of Docker containers, the nvidia-smi command cannot be used to view the processes. You can install the py3nvml library and use its py3smi shell command to check the GPU usage:
# Install the py3nvml library.
$ pip install py3nvml
# Check the GPU usage.
$ py3smi
Fri Sep 20 12:17:39 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI Driver Version: 550.54.14 |
+---------------------------------+---------------------+---------------------+
| GPU Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
+=================================+=====================+=====================+
| 5 35% 28C 8 11W / 450W | 353MiB / 24564MiB | 0% Default |
+---------------------------------+---------------------+---------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU Owner PID Uptime Process Name Usage |
+=============================================================================+
+-----------------------------------------------------------------------------+