Here are some frequently asked questions about Novita AI. Before contacting our support team, please check the FAQs below to help you quickly find solutions.
4. Why can’t the instance be restarted after it stops?
After an instance stops, the resources it was using may have been preempted. In this case, first save an image from the target instance, and then create a new instance from the saved image.
After saving the instance image, the data on the container disk will be saved with the image, but the data on the volume disk will not. It is recommended to use the network volume for data with high persistence requirements.
First, try to troubleshoot the problem through the “System Logs” and “Instance Logs” of the instance. If the problem cannot be resolved, you can contact us.
6. What if no instance specification has the CUDA version I need?
CUDA versions are backward compatible. For example, if your service relies on CUDA version 12.1, you can choose an instance specification with a CUDA version greater than or equal to 12.1.
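The "greater than or equal to" rule above can be checked with a version-aware comparison. The following is only a minimal sketch: the `cuda_satisfies` helper is a hypothetical name, not part of any Novita AI tooling, and it assumes GNU `sort` with the `-V` (version sort) option is available.

```shell
# Hypothetical helper: succeeds when the instance's CUDA version is at least
# the version the service requires. Assumes GNU sort with -V (version sort).
cuda_satisfies() {
    required="$1"
    available="$2"
    # The lower of the two versions sorts first; if that is the required
    # version, the instance meets the requirement.
    lowest=$(printf '%s\n%s\n' "$required" "$available" | sort -V | head -n 1)
    [ "$lowest" = "$required" ]
}

# Example: a service built for CUDA 12.1 checked against several candidate
# instance CUDA versions (the candidate values are illustrative).
for candidate in 11.8 12.1 12.4; do
    if cuda_satisfies 12.1 "$candidate"; then
        echo "CUDA $candidate: compatible"
    else
        echo "CUDA $candidate: too old"
    fi
done
```

Version sort is used instead of a plain string or numeric comparison so that a version such as 12.10 is treated as newer than 12.9.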
First, try to troubleshoot the problem through the logs of the “Save Image” task. If you are saving the image to a private repository address, please check whether your Container Registry Auth Configuration is correct. If the problem cannot be resolved, you can contact us.
Because Docker containers use PID isolation, the nvidia-smi command cannot list GPU processes from inside the instance. Instead, install the py3nvml library and run its py3smi command to check GPU usage:
    # Install the py3nvml library.
    $ pip install py3nvml
    # Check the GPU usage.
    $ py3smi
    Fri Sep 20 12:17:39 2024
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI                        Driver Version: 550.54.14                 |
    +---------------------------------+---------------------+---------------------+
    | GPU Fan  Temp Perf Pwr:Usage/Cap|        Memory-Usage | GPU-Util Compute M. |
    +=================================+=====================+=====================+
    |   5 35%   28C    8   11W / 450W |   353MiB / 24564MiB |       0%    Default |
    +---------------------------------+---------------------+---------------------+
    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU        Owner      PID      Uptime  Process Name                  Usage |
    +=============================================================================+
    +-----------------------------------------------------------------------------+