Novita AI provides comprehensive monitoring metrics for large language model (LLM) API usage. These metrics help you gain deep insight into the availability and performance of your LLM API requests. You can view them on the LLM Monitoring Page.
Metric Descriptions
All metrics are broken down by model and sampled at one-minute granularity. Depending on the time interval you select, however, a sample may not be displayed for every minute; in that case, the displayed values are averages over the selected time range.
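As a minimal sketch of the averaging behavior described above (the sample values are illustrative, not from any real dashboard):

```python
# Minute-level RPM samples for a 5-minute window (hypothetical values).
rpm_samples = [120, 95, 110, 130, 105]

# When the chart cannot show one point per minute, the value displayed
# for the window is the average across the selected range.
window_average = sum(rpm_samples) / len(rpm_samples)
print(window_average)  # 112.0
```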
- Requests Per Minute (RPM): the number of API requests received per minute.
- Request Success Rate: the percentage of requests that complete successfully.
- Average Token Count per Request: the average number of tokens processed per request.
- End-to-End (E2E) Latency: the total time from sending a request to receiving the complete response.
- Time to First Token (TTFT): the time from sending a request until the first token is returned. Only tracked for streaming requests where `stream=true`.
- Time Per Output Token (TPOT): the average time between successive output tokens after the first. Only tracked for streaming requests where `stream=true`.
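The streaming-only metrics above can be reproduced client-side by timing the chunks of a streamed response. The sketch below is an assumption-laden illustration, not Novita AI's measurement code: `fake_stream` is a hypothetical stand-in for an SSE token stream returned when `stream=true`, and the TTFT/TPOT formulas follow the standard definitions (delay before the first token; average gap between subsequent tokens).

```python
import time

def measure_stream_metrics(chunks):
    """Consume an iterable of streamed tokens and return (ttft, tpot) in seconds.

    TTFT: elapsed time from the start of consumption to the first token.
    TPOT: average inter-token gap over the remaining tokens (0.0 if only
    one token arrives, since no gap exists to average).
    """
    start = time.monotonic()
    first_token_at = None
    last_token_at = None
    token_count = 0
    for _ in chunks:
        now = time.monotonic()
        if first_token_at is None:
            first_token_at = now
        last_token_at = now
        token_count += 1
    ttft = first_token_at - start
    if token_count > 1:
        # TPOT excludes the first token: time after first token / remaining tokens.
        tpot = (last_token_at - first_token_at) / (token_count - 1)
    else:
        tpot = 0.0
    return ttft, tpot

def fake_stream(n_tokens, delay=0.01):
    # Hypothetical stand-in for a streaming LLM response (stream=true):
    # each token arrives after a small fixed delay.
    for _ in range(n_tokens):
        time.sleep(delay)
        yield "tok"

ttft, tpot = measure_stream_metrics(fake_stream(5))
```

In a real client the same timing loop would wrap the chunk iterator returned by the streaming API call; only the source of `chunks` changes.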