This guide shows how to integrate Novita AI with MLflow Tracing. By using Novita AI’s OpenAI-compatible endpoint (https://api.novita.ai/openai), you can capture prompts, responses, latency, token usage, and model metadata in MLflow.
[Image: MLflow trace details and timeline]

Prerequisites

Before you start, make sure you have:
  • Novita AI API key: create one in Key Management.
  • A running MLflow tracking server; the local default at http://localhost:5000 works.
  • A Python or JavaScript runtime (the examples in this guide use Python).
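The examples below pass the API key directly to the client, but it is safer to read it from an environment variable. A minimal sketch (the helper name and the NOVITA_API_KEY variable are conventions of this guide, not part of the Novita SDK):

```python
import os


def load_novita_key(env_var: str = "NOVITA_API_KEY") -> str:
    """Return the Novita AI API key from the environment, failing loudly if unset."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; create a key in Novita Key Management "
            "and export it in your shell"
        )
    return key
```

You can then pass `api_key=load_novita_key()` when constructing the OpenAI client instead of hard-coding the key.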

Integration Steps

Step 1: Install Dependencies

pip install 'mlflow[genai]' openai

Step 2: Start MLflow Server

If you have a local Python environment (>= 3.10), you can start MLflow with:
mlflow server
MLflow also provides a Docker Compose setup:
git clone --depth 1 --filter=blob:none --sparse https://github.com/mlflow/mlflow.git
cd mlflow
git sparse-checkout set docker-compose
cd docker-compose
cp .env.dev.example .env
docker compose up -d
Then open http://localhost:5000 to confirm the MLflow UI is accessible.

Step 3: Enable Tracing and Call Novita AI

import openai
import mlflow

# Enable auto-tracing for OpenAI-compatible calls
mlflow.openai.autolog()

# Optional: set tracking target and experiment
mlflow.set_tracking_uri("http://localhost:5000")
mlflow.set_experiment("Novita AI")

client = openai.OpenAI(
    base_url="https://api.novita.ai/openai",
    api_key="<your_novita_api_key>",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-r1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
)

print(response.choices[0].message.content)

Step 4: View Traces in MLflow UI

Open your MLflow UI (for example http://localhost:5000) and go to your configured experiment to inspect traces. You should see:
  • Prompt and completion content
  • Latency and token usage
  • Model and request metadata
  • Errors/exceptions (if any)

Step 5: Advanced Tracing References

Streaming and Async

MLflow's OpenAI autologging also captures streaming and async calls made through the Novita AI endpoint.

Combine with frameworks or manual tracing

import json
from openai import OpenAI
import mlflow
from mlflow.entities import SpanType

# Initialize the OpenAI client with Novita AI API endpoint
client = OpenAI(
    base_url="https://api.novita.ai/openai",
    api_key="<your_novita_api_key>",
)


# Create a parent span for the Novita AI call
@mlflow.trace(span_type=SpanType.CHAIN)
def answer_question(question: str):
    messages = [{"role": "user", "content": question}]
    response = client.chat.completions.create(
        model="deepseek/deepseek-r1",
        messages=messages,
    )

    # Attach session/user metadata to the trace
    mlflow.update_current_trace(
        metadata={
            "mlflow.trace.session": "session-12345",
            "mlflow.trace.user": "user-a",
        }
    )
    return response.choices[0].message.content


answer = answer_question("What is the capital of France?")
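The session and user values above are hard-coded; in practice you would parameterize them so every call in one conversation is grouped under the same session. A small helper (the function name is illustrative, not an MLflow API; the metadata keys are the ones used above):

```python
def trace_metadata(session_id: str, user_id: str) -> dict:
    """Build the metadata dict used to group MLflow traces by session and user."""
    return {
        "mlflow.trace.session": session_id,
        "mlflow.trace.user": user_id,
    }


# Usage inside the traced function:
# mlflow.update_current_trace(metadata=trace_metadata("session-12345", "user-a"))
```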
For full upstream reference on Novita model details and endpoint usage, consult the Novita AI documentation.