This document covers advanced features and best practices for Novita Agent Runtime.
Table of Contents
Configuration File Reference
.novita-agent.yaml Structure
The .novita-agent.yaml configuration file uses Kubernetes-style YAML format:
apiVersion: v1
kind: Agent
metadata:
name: my-agent # Agent name (must consist of lowercase letters, numbers, and hyphens only)
version: 1.0.0 # Agent version (semantic versioning)
author: dev@example.com # Author email (required)
description: My AI Agent # Agent description (optional)
created: '2025-10-23T10:30:00Z' # Creation time (auto-generated)
spec:
entrypoint: app.py # Python entry file (must be .py file)
# Environment variables configuration (optional)
envVars:
MODEL_NAME: deepseek/deepseek-v3-0324
TEMPERATURE: '0.7'
# Runtime configuration (optional, applied to the built sandbox template)
runtime:
timeout: 300 # Startup timeout in seconds (1-3600, default 300)
memory_limit: 1Gi # Memory limit (supports "512Mi", "1Gi", etc.)
cpu_limit: '1' # CPU limit (supports "1", "1000m", etc.)
# Status field (maintained by the system, should not be modified manually by users)
status:
phase: deployed # Current phase: pending/building/deployed/failed
agent_id: agent-xxxxx # Agent ID (auto-generated after deployment)
last_deployed: '2025-10-23T10:35:00Z' # Last deployment time
build_id: build_xyz789 # Build ID (auto-generated after deployment)
Modifying Configuration
Modifying CPU and Memory Settings
Modify resource configuration under spec.runtime in .novita-agent.yaml:
spec:
runtime:
# CPU configuration
cpu_limit: '2' # 2 CPU cores
# Memory configuration
memory_limit: 2Gi # 2 GB memory
Modifying Environment Variables
The spec.envVars in .novita-agent.yaml is only used for the CLI’s agent invoke command and will not be passed to the deployed sandbox template.
Modify environment variables under spec.envVars in .novita-agent.yaml:
spec:
envVars:
# LLM configuration
MODEL_NAME: deepseek/deepseek-v3-0324
TEMPERATURE: '0.7'
Note:
- ⚠️ Do not store sensitive information (such as API Keys) in
.novita-agent.yaml
- You can also pass environment variables via the
--env parameter when running the agent invoke command
Redeploy to Apply Configuration Changes
After modifying resource specifications in .novita-agent.yaml, redeploy is required:
# Redeploy (creates a new version)
npx novita-sandbox-cli agent launch
Environment Variables Management
There are several ways to pass environment variables to Agents running in sandbox instances:
Method 1: Define in Configuration File (CLI invocation only)
Define environment variables under spec.envVars in .novita-agent.yaml:
spec:
envVars:
MODEL_NAME: deepseek/deepseek-v3-0324
TEMPERATURE: '0.7'
Method 2: Pass Dynamically via SDK
When invoking an Agent using the SDK’s invoke_agent_runtime method, pass them dynamically via the envVars parameter:
import os
from novita_sandbox.agent_runtime import AgentRuntimeClient
client = AgentRuntimeClient(api_key=os.getenv("NOVITA_API_KEY"))
response = await client.invoke_agent_runtime(
agentId="agent-xxxxx",
payload=payload,
envVars={
# Read sensitive information from environment variables
"NOVITA_API_KEY": os.getenv("NOVITA_API_KEY"),
"DATABASE_PASSWORD": os.getenv("DATABASE_PASSWORD"),
# Or pass directly
"MODEL_NAME": "deepseek/deepseek-v3-0324",
"TEMPERATURE": "0.7"
}
)
Streaming Responses
Implementing Streaming with Synchronous Generators
Use Python generators to implement streaming responses:
from novita_sandbox.agent_runtime import AgentRuntimeApp
app = AgentRuntimeApp()
@app.entrypoint
def streaming_agent(request: dict):
"""Synchronous streaming response"""
prompt = request.get("prompt", "")
# Use generator to return chunks
for i, chunk in enumerate(generate_response(prompt)):
yield {
"chunk": chunk,
"type": "content",
"index": i
}
# Send end marker
yield {"chunk": "", "type": "end"}
Implementing Streaming with Async Generators
Use Python async generators:
import asyncio
@app.entrypoint
async def async_streaming_agent(request: dict):
"""Async streaming response"""
prompt = request.get("prompt", "")
async for chunk in async_generate_response(prompt):
yield {
"chunk": chunk,
"type": "content"
}
yield {"chunk": "", "type": "end"}
LangChain Streaming Response Example
Complete example using LangChain for streaming responses:
import os
from langchain_openai import ChatOpenAI
from langchain.callbacks.base import BaseCallbackHandler
from novita_sandbox.agent_runtime import AgentRuntimeApp
app = AgentRuntimeApp()
class StreamingHandler(BaseCallbackHandler):
"""Streaming callback handler"""
def __init__(self):
self.tokens = []
def on_llm_new_token(self, token: str, **kwargs):
self.tokens.append(token)
@app.entrypoint
def langchain_streaming_agent(request: dict):
"""LangChain streaming response"""
prompt = request.get("prompt", "")
# Create streaming-enabled LLM
llm = ChatOpenAI(
api_key=os.getenv("NOVITA_API_KEY"),
streaming=True
)
# Stream invocation
for chunk in llm.stream(prompt):
if chunk.content:
yield {
"chunk": chunk.content,
"type": "content"
}
yield {"chunk": "", "type": "end"}
Invoking a Streaming Agent
Invoke a streaming Agent using the SDK:
import asyncio
import json
import os
from novita_sandbox.agent_runtime import AgentRuntimeClient
async def call_streaming_agent():
client = AgentRuntimeClient(api_key=os.getenv("NOVITA_API_KEY"))
payload = json.dumps({
"prompt": "Tell me a story"
}).encode()
response = await client.invoke_agent_runtime(
agentId="agent-xxxxx",
payload=payload
)
# Process streaming response
print("Streaming response:")
print(response)
Version Management
Deploying a New Agent Version
Modify the version number and deploy a new version:
# Modify version number
npx novita-sandbox-cli agent configure --agent-version 1.1.0
# Deploy new version
npx novita-sandbox-cli agent launch
After successful deployment, a new agent_id is generated. Each deployment generates a unique agent_id that corresponds to a specific version.
Health Checks
Default Health Check Endpoint
AgentRuntimeApp automatically provides a /ping health check endpoint:
from novita_sandbox.agent_runtime import AgentRuntimeApp
app = AgentRuntimeApp()
# Default health check automatically responds with {"status": "Healthy"}
Custom Health Checks
Use the @app.ping decorator to customize health check logic:
@app.ping
def custom_health_check():
"""Custom health check"""
# Check dependent services
db_ok = check_database_connection()
llm_ok = check_llm_service()
if db_ok and llm_ok:
return {"status": "Healthy"}
elif db_ok or llm_ok:
return {"status": "HealthyBusy"} # Partially available
else:
return {"status": "Unhealthy"} # Unavailable
def check_database_connection():
"""Check database connection"""
try:
# Simulate database check
return True
except:
return False
def check_llm_service():
"""Check LLM service"""
try:
# Simulate LLM service check
return True
except:
return False
Supported Health Check Statuses
Agents can return the following health statuses:
| Status | Description | HTTP Status Code |
|---|
Healthy | Agent is fully available | 200 |
HealthyBusy | Agent is partially available (e.g., processing heavy load) | 200 |
Unhealthy | Agent is unavailable | 503 |
Multi-turn Conversations
Using Session ID for Multi-turn Conversations
Use the runtimeSessionId parameter to route multiple requests to the same sandbox instance:
import uuid
import json
import os
from novita_sandbox.agent_runtime import AgentRuntimeClient
async def multi_turn_conversation():
runtime_session_id = str(uuid.uuid4())
client = AgentRuntimeClient(api_key=os.getenv("NOVITA_API_KEY"))
agent_id = "agent-xxxxx"
# First turn
response1 = await client.invoke_agent_runtime(
agentId=agent_id,
payload=json.dumps({"prompt": "Hello, my name is John"}).encode(),
runtimeSessionId=runtime_session_id,
)
print(f"AI: {response1}")
# Second turn (sent to the same sandbox instance, Agent remembers the context)
response2 = await client.invoke_agent_runtime(
agentId=agent_id,
payload=json.dumps({"prompt": "What's my name?"}).encode(),
runtimeSessionId=runtime_session_id,
)
print(f"AI: {response2}") # Should answer "John"
Example Projects
We provide a complete example project based on LangGraph, demonstrating how to build real AI applications with Novita Agent Runtime.
Project Repository
🔗 https://github.com/novitalabs/Novita-CollabHub/tree/main/examples/agent-runtime/agentic-frameworks/langgraph Last modified on December 2, 2025