131072 Context
$0.390 / 1M input tokens
$0.390 / 1M output tokens
Model Configuration
Response format
System Prompt
max_tokens
temperature
top_p
min_p
top_k
presence_penalty
frequency_penalty
repetition_penalty
README

In the tapestry of Greek mythology, Hermes reigns as the eloquent Messenger of the Gods, a deity who deftly bridges the realms through the art of communication. It is in homage to this divine mediator that I name this advanced LLM "Hermes," a system crafted to navigate the complex intricacies of human discourse with celestial finesse.

Model description

OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune, a continuation of the OpenHermes 2 model, trained on additional code datasets.

Perhaps the most interesting finding from training on a good ratio of code instruction data (estimated at around 7-14% of the total dataset) was that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench score, but the net gain overall is significant.

The code it was trained on also improved its HumanEval score (benchmarking done by the Glaive team) from 43% @ Pass 1 with OpenHermes 2 to 50.7% @ Pass 1 with OpenHermes 2.5.

OpenHermes was trained on 1,000,000 entries of primarily GPT-4 generated data, as well as other high-quality data from open datasets across the AI landscape. [More details soon]

These public datasets were filtered extensively, and all formats were converted to ShareGPT, which axolotl then further transformed to use ChatML.
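
To make the format conversion concrete, here is a minimal sketch of rendering one ShareGPT-style conversation as ChatML text. The role mapping, the function name sharegpt_to_chatml, and the sample record are illustrative assumptions; this is not axolotl's actual implementation.

```python
# Illustrative sketch only: not axolotl's real conversion code.
# Assumed ShareGPT record shape: {"conversations": [{"from": ..., "value": ...}, ...]}
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_chatml(record: dict) -> str:
    """Render a ShareGPT-style record as a ChatML-formatted string."""
    parts = []
    for turn in record["conversations"]:
        role = ROLE_MAP[turn["from"]]  # raises KeyError on unknown speakers
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    return "".join(parts)


# Hypothetical sample record, used only to exercise the function.
example = {
    "conversations": [
        {"from": "system", "value": "Act like you are a helpful assistant."},
        {"from": "human", "value": "Hi there!"},
        {"from": "gpt", "value": "Hello! How can I help?"},
    ]
}
chatml_text = sharegpt_to_chatml(example)
print(chatml_text)
```

Each turn becomes one <|im_start|>role ... <|im_end|> block, which is the layout ChatML-trained models such as this one expect in their prompts.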

Huge thank you to GlaiveAI and a16z for compute access and for sponsoring my work, and to all the dataset creators and other people whose work has contributed to this project!

Follow all my updates in ML and AI on Twitter: https://twitter.com/Teknium1

Support me on GitHub Sponsors: https://github.com/sponsors/teknium1

NEW: Chat with Hermes on LMSys' Chat Website! https://chat.lmsys.org/?single&model=openhermes-2.5-mistral-7b

How to use

You can access our teknium/openhermes-2.5-mistral-7b model in three ways: HTTP/cURL, Python, or JavaScript.

HTTP/cURL

We provide compatibility with the OpenAI API standard

The API Base URL

https://api.novita.ai/v3/openai

Example of Using Chat Completions API

Generate a response using a list of messages from a conversation

# Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key
export API_KEY="{YOUR Novita AI API Key}"

curl "https://api.novita.ai/v3/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "teknium/openhermes-2.5-mistral-7b",
    "messages": [
        {
            "role": "system",
            "content": "Act like you are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hi there!"
        }
    ],
    "max_tokens": 512
}'

The response may look like this

{
  "id": "chat-5f461a9a23a44ef29dbd3124b891afc0",
  "object": "chat.completion",
  "created": 1731584707,
  "model": "teknium/openhermes-2.5-mistral-7b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! It's nice to meet you. How can I assist you today? Do you have any questions or topics you'd like to discuss? I'm here to help with anything you need."
      },
      "finish_reason": "stop",
      "content_filter_results": {
        "hate": { "filtered": false },
        "self_harm": { "filtered": false },
        "sexual": { "filtered": false },
        "violence": { "filtered": false },
        "jailbreak": { "filtered": false, "detected": false },
        "profanity": { "filtered": false, "detected": false }
      }
    }
  ],
  "usage": {
    "prompt_tokens": 46,
    "completion_tokens": 40,
    "total_tokens": 86,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  },
  "system_fingerprint": ""
}
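
As a rough illustration, the usage object in a response can be combined with the listed price of $0.390 per 1M tokens (for both input and output) to estimate what a call costs. The helper name estimate_cost_usd is hypothetical, and actual billing may differ in rounding or other details.

```python
# Listed prices from the model page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.390
OUTPUT_PRICE_PER_M = 0.390


def estimate_cost_usd(usage: dict) -> float:
    """Estimate request cost from a chat-completion `usage` object."""
    input_cost = usage["prompt_tokens"] * INPUT_PRICE_PER_M / 1_000_000
    output_cost = usage["completion_tokens"] * OUTPUT_PRICE_PER_M / 1_000_000
    return input_cost + output_cost


# Token counts taken from the sample response above.
usage = {"prompt_tokens": 46, "completion_tokens": 40, "total_tokens": 86}
cost = estimate_cost_usd(usage)
print(f"${cost:.8f}")
```

Keeping the input and output prices as separate constants means the same helper still works if the two rates ever diverge.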

If you want to receive a response via streaming, simply pass "stream": true in the request (see the difference on line 20). An example is provided.

# Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key
export API_KEY="{YOUR Novita AI API Key}"

curl "https://api.novita.ai/v3/openai/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API_KEY}" \
  -d '{
    "model": "teknium/openhermes-2.5-mistral-7b",
    "messages": [
        {
            "role": "system",
            "content": "Act like you are a helpful assistant."
        },
        {
            "role": "user",
            "content": "Hi there!"
        }
    ],
    "max_tokens": 512,
    "stream": true
}'

The response may look like this

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

...

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"n, ne"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"ed"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":" assi"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"s"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"tan"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"ce wi"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

...

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":" "},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"just want to chat?"},"finish_reason":"stop","content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

data: [DONE]
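
A streamed reply can be reassembled by reading each "data:" line, decoding its JSON, and concatenating the delta.content fragments until the [DONE] sentinel arrives. The sketch below does this over an iterable of text lines; the function name collect_sse_content and the shortened sample events are illustrative assumptions, not part of the API.

```python
import json
from typing import Iterable


def collect_sse_content(lines: Iterable[str]) -> str:
    """Concatenate delta.content fragments from chat-completion SSE lines."""
    pieces = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not a data event
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                pieces.append(content)
    return "".join(pieces)


# Shortened, hypothetical events in the same shape as the sample above.
sample = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" assi"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"stance"},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]
text = collect_sse_content(sample)
print(repr(text))
```

The first event carries only the role, so it contributes no text; whitespace inside each fragment is preserved because it lives inside the JSON string rather than in the surrounding line.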

Model Parameters

Feel free to check out our documentation for more details.
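
As a sketch of how the listed configuration fields might be supplied, the helper below builds a request body with optional sampling parameters. It assumes these fields are sent as top-level JSON keys next to model and messages, the way max_tokens is in the example above; the helper name build_payload and the example values are hypothetical, so check the documentation for accepted ranges and defaults.

```python
import json

# Parameter names as listed under "Model Configuration" on the model page.
SAMPLING_PARAMS = {
    "max_tokens", "temperature", "top_p", "min_p", "top_k",
    "presence_penalty", "frequency_penalty", "repetition_penalty",
}


def build_payload(model: str, messages: list, **params) -> dict:
    """Build a chat-completions request body, rejecting unknown parameters."""
    unknown = set(params) - SAMPLING_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {"model": model, "messages": messages, **params}


payload = build_payload(
    "teknium/openhermes-2.5-mistral-7b",
    [{"role": "user", "content": "Hi there!"}],
    max_tokens=512,
    temperature=0.7,  # example value, not a recommended setting
    top_p=0.9,        # example value, not a recommended setting
)
print(json.dumps(payload, indent=2))
```

The printed JSON can be used as the -d body of the cURL request. Rejecting unknown names early catches typos such as "temperture" before a request is sent.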

Python

First, install the official OpenAI Python client

pip install 'openai>=1.0.0'

and then you can run inferences with us

Example of Using Chat Completions API

Generate a response using a list of messages from a conversation

from openai import OpenAI

client = OpenAI(
    base_url="https://api.novita.ai/v3/openai",
    # Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key.
    api_key="<YOUR Novita AI API Key>",
)

model = "teknium/openhermes-2.5-mistral-7b"
stream = True  # or False
max_tokens = 512

chat_completion_res = client.chat.completions.create(
    model=model,
    messages=[
        {
            "role": "system",
            "content": "Act like you are a helpful assistant.",
        },
        {
            "role": "user",
            "content": "Hi there!",
        }
    ],
    stream=stream,
    max_tokens=max_tokens,
)

if stream:
    for chunk in chat_completion_res:
        print(chunk.choices[0].delta.content or "")
else:
    print(chat_completion_res.choices[0].message.content)

If you set stream = True (line 10), the printed output may look like this

It'
s
 ni
ce to
meet you.
Is
 the
re so
meth
ing I
 can h
e
lp
you wi
th t
oday,
 or
 woul
d
 you like to chat?
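
The fragments land on separate lines because print() appends a newline after every chunk. A small helper that joins them into one string is sketched below; join_stream and fake_chunk are hypothetical names, and the SimpleNamespace objects only imitate the chunk.choices[0].delta.content attribute path used in the example above so the snippet runs without network access.

```python
from types import SimpleNamespace
from typing import Iterable


def join_stream(chunks: Iterable) -> str:
    """Join delta.content fragments from streamed chat-completion chunks."""
    pieces = []
    for chunk in chunks:
        if not chunk.choices:
            continue  # defensively skip chunks that carry no choices
        pieces.append(chunk.choices[0].delta.content or "")
    return "".join(pieces)


def fake_chunk(content):
    """Build a stand-in object with the same attribute path as a real chunk."""
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Stand-in fragments; a real call would pass chat_completion_res instead.
fake_stream = [fake_chunk(None), fake_chunk("It'"), fake_chunk("s"), fake_chunk(" nice")]
reply = join_stream(fake_stream)
print(reply)
```

To show text incrementally instead of collecting it, the print call in the loop could pass end="" and flush=True so that fragments are written on one line as they arrive.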

If you don't want to receive a response via streaming, simply set stream = False. The output will look like this

How can I assist you today? Do you have any questions or topics you'd like to discuss?

JavaScript

First, install the official OpenAI JavaScript client

npm install openai

and then you can run inferences with us in the browser or in Node.js

Example of Using Chat Completions API

Generate a response using a list of messages from a conversation

import OpenAI from "openai";

const openai = new OpenAI({
  baseURL: "https://api.novita.ai/v3/openai",
  apiKey: "<YOUR Novita AI API Key>",
});
const stream = true; // or false

async function run() {
  const completion = await openai.chat.completions.create({
    messages: [
      {
        role: "system",
        content: "Act like you are a helpful assistant.",
      },
      {
        role: "user",
        content: "Hi there!"
      }
    ],
    model: "teknium/openhermes-2.5-mistral-7b",
    stream
  });

  if (stream) {
    for await (const chunk of completion) {
      if (chunk.choices[0].finish_reason) {
        console.log(chunk.choices[0].finish_reason);
      } else {
        console.log(chunk.choices[0].delta.content);
      }
    }
  } else {
    console.log(JSON.stringify(completion));
  }
}

run();

If you set stream to true (line 7), the output may look like this

It'
s
 nic
e to
 m
eet you
. Ho
w can
I
 as
sist
 you
toda
y? Do you
hav
e any q
uest
io
ns or
 to
pics you
'
d
li
ke to
 di
scuss
stop

If you don't want to receive a response via streaming, simply set stream: false. The output will look like this

{
  "id": "chat-a3ff0e39b4c24abcbd258ab1a1f38db9",
  "object": "chat.completion",
  "created": 1731642457,
  "model": "teknium/openhermes-2.5-mistral-7b",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "How can I help you today? Would you like to talk about something specific or just have a chat? I'm here to assist you with any questions or information you might need."
      },
      "finish_reason": "stop",
      "content_filter_results": {
        "hate": { "filtered": false },
        "self_harm": { "filtered": false },
        "sexual": { "filtered": false },
        "violence": { "filtered": false },
        "jailbreak": { "filtered": false, "detected": false },
        "profanity": { "filtered": false, "detected": false }
      }
    }
  ],
  "usage": {
    "prompt_tokens": 46,
    "completion_tokens": 37,
    "total_tokens": 83,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  },
  "system_fingerprint": ""
}
