In the tapestry of Greek mythology, Hermes reigns as the eloquent Messenger of the Gods, a deity who deftly bridges the realms through the art of communication. It is in homage to this divine mediator that I name this advanced LLM "Hermes," a system crafted to navigate the complex intricacies of human discourse with celestial finesse.
OpenHermes 2.5 Mistral 7B is a state-of-the-art Mistral fine-tune, a continuation of the OpenHermes 2 model, trained on additional code datasets.
Perhaps the most interesting finding from training on a good ratio of code instruction data (estimated at around 7-14% of the total dataset) was that it boosted several non-code benchmarks, including TruthfulQA, AGIEval, and the GPT4All suite. It did, however, reduce the BigBench score, but the net gain overall is significant.
The code it was trained on also improved its HumanEval score (benchmarking done by the Glaive team) from 43% pass@1 with OpenHermes 2 to 50.7% pass@1 with OpenHermes 2.5.
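For readers unfamiliar with the metric, pass@1 is the share of HumanEval problems solved by a single generated sample. The sketch below uses the unbiased pass@k estimator commonly applied to HumanEval. This is an assumption about how such scores are typically computed, since the source does not describe the Glaive team's exact setup. The function name and the per-problem counts are purely illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed the tests, k: sample budget.
    """
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem results: (samples generated, samples passing).
results = [(10, 5), (10, 0), (10, 10), (10, 2)]
score = sum(pass_at_k(n, c, 1) for n, c in results) / len(results)
print(f"pass@1 = {score:.3f}")
```

The benchmark score is the mean of this estimate over all problems in the suite.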
OpenHermes was trained on 1,000,000 entries of primarily GPT-4 generated data, as well as other high-quality data from open datasets across the AI landscape. [More details soon]
These public datasets were extensively filtered, and all formats were converted to ShareGPT, which axolotl then further transformed to use ChatML.
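To make the conversion concrete, here is a minimal sketch of rendering a ShareGPT-style record as ChatML text. It is not axolotl's implementation. The role mapping and the `sharegpt_to_chatml` helper are assumptions made for illustration.

```python
# Assumed mapping from ShareGPT speaker names to ChatML roles.
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}


def sharegpt_to_chatml(record: dict) -> str:
    """Render one ShareGPT-style conversation as a ChatML string."""
    parts = []
    for turn in record["conversations"]:
        role = ROLE_MAP[turn["from"]]
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{role}\n{turn['value']}<|im_end|>\n")
    return "".join(parts)


# Illustrative record, not taken from the training data.
example = {
    "conversations": [
        {"from": "system", "value": "Act like you are a helpful assistant."},
        {"from": "human", "value": "Hi there!"},
        {"from": "gpt", "value": "Hello! How can I help?"},
    ]
}
chatml_text = sharegpt_to_chatml(example)
print(chatml_text)
```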
Huge thank you to GlaiveAI and a16z for compute access and for sponsoring my work, and to all the dataset creators and other people whose work has contributed to this project!
Follow all my updates in ML and AI on Twitter: https://twitter.com/Teknium1
Support me on GitHub Sponsors: https://github.com/sponsors/teknium1
NEW: Chat with Hermes on LMSys' Chat Website! https://chat.lmsys.org/?single&model=openhermes-2.5-mistral-7b
You can access the teknium/openhermes-2.5-mistral-7b model in three ways: cURL (HTTP), Python, and JavaScript.
The API is compatible with the OpenAI API standard.
The API Base URL
    https://api.novita.ai/v3/openai
Example of Using Chat Completions API
Generate a response using a list of messages from a conversation
    # Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key
    export API_KEY="{YOUR Novita AI API Key}"

    curl "https://api.novita.ai/v3/openai/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${API_KEY}" \
      -d '{
        "model": "teknium/openhermes-2.5-mistral-7b",
        "messages": [
          {
            "role": "system",
            "content": "Act like you are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hi there!"
          }
        ],
        "max_tokens": 512
    }'
The response may look like this
    {
      "id": "chat-5f461a9a23a44ef29dbd3124b891afc0",
      "object": "chat.completion",
      "created": 1731584707,
      "model": "teknium/openhermes-2.5-mistral-7b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Hello! It's nice to meet you. How can I assist you today? Do you have any questions or topics you'd like to discuss? I'm here to help with anything you need."
          },
          "finish_reason": "stop",
          "content_filter_results": {
            "hate": { "filtered": false },
            "self_harm": { "filtered": false },
            "sexual": { "filtered": false },
            "violence": { "filtered": false },
            "jailbreak": { "filtered": false, "detected": false },
            "profanity": { "filtered": false, "detected": false }
          }
        }
      ],
      "usage": {
        "prompt_tokens": 46,
        "completion_tokens": 40,
        "total_tokens": 86,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": ""
    }
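If you call the endpoint over raw HTTP rather than through a client library, the reply text and token counts can be pulled from the JSON body as sketched below. The `summarize_response` helper is a hypothetical name, and the sample body is a trimmed copy of the response above with the message content shortened.

```python
import json


def summarize_response(body: str) -> dict:
    """Extract the assistant text, finish reason, and token usage."""
    data = json.loads(body)
    choice = data["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "total_tokens": data["usage"]["total_tokens"],
    }


# Trimmed copy of the example response (content shortened for brevity).
sample_body = """
{
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello! It's nice to meet you."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 46, "completion_tokens": 40, "total_tokens": 86}
}
"""
summary = summarize_response(sample_body)
print(summary["text"], summary["total_tokens"])
```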
If you want to receive a response via streaming, simply pass "stream": true in the request body, as in the last field of the following example.
    # Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key
    export API_KEY="{YOUR Novita AI API Key}"

    curl "https://api.novita.ai/v3/openai/chat/completions" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${API_KEY}" \
      -d '{
        "model": "teknium/openhermes-2.5-mistral-7b",
        "messages": [
          {
            "role": "system",
            "content": "Act like you are a helpful assistant."
          },
          {
            "role": "user",
            "content": "Hi there!"
          }
        ],
        "max_tokens": 512,
        "stream": true
    }'
The response may look like this
    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    ...

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"n, ne"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"ed"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":" assi"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"s"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"tan"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"ce wi"},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    ...

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":" "},"finish_reason":null,"content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: {"id":"chat-d821b951d6ff43ab838d18137aef7d0a","object":"chat.completion.chunk","created":1731586102,"model":"meta-llama/llama-3.1-8b-instruct","choices":[{"index":0,"delta":{"content":"just want to chat?"},"finish_reason":"stop","content_filter_results":{"hate":{"filtered":false},"self_harm":{"filtered":false},"sexual":{"filtered":false},"violence":{"filtered":false},"jailbreak":{"filtered":false,"detected":false},"profanity":{"filtered":false,"detected":false}}}],"system_fingerprint":""}

    data: [DONE]
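When consuming the stream without a client library, each `data:` line carries one JSON chunk and the literal `[DONE]` marks the end. The following sketch reassembles the reply text from such lines. `collect_stream_text` is a hypothetical helper, and the sample lines are shortened versions of the chunks above with most fields omitted.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Concatenate delta content from server-sent-event lines."""
    pieces = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        # The first chunk carries only the role, so content may be absent.
        pieces.append(delta.get("content") or "")
    return "".join(pieces)


# Shortened sample chunks (ids, model, and filter fields omitted).
sample_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" assi"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"s"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"tan"},"finish_reason":null}]}',
    "data: [DONE]",
]
stream_text = collect_stream_text(sample_lines)
print(repr(stream_text))
```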
Model Parameters
Feel free to check out our documentation for more details.
First, install the official OpenAI Python client
    pip install 'openai>=1.0.0'
and then you can run inferences with us
Example of Using Chat Completions API
Generate a response using a list of messages from a conversation
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.novita.ai/v3/openai",
        # Get the Novita AI API Key by referring to: https://novita.ai/docs/get-started/quickstart.html#_2-manage-api-key.
        api_key="<YOUR Novita AI API Key>",
    )

    model = "teknium/openhermes-2.5-mistral-7b"
    stream = True  # or False
    max_tokens = 512

    chat_completion_res = client.chat.completions.create(
        model=model,
        messages=[
            {
                "role": "system",
                "content": "Act like you are a helpful assistant.",
            },
            {
                "role": "user",
                "content": "Hi there!",
            }
        ],
        stream=stream,
        max_tokens=max_tokens,
    )

    if stream:
        for chunk in chat_completion_res:
            print(chunk.choices[0].delta.content or "")
    else:
        print(chat_completion_res.choices[0].message.content)
If you set stream = True (the stream variable near the top of the script), the printed output may look like this
    It'
    s
     ni
    ce to
    meet you.
    Is
     the
    re so
    meth
    ing I
     can h
    e
    lp
    you wi
    th t
    oday,
     or
     woul
    d
     you like to chat?
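Because `print` ends each call with a newline, the chunks above appear one per line. A sketch of assembling them into a single string follows. `join_chunks` is a hypothetical helper, and the `SimpleNamespace` objects only stand in for the chunk objects returned by the client so that the example runs offline. The text pieces are illustrative, modelled on the output above.

```python
from types import SimpleNamespace


def join_chunks(chunks) -> str:
    """Concatenate the text deltas of streamed chat completion chunks."""
    return "".join(chunk.choices[0].delta.content or "" for chunk in chunks)


def fake_chunk(text):
    # Stand-in mimicking the attribute path chunk.choices[0].delta.content.
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


# Illustrative pieces; None mimics a chunk that carries no content.
pieces = [None, "It'", "s", " ni", "ce to ", "meet you."]
reply = join_chunks(fake_chunk(p) for p in pieces)
print(reply)
```

With a live stream, passing `end=""` and `flush=True` to `print` inside the loop has a similar effect while still showing text as it arrives.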
If you don't want to receive a response via streaming, simply set stream = False. The output will look like this
    How can I assist you today? Do you have any questions or topics you'd like to discuss?
First, install the official OpenAI JavaScript client
    npm install openai
and then you can run inferences with us in the browser or in Node.js
Example of Using Chat Completions API
Generate a response using a list of messages from a conversation
    import OpenAI from "openai";

    const openai = new OpenAI({
      baseURL: "https://api.novita.ai/v3/openai",
      apiKey: "<YOUR Novita AI API Key>",
    });
    const stream = true; // or false

    async function run() {
      const completion = await openai.chat.completions.create({
        messages: [
          {
            role: "system",
            content: "Act like you are a helpful assistant.",
          },
          {
            role: "user",
            content: "Hi there!"
          }
        ],
        model: "teknium/openhermes-2.5-mistral-7b",
        stream
      });

      if (stream) {
        for await (const chunk of completion) {
          if (chunk.choices[0].finish_reason) {
            console.log(chunk.choices[0].finish_reason);
          } else {
            console.log(chunk.choices[0].delta.content);
          }
        }
      } else {
        console.log(JSON.stringify(completion));
      }
    }

    run();
If you set the stream constant to true (declared just before the run function), the printed output may look like this
    It'
    s
     nic
    e to
     m
    eet you
    . Ho
    w can
    I
     as
    sist
     you
    toda
    y? Do you
    hav
    e any q
    uest
    io
    ns or
     to
    pics you
    '
    d
    li
    ke to
     di
    scuss
    stop
If you don't want to receive a response via streaming, simply set the stream constant to false. The output will look like this
    {
      "id": "chat-a3ff0e39b4c24abcbd258ab1a1f38db9",
      "object": "chat.completion",
      "created": 1731642457,
      "model": "teknium/openhermes-2.5-mistral-7b",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "How can I help you today? Would you like to talk about something specific or just have a chat? I'm here to assist you with any questions or information you might need."
          },
          "finish_reason": "stop",
          "content_filter_results": {
            "hate": { "filtered": false },
            "self_harm": { "filtered": false },
            "sexual": { "filtered": false },
            "violence": { "filtered": false },
            "jailbreak": { "filtered": false, "detected": false },
            "profanity": { "filtered": false, "detected": false }
          }
        }
      ],
      "usage": {
        "prompt_tokens": 46,
        "completion_tokens": 37,
        "total_tokens": 83,
        "prompt_tokens_details": null,
        "completion_tokens_details": null
      },
      "system_fingerprint": ""
    }