custom llm response
# support
I am using the Mixtral model to generate the desired outputs, and I am integrating my custom LLM through the Vapi interface. When I call the agent, my LLM receives the request from Vapi, but the response I send back does not reach the Vapi interface: the agent says nothing and the call ends. Could you tell me how to fix this problem?
Hi @nabeel1956, I'm an AI assistant for Vapi. I'll analyze available sources and get back to you shortly.
**While I generate an answer, here are some sources that might be helpful**:
* [Custom LLM]()
* [Frequently Asked Questions]()
* [Prompting Guide]()
* [Introduction]()
### Answer
It seems like the issue you're encountering might relate to how the response from your custom LLM is being streamed back to Vapi. For Vapi to properly receive and process responses from a custom LLM, including services like Mixtral, the response needs to be sent back in a manner that Vapi can understand and handle appropriately.

According to the documentation, Vapi supports using any OpenAI-compatible endpoint as the LLM. This includes the ability to set up your server to act as the LLM, which requires creating an endpoint compatible with the OpenAI Client and ideally supporting streaming completions [(1)]() [(2)](). If your server is making calls to an OpenAI-compatible API, you can pipe the requests directly back in your response to Vapi [(1)]().

Furthermore, it's important to ensure that your response format matches what Vapi expects. This might involve using SSE (Server-Sent Events) to stream the response back to Vapi if you're processing the responses asynchronously [(3)]().

If you're still facing issues despite following these guidelines, it could be beneficial to review the specific setup details of your custom LLM integration, ensuring all configurations are correctly applied as per the Vapi documentation. Additionally, checking the logs on both your custom LLM and Vapi's side might provide more insights into where the breakdown is occurring.
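To make the streaming point above concrete, here is a minimal sketch of what an OpenAI-style streamed completion looks like when framed as Server-Sent Events. It is an illustration under assumptions, not Vapi's documented contract: the function name `sse_chunks`, the default model name `"mixtral"`, and the word-by-word splitting are all hypothetical, and the chunk fields follow the general OpenAI `chat.completion.chunk` shape.

```python
import json
import time
import uuid
from typing import Iterator


def sse_chunks(text: str, model: str = "mixtral") -> Iterator[str]:
    """Yield OpenAI-style streaming chunks framed as Server-Sent Events.

    `sse_chunks` and the default model name are illustrative assumptions.
    """
    completion_id = f"chatcmpl-{uuid.uuid4().hex}"
    created = int(time.time())

    def frame(delta: dict, finish_reason=None) -> str:
        payload = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": model,
            "choices": [
                {"index": 0, "delta": delta, "finish_reason": finish_reason}
            ],
        }
        # Each SSE event is a "data:" line followed by a blank line.
        return f"data: {json.dumps(payload)}\n\n"

    # The first chunk announces the assistant role.
    yield frame({"role": "assistant"})

    # Stream the text word by word; a real model would yield tokens instead.
    words = text.split(" ")
    for i, word in enumerate(words):
        piece = word if i == len(words) - 1 else word + " "
        yield frame({"content": piece})

    # The last JSON chunk carries finish_reason, then the stream terminator.
    yield frame({}, finish_reason="stop")
    yield "data: [DONE]\n\n"
```

Note that streamed chunks carry a `delta` object rather than a full `message`, which is the main difference from a non-streaming `chat.completion` body.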
Can you share your call_id or anything so that I can look into it?
I am sending the response from my LLM in the following format: { "choices": [ { "finish_reason": "stop", "index": 0, "message": { "content": resp, "role": "assistant" }, "logprobs": "null" } ], "created": 1677664795, "id": "chatcmpl-7QyqpwdfhqwajicIEznoc6Q47XAyW", "model": "gpt-3.5-turbo-0613", "object": "chat.completion", "usage": { "completion_tokens": 17, "prompt_tokens": 57, "total_tokens": 74 } }
Is this correct, or do I need to change it? Thanks for your help.
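Two details in the payload above are worth checking, though nothing in this thread confirms that either one is what ends the call. First, `"logprobs": "null"` is the string `"null"`, whereas OpenAI-style responses use a JSON `null`. Second, `resp` has to be serialized as a JSON string. A minimal sketch of building the same non-streaming body with `json.dumps`, which handles both; the function name `build_completion`, the placeholder id, the default model name, and the zeroed usage counts are assumptions for illustration:

```python
import json
import time


def build_completion(resp: str, model: str = "mixtral") -> str:
    """Serialize a non-streaming chat.completion body as JSON text.

    The id, model name, and usage numbers are placeholders, not real values.
    """
    body = {
        "id": "chatcmpl-example",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": str(resp)},
                # Python None is serialized as JSON null, not the string "null".
                "logprobs": None,
                "finish_reason": "stop",
            }
        ],
        "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
    }
    return json.dumps(body)
```

If the client requests a streamed completion, a single body like this may still not be read as speech, since streaming clients expect `chat.completion.chunk` events instead.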
@nabeel1956 Did you add your OpenAI key in the CustomLLM field in the provider key section?
Hey @Sahil, can you please share the code for using my custom LLM with me?
How should I set up customLLM/chat/completions to send the response?
Please check out the custom LLM guide.
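For readers who want a starting point alongside the guide, here is a rough sketch of a `/chat/completions` endpoint that streams its reply as Server-Sent Events, using only the Python standard library. It assumes the caller POSTs an OpenAI-style body with a `messages` list; `generate_reply`, `ChatCompletionsHandler`, `serve`, the port, and the placeholder id are all hypothetical names, and the exact request Vapi sends should be confirmed against its documentation and your own server logs.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def generate_reply(messages: list) -> str:
    """Hypothetical stand-in for the call to your own model (e.g. Mixtral)."""
    last_user = next(
        (m.get("content", "") for m in reversed(messages) if m.get("role") == "user"),
        "",
    )
    return f"You said: {last_user}"


class ChatCompletionsHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only serve the OpenAI-style chat completions route.
        if not self.path.rstrip("/").endswith("/chat/completions"):
            self.send_error(404)
            return

        # Read and parse the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        request = json.loads(self.rfile.read(length) or b"{}")
        reply = generate_reply(request.get("messages", []))

        # Announce an event stream before writing any chunks.
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")
        self.end_headers()

        def send(data: str) -> None:
            # One SSE event: a "data:" line followed by a blank line.
            self.wfile.write(f"data: {data}\n\n".encode("utf-8"))
            self.wfile.flush()

        def chunk(delta: dict, finish_reason=None) -> str:
            return json.dumps({
                "id": "chatcmpl-example",  # placeholder id
                "object": "chat.completion.chunk",
                "created": int(time.time()),
                "model": request.get("model", "mixtral"),
                "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
            })

        # The whole reply goes out in one content chunk to keep the sketch short.
        send(chunk({"role": "assistant", "content": reply}))
        send(chunk({}, finish_reason="stop"))
        send("[DONE]")

    def log_message(self, format, *args):
        pass  # silence per-request logging


def serve(host: str = "0.0.0.0", port: int = 8000) -> None:
    """Run the sketch server until interrupted."""
    ThreadingHTTPServer((host, port), ChatCompletionsHandler).serve_forever()
```

Calling `serve()` starts the server; the custom LLM URL configured in Vapi would then point at wherever this server is publicly reachable. In a real integration, `generate_reply` would forward the messages to the model and each token would be sent as its own chunk as it arrives.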