### Answer
It sounds like the issue you're encountering relates to how the response from your custom LLM is being streamed back to Vapi. For Vapi to receive and process responses from a custom LLM, including a service like Mixtral, the response has to be sent back in a format Vapi can understand and handle.
According to the documentation, Vapi supports using any OpenAI-compatible endpoint as the LLM. This includes setting up your own server to act as the LLM, which requires exposing an endpoint compatible with the OpenAI client and, ideally, supporting streaming completions [(1)] [(2)]. If your server is itself calling an OpenAI-compatible API, you can pipe that streamed response directly back to Vapi [(1)].
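As a rough illustration of the "pipe it straight back" approach, here is a minimal sketch of a Flask server that exposes an OpenAI-compatible `/chat/completions` route and relays the upstream provider's SSE stream unchanged. The `UPSTREAM_URL` and environment variable names are placeholders, not values from the Vapi docs; substitute whatever OpenAI-compatible provider hosts your Mixtral model.

```python
# Minimal sketch: proxy an OpenAI-compatible streaming completion back to Vapi.
# UPSTREAM_URL / UPSTREAM_API_KEY are placeholders for your Mixtral provider.
import os
import requests
from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

UPSTREAM_URL = os.environ.get("UPSTREAM_URL", "https://example-provider/v1/chat/completions")
UPSTREAM_KEY = os.environ["UPSTREAM_API_KEY"]


@app.post("/chat/completions")
def chat_completions():
    body = request.get_json()
    body["stream"] = True  # ask the upstream provider for a streamed completion

    upstream = requests.post(
        UPSTREAM_URL,
        headers={"Authorization": f"Bearer {UPSTREAM_KEY}"},
        json=body,
        stream=True,
    )

    # Relay the upstream SSE stream to Vapi chunk for chunk, without buffering.
    def relay():
        for chunk in upstream.iter_content(chunk_size=None):
            if chunk:
                yield chunk

    return Response(stream_with_context(relay()), mimetype="text/event-stream")
```

Because the upstream response is already in the OpenAI streaming format, forwarding the raw bytes with the `text/event-stream` content type is usually all that's needed.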
It's also important that your response format matches what Vapi expects. This typically means streaming the response back to Vapi as SSE (Server-Sent Events), especially if you're generating the responses asynchronously rather than proxying another API [(3)].
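If you're generating tokens yourself (for example, from a locally hosted Mixtral) rather than proxying another API, each SSE event should be shaped like an OpenAI `chat.completion.chunk` object, followed by a final `data: [DONE]` event. The sketch below shows that framing under those assumptions; the model name, id, and token source are illustrative placeholders.

```python
# Minimal sketch: hand-built SSE events in the OpenAI chat.completion.chunk shape.
# The token list stands in for whatever your Mixtral backend actually streams.
import json
import time
from flask import Flask, Response, request

app = Flask(__name__)


def sse_events(tokens):
    for token in tokens:
        chunk = {
            "id": "chatcmpl-custom",            # any stable id works
            "object": "chat.completion.chunk",
            "created": int(time.time()),
            "model": "mixtral-8x7b",             # placeholder model name
            "choices": [{
                "index": 0,
                "delta": {"content": token},
                "finish_reason": None,
            }],
        }
        yield f"data: {json.dumps(chunk)}\n\n"
    # Terminate the stream the same way the OpenAI API does.
    yield "data: [DONE]\n\n"


@app.post("/chat/completions")
def chat_completions():
    # Replace with tokens streamed from your Mixtral backend.
    tokens = ["Hello", " from", " a", " custom", " LLM!"]
    return Response(sse_events(tokens), mimetype="text/event-stream")
```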
If you're still facing issues after following these guidelines, review the specific setup of your custom LLM integration and confirm that every configuration matches the Vapi documentation. Checking the logs on both your custom LLM server and Vapi's side should also help pinpoint where the breakdown is occurring.