Hi VAPI Support Team,
I’m reaching out regarding issues we’re facing while using the VAPI custom-llm feature to enable voice as a channel for our existing text-based chatbot (which includes its own orchestration layer for handling text queries).
While we are successfully generating and streaming output chunks from our backend (via the chat/completions API hosted on our server), we’ve observed that the transcripts on the VAPI side are often incomplete. Although the full model output appears correctly in the VAPI call logs, the corresponding transcripts tend to get cut off mid-sentence, resulting in poor voice quality and frequent drops.
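For context on what our server emits, here is a minimal, simplified sketch of how we stream chunks in the OpenAI-compatible chat/completions SSE format (the id and payload shape are illustrative placeholders, not our production values):

```python
import json

def sse_chunk(delta_text, finish_reason=None):
    # Format one streamed token as an OpenAI-compatible SSE event
    # ("chat.completion.chunk"), the shape a custom-llm endpoint returns.
    payload = {
        "id": "chatcmpl-demo",  # placeholder id for illustration
        "object": "chat.completion.chunk",
        "choices": [{
            "index": 0,
            "delta": {"content": delta_text} if delta_text else {},
            "finish_reason": finish_reason,
        }],
    }
    return f"data: {json.dumps(payload)}\n\n"

def stream_response(tokens):
    # Yield one SSE event per chunk, a final chunk with finish_reason,
    # then the terminating [DONE] sentinel.
    for tok in tokens:
        yield sse_chunk(tok)
    yield sse_chunk(None, finish_reason="stop")
    yield "data: [DONE]\n\n"
```

Our server follows this pattern (each chunk terminated with a blank line, followed by the `[DONE]` sentinel), and the full concatenated output appears correctly in the VAPI call logs; only the transcript/TTS side shows the truncation.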
More specifically, we’ve encountered two recurring issues:
- TTS Not Triggering or Voice Input Missing
In some calls, TTS is never triggered and the voice input query itself does not appear to be captured.
Example call: a9cfa29f-52b4-472a-a0c0-6be82f07b9a4
- Incomplete Assistant Response in Transcripts
In other cases, the full voice input is correctly logged, but the assistant's response in the transcript is truncated and mis-segmented.
Example call: 440410f2-0acc-4b63-90a4-7acf29d75787
From the logs, we can see the voice output after TTS:
VOICE INPUT: Of course, I’d be happy to help you update your order! To get started,
VOICE INPUT: would you like to use the same phone number you’re calling from to look up your order?
But the assistant transcript is broken into fragments with incorrect sentence boundaries:
"Of course, I'd be happy to help you. Update your order. Get started, would you like to use the same phone number..."
We (LevelAI) have been seriously considering VAPI as the platform to power our voice integration, but the current behavior is significantly degrading the user experience. We'd appreciate your help investigating these issues and letting us know whether there's a way to address them.
Looking forward to your support.
Thanks,
Arshdeep Singh
LevelAI