Vapi send partial transcription to custom LLM Vapi AI #support

Marco Falcone

02/04/2025, 3:39 PM

Hi, I'm opening a new ticket on this topic. In my assistant ad661bd8-d418-4408-ba01-c2d4825740d4 I receive the same input multiple times with incremental transcription. For example, in call 259bcdcb-9203-4a44-bb89-13c388a2452d my server receives the attached JSON bodies. As you can see, both inputs have the same number of messages, but the content of the last message in the first request is a substring of the last message in the second request. Is there a way to remove this behaviour or to tell whether the first input is just a partial message? Thanks for your support. https://cdn.discordapp.com/attachments/1336360452208332831/1336360452418175060/first.json?ex=67a38628&is=67a234a8&hm=c6795dd4963812c7b2c3cd010f2b8469bef5d699d7921af995a1ea12c0d97b56& https://cdn.discordapp.com/attachments/1336360452208332831/1336360452938010654/second.json?ex=67a38628&is=67a234a8&hm=43a78db1f2bafccacaa7133206bba42848864e1aa52b1e947ae9831727347d70&
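
The pattern described above can be detected on the receiving server by comparing each request with the previous one for the same call. The following is a minimal sketch, assuming the request bodies carry OpenAI-style message lists (dicts with `role` and `content`); the function name `supersedes` is hypothetical and not part of Vapi.

```python
from typing import Dict, List

Message = Dict[str, str]


def supersedes(previous: List[Message], current: List[Message]) -> bool:
    """Return True if `current` looks like a fuller version of `previous`.

    Heuristic: same number of messages, identical earlier history, same role
    on the last message, and the earlier last message is a proper substring
    of the later one.
    """
    if not previous or len(previous) != len(current):
        return False
    # Everything except the last message must be identical.
    if previous[:-1] != current[:-1]:
        return False
    prev_last, curr_last = previous[-1], current[-1]
    if prev_last.get("role") != curr_last.get("role"):
        return False
    prev_text = prev_last.get("content", "").strip()
    curr_text = curr_last.get("content", "").strip()
    return prev_text in curr_text and prev_text != curr_text
```

This only tells you after the fact that an earlier request was partial; it cannot tell you whether a request that has just arrived will be followed by a fuller one.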

Vapi

02/04/2025, 3:42 PM

@Marco Falcone Hello! It sounds like you're experiencing an issue with incremental transcriptions where partial messages are being sent multiple times, with each subsequent message containing more of the complete sentence. To address this, you can utilize the assistant.voice.inputPunctuationBoundaries setting to help determine the end of a complete message. When the voice input contains punctuation that is listed in inputPunctuationBoundaries, it is considered a valid boundary, indicating the end of a complete message. If you're receiving multiple inputs with incremental transcriptions, you can check if the last message ends with one of the punctuation marks from inputPunctuationBoundaries. If it does not, it's likely that the message is still partial and you may receive a more complete version in a subsequent input. Would you like to adjust inputPunctuationBoundaries to better suit your needs, or do you need further assistance with handling the incremental transcriptions?

Marco Falcone

02/04/2025, 4:10 PM

@Vapi there is no punctuation in the received input

Vapi

02/04/2025, 4:10 PM

@Marco Falcone I understand that the lack of punctuation in the received input makes it challenging to determine the completeness of the message. In this case, you might consider implementing a timeout mechanism on your server to wait for a short period after receiving a message to see if a more complete message arrives. Alternatively, if the incremental transcription is not necessary for your use case, you could check if there's a setting in your voice assistant's configuration to disable incremental transcriptions altogether. If these options are not suitable or if you need more specific guidance on how to implement them, please let me know, and I'll be happy to assist you further.
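
The timeout idea can be sketched as a small per-call debouncer: each incoming request waits briefly, and only the newest request for a given call is answered. This is a rough illustration under assumptions, not a Vapi feature. The class name `RequestDebouncer`, the `call_id` key, and the 0.5 second default are all hypothetical, and how the call ID is read from the request body depends on your payload.

```python
import asyncio
from typing import Dict


class RequestDebouncer:
    """Hold each request briefly; answer only the latest one per call."""

    def __init__(self, wait_seconds: float = 0.5) -> None:
        self.wait_seconds = wait_seconds
        # Maps a call ID to the number of the newest request seen for it.
        self._latest: Dict[str, int] = {}

    async def should_answer(self, call_id: str) -> bool:
        # Register this request as the newest one for the call.
        ticket = self._latest.get(call_id, 0) + 1
        self._latest[call_id] = ticket
        # Wait to see whether a fuller request arrives and replaces it.
        await asyncio.sleep(self.wait_seconds)
        return self._latest[call_id] == ticket
```

In an async request handler you would call `await debouncer.should_answer(call_id)` before invoking the model and skip generation when it returns False. The trade-off is that every reply is delayed by the wait time, which is noticeable in a voice conversation.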

Marco Falcone

02/05/2025, 9:31 AM

How can I disable incremental transcription with Deepgram on Vapi?

Shubham Bajaj

02/05/2025, 8:27 PM

@Marco Falcone Looking into it.

Marco Falcone

02/10/2025, 9:17 AM

Hi @Shubham Bajaj , any news?

Vapi

02/10/2025, 9:18 AM

Hey Marco, this is expected, as we send the request when we think the user has stopped speaking. What you can do is increase waitSeconds to 0.8; it will significantly reduce this issue.
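
For reference, a change like this could be applied by updating the assistant over the REST API. The sketch below only builds the request and does not send it. It assumes that the setting lives at `startSpeakingPlan.waitSeconds`, that assistants are updated with a PATCH to `/assistant/{id}`, and that the base URL is `https://api.vapi.ai`; none of these are confirmed in this thread, so check them against the Vapi API reference. The helper name is hypothetical.

```python
import json
import urllib.request

VAPI_BASE_URL = "https://api.vapi.ai"  # assumed base URL


def build_wait_seconds_update(
    assistant_id: str, api_key: str, wait_seconds: float = 0.8
) -> urllib.request.Request:
    """Build (but do not send) a request that raises the wait before replying."""
    # Assumed field location: startSpeakingPlan.waitSeconds
    body = {"startSpeakingPlan": {"waitSeconds": wait_seconds}}
    return urllib.request.Request(
        url=f"{VAPI_BASE_URL}/assistant/{assistant_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it.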

Marco Falcone

02/10/2025, 9:36 AM

Hi @Sahil I already tried this solution but it doesn't solve the issue. Is there a way to recognize if the input message is partial or complete? This should solve the problem

Vapi

02/10/2025, 9:37 AM

Unfortunately, for now there is no way to recognize it.

Shubham Bajaj

02/10/2025, 10:58 AM

Hey @Marco Falcone, we have checked this carefully and it's purely an endpointing problem. So you need to either increase the wait timeout (in seconds) or use custom endpointing. It is not related to partial transcripts.

Shubham Bajaj

02/10/2025, 10:59 AM

As @Sahil mentioned, you can use 0.8, a good median value for waitSeconds, or use custom endpointing, which works on phrase-based rules.
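
Vapi's actual rule format is not shown in this thread, so the following is only an illustration of what phrase-based endpointing rules could look like in principle: a list of regex patterns, each mapped to a wait time, checked against the latest transcript. All names (`EndpointingRule`, `pick_wait_seconds`, `RULES`) and the example patterns and timings are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class EndpointingRule:
    pattern: str         # regex matched against the latest transcript
    wait_seconds: float  # how long to wait before treating the turn as done


def pick_wait_seconds(
    transcript: str, rules: List[EndpointingRule], default: float = 0.8
) -> float:
    """Return the wait time of the first matching rule, else the default."""
    for rule in rules:
        if re.search(rule.pattern, transcript, flags=re.IGNORECASE):
            return rule.wait_seconds
    return default


RULES = [
    # A trailing conjunction or filler suggests the speaker is not finished.
    EndpointingRule(r"\b(and|but|so|um|uh)\s*$", 1.5),
    # A trailing digit may be part of a longer number still being dictated.
    EndpointingRule(r"\d\s*$", 1.2),
]
```

For example, `pick_wait_seconds("I want a pizza and", RULES)` selects the longer 1.5 second wait, while a transcript that matches no rule falls back to the default.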

Marco Falcone

02/10/2025, 11:00 AM

Can you provide me a link to the documentation for the custom endpointing?

Shubham Bajaj

02/10/2025, 11:06 AM

@Marco Falcone For now it's not in the docs, but if you find it useful we will push it to the documentation: https://gist.github.com/bajajcodes/cd664c81344ef9f27a4d304de43bd313
