Transcriber misses short single word responses
# support
m
I have a previous post, https://discord.com/channels/1211482211119796234/1295378768570880081, that was marked as solved, but only one of its two problems was actually solved. The transcriber still misses most of my responses that consist of short Swedish words like "ja" (yes) and "nej" (no). Longer responses work fine. I have tried Deepgram, Talkscriber and Gladia with the language set to Swedish, and they all have this problem.

I tried the speaking plan I got from @Shubham Bajaj:

"startSpeakingPlan": { "transcriptionEndpointingPlan": { "onPunctuationSeconds": 0.4 }, "waitSeconds": 0.4, "smartEndpointingEnabled": true }

But this did not work.

Example: call id 2c70c5c7-6257-4a8d-a2d8-556a325f481e, recording between 00:25 and 00:50. The AI asks if I'm ready, and I answer:

"Ja" (Yes) ..pause..
"Ja" (Yes) ..pause..
"Ja det är jag" (Yes I am) ..pause..
"Jag är redo" (I am ready)

Only then does the AI answer me. The log shows nothing but multiple "[CHECKPOINT] User speech possibly starting" entries, and only the last "Jag är redo" appears in the transcript.

It's the same problem in most calls. Sometimes it picks up a short "ja" from me, but most of the time it does not. Could there be some Deepgram setting that needs to be changed? I have tried setting endpointing to 10, 200, 300 and 400, for both nova 2 and the enhanced model, but that did not help.
v
@Magnus I'm sorry to hear that you're still experiencing issues with the transcription of short Swedish words. It seems like the problem might be related to the sensitivity of the speech recognition models to short utterances, which can be a common issue.

Since you've already tried adjusting the endpointing settings without success, I would recommend trying the Talkscriber Whisper model, as you mentioned that Deepgram does not support your language effectively. Whisper has been known to perform well with various languages, including Swedish.

To ensure the best possible transcription for short words like "ja" and "nej," you might want to experiment with the following:

1. Adjust the silence detection settings to be more sensitive to short utterances.
2. Use a noise reduction feature to minimize background noise that could be affecting the recognition of short words.
3. If available, enable a setting specifically designed to improve the recognition of short phrases or words.

If you're still facing difficulties after trying these suggestions, please let me know, and we can explore further options or settings that might help resolve this issue.
s
@Magnus you have to experiment with the endpointing value until it works most of the time. You can use either Deepgram's endpointing or Vapi's endpointing (preferred):

"startSpeakingPlan": { "transcriptionEndpointingPlan": { "onPunctuationSeconds": 100 } }, "transcriber": { "endpointing": 100 }
You have to try different endpointing values to make it work for you.
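To make the trial and error less manual, the candidate settings can be generated up front. The following is only a minimal sketch: the field names are copied from the snippets in this thread, while the helper name `build_candidate_configs` is hypothetical, and it assumes that `onPunctuationSeconds` is given in seconds (as its name and the earlier 0.4 value suggest) and that the transcriber's `endpointing` is given in milliseconds. How each fragment gets applied to an assistant is not shown here.

```python
import json


def build_candidate_configs(endpointing_ms_values, on_punctuation_seconds=0.4):
    """Return one config fragment per endpointing value to try.

    Hypothetical helper: the keys mirror the snippets quoted in this thread;
    the units (seconds vs. milliseconds) are assumptions, not confirmed here.
    """
    configs = []
    for endpointing_ms in endpointing_ms_values:
        configs.append({
            "startSpeakingPlan": {
                "transcriptionEndpointingPlan": {
                    # Assumed to be seconds, as in the 0.4 value tried earlier.
                    "onPunctuationSeconds": on_punctuation_seconds,
                },
            },
            "transcriber": {
                # Assumed to be milliseconds, matching the values tried above.
                "endpointing": endpointing_ms,
            },
        })
    return configs


if __name__ == "__main__":
    # Values already mentioned in this thread; extend the list as needed.
    for config in build_candidate_configs([10, 100, 200, 300, 400]):
        print(json.dumps(config))
```

Each printed line is one fragment to try in a test call, so it is easy to note which value first lets a lone "ja" through to the transcript.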