No response after initial message
# support
b
v
Hi there! I see you're having an issue with no response after the initial message in your Vapi Voice AI calls 🔍 This is typically related to how the assistant handles idle time after delivering the initial message.

Based on the documentation, here's what might be happening: the assistant plays the initial message but doesn't follow up if no user response is detected. This can be fixed by implementing proper idle message handling:

1. Check your idle message configuration - the default timeout is 8 seconds, after which a follow-up message should be sent if no response is detected.
2. Make sure you have idle messages configured in your assistant settings.

You can update your assistant's configuration with something like this:
```bash
curl -X PATCH https://api.vapi.ai/assistant/id \
  -H "Authorization: Bearer token" \
  -H "Content-Type: application/json" \
  -d '{
    "messagePlan": {
      "idleMessages": [
        "Are you still there?",
        "Hello, can you hear me?"
      ],
      "idleMessageMaxSpokenCount": 3,
      "idleTimeoutSeconds": 8
    }
  }'
```
If the issue persists, try these additional troubleshooting steps:

- Review any recent changes to your assistant configuration
- Verify your completion message settings
- Check that your response handling logic is working correctly

The recording interface you shared shows a 20-second recording, which suggests the call is connecting but possibly not handling the conversation flow correctly.

Source:
- [Troubleshooting Vapi's Voice AI Platform for Issues with No Response After Initial Message](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1333747605628391434)
b
This is the current message plan:

```json
"messagePlan": {
    "idleMessageMaxSpokenCount": 2,
    "idleTimeoutSeconds": 5
}
```
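(Note: this plan has no `idleMessages` array, which is the main thing the earlier suggestion adds. Merged with the existing values, it would look roughly like the sketch below; the message text is just the example from the reply above:)

```json
"messagePlan": {
    "idleMessages": [
        "Are you still there?",
        "Hello, can you hear me?"
    ],
    "idleMessageMaxSpokenCount": 2,
    "idleTimeoutSeconds": 5
}
```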
p
@barbelos can you try one call with the following startSpeakingPlan and let me know how it goes for you.
```json
"startSpeakingPlan": {
    "waitSeconds": 0.6,
    "smartEndpointingEnabled": false
}
```
b
@Shubham Bajaj Same issue. Call IDs:
- 13551297-b0bd-4df8-a240-c0e3f2c02b2b
- 8a0a51ea-6e34-4f2a-acb5-2251ec1fd7ea
s
After analyzing your call logs, I've traced the sequence of events that occurred when your Arabic-speaking user attempted to interact with the assistant:

1. **Initial User Speech Detection**:
   ```
   VAD Detected: Speech Possibly Starting...
   VAD Detected: Speech Started
   ```
   The Voice Activity Detection (VAD) successfully detected that the user started speaking.

2. **Language Detection**:
   ```
   Language detected vi with confidence 0.06495575
   ```
   The system incorrectly identified the Arabic speech as Vietnamese with 0.06495575 confidence (just below the 0.07 threshold).

3. **Speech Stopping Detection**:
   ```
   VAD Detected: Speech Possibly Stopping...
   VAD Detected: Speech Stopped
   ```
   The VAD correctly detected that the user stopped speaking.

4. **Endpointing Handling**:
   ```
   Endpointing Ignoring VAD Speech Detected. No Active Timeout.
   ```
   This is where the issue occurred. Despite detecting speech start/stop, the system didn't properly handle the endpointing.

## Root Cause

The issue stems from multiple factors:

1. **Language Detection Mismatch**:
   - The configuration specifies Arabic: `"transcriber": {"language": "ar-SA", "provider": "azure"}`
   - The system detected Vietnamese with 0.06495575 confidence (below the 0.07 threshold)
   - This caused confusion in language processing

2. **Endpointing Configuration Issue**:
   - `"smartEndpointingEnabled": false` means basic heuristic endpointing is used
   - This basic endpointing doesn't handle Arabic speech patterns effectively

3. **Transcription Provider Settings**:
   - Azure Speech Service is used with the Arabic configuration
   - But the speech detection confusion caused the pipeline to break down

## Recommended Solutions

### Solution 1: Try LiveKit Endpointing with Adjusted Wait Time

Update your assistant configuration to:
```json
"transcriber": {
  "language": "ar-SA",
  "provider": "azure",
  "codeSwitchingEnabled": false
},
"startSpeakingPlan": {
  "waitSeconds": 0.8,
  "smartEndpointingEnabled": "livekit"
}
```
This configuration:

- Disables language code switching to prevent the Vietnamese misdetection
- Enables LiveKit's smart endpointing, which may handle Arabic better
- Increases `waitSeconds` to give more time for proper speech detection

### Solution 2: If LiveKit Doesn't Work, Try Deepgram

If Solution 1 doesn't resolve the issue, switch to Deepgram:
```json
"transcriber": {
  "language": "ar",
  "provider": "deepgram",
  "model": "nova-2",
  "codeSwitchingEnabled": false
},
"startSpeakingPlan": {
  "waitSeconds": 0.8,
  "smartEndpointingEnabled": false
}
```
Deepgram provides strong support for Arabic and may handle the speech patterns better than Azure in this case.

### Solution 3: Adjust VAD Parameters for Better Detection

If you prefer to stay with Azure, try adjusting the VAD parameters:
```json
"transcriber": {
  "language": "ar-SA",
  "provider": "azure",
  "codeSwitchingEnabled": false
},
"startSpeakingPlan": {
  "waitSeconds": 0.8,
  "smartEndpointingEnabled": false
},
"stopSpeakingPlan": {
  "voiceSeconds": 0.3,
  "silenceSeconds": 1.2
}
```
This configuration:

- Adjusts the `stopSpeakingPlan` to be more sensitive to Arabic speech patterns
- Increases silence detection time to better handle pauses in Arabic speech

## Next Steps

1. Try Solution 1 first (LiveKit with `waitSeconds` = 0.8)
2. If that doesn't work, try either Solution 2 (Deepgram) or Solution 3 (adjusted VAD)
b
I've tried solution 1, and I get the following error:

```
"assistant.transcriber.property codeSwitchingEnabled should not exist"
```

Also, solution 2 doesn't work because I get an error that Deepgram does not support `ar`. When using the Deepgram provider, the error is as follows:

```
"assistant.transcriber.language must be one of the following values for nova-2 model: bg, ca, zh, zh-CN, zh-HK, zh-Hans, zh-TW, zh-Hant, cs, da, da-DK, nl, en, en-US, en-AU, en-GB, en-NZ, en-IN, et, fi, nl-BE, fr, fr-CA, de, de-CH, el, hi, hu, id, it, ja, ko, ko-KR, lv, lt, ms, multi, no, pl, pt, pt-BR, ro, ru, sk, es, es-419, sv, sv-SE, th, th-TH, tr, uk, vi"
```

Solution 3 will not work either because it seems `codeSwitchingEnabled` is not available for Azure transcription, through your endpoint at least!
s
Sorry, it's my mistake: code switching is only available for Deepgram, so it won't work with Azure. My next suggestion is to keep your existing transcriber and apply the recommended `startSpeakingPlan` changes below. Give it a try, make a call, and let me know how it goes for you.
```json
"startSpeakingPlan": {
  "waitSeconds": 0.8,
  "smartEndpointingEnabled": false
}
```
@barbelos For your information, I am putting your ticket under the investigation label so that I can follow up with you later. Let me know how the call goes for you after the suggested `startSpeakingPlan` changes.
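(For reference, applying this change via the same PATCH endpoint shown earlier might look like the sketch below; the assistant ID and token are placeholders:)

```bash
curl -X PATCH https://api.vapi.ai/assistant/id \
  -H "Authorization: Bearer token" \
  -H "Content-Type: application/json" \
  -d '{
    "startSpeakingPlan": {
      "waitSeconds": 0.8,
      "smartEndpointingEnabled": false
    }
  }'
```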
b
Hello @Shubham Bajaj This recommendation didn't work, unfortunately! What seems to be helping, though, is using a different transcriber that supports Arabic:
```json
"transcriber": {
    "language": "ar",
    "model": "whisper",
    "provider": "talkscriber"
}
```
p
Hey Barbelos, Thanks for informing me about the outcome of the suggestions. As you mentioned, you've had to change the transcriber to Talkscriber. I hope it works the way you're expecting it to. Do let me know how it goes for you after making a few test calls.
b
Hey @Shubham Bajaj unfortunately it doesn't perform as well as Azure, which we were able to use with `ar-SA` previously. Was there any recent update to the Azure transcriber on your side? Do you have any other recommendations that would allow us to keep using the Azure transcriber?
p
Hey Barbelos, We cannot make any changes on the Azure transcriber side, as we are just an orchestration layer that integrates the service providers; that behaviour comes directly from the Azure transcriber itself. I would suggest you look for transcribers that have good support for the Arabic language, as you require. If they support streaming, you can use them with VAPI as a custom transcriber. If you decide to go that route, just let me know and I'll share the rest of the process with you. As of now, most transcribers are optimized for English, so they sometimes fail to perform exceptionally well on non-English languages.
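(For illustration only: a custom transcriber is generally wired up by pointing the assistant at your own streaming transcription server. A rough sketch of what that configuration might look like is below; the provider name, `server` field, and URL are assumptions to verify against the current Vapi docs before use:)

```json
"transcriber": {
  "provider": "custom-transcriber",
  "server": {
    "url": "wss://your-transcriber.example.com/transcribe"
  }
}
```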
b
I was referring to any configuration changes/defaults with the Azure transcriber. Okay, feel free to close this thread while we look for a custom Arabic-focused transcriber.
p
By default, we don't pass anything extra to the Azure transcriber config or any other transcriber configs. If we expose any parameters and you use them, we pass them through to Azure as-is. So yes, you're covered on this part.
Marking this ticket as Solved ✅