### Answer
The difference between your agent's behavior over Twilio and on the web dashboard is likely due to how audio is handled in each environment. Speech-to-Text (STT) models such as Nova2 can perform quite differently on browser audio and on phone-line audio, because the two differ in audio quality and transmission method (1).
For English, Deepgram offers a model fine-tuned specifically for phone calls, which may not be the case for every language or model. Since you're using Deepgram's Nova2, the model may be better suited to web audio than to the compressed, narrowband audio used in phone systems. That would explain why the agent repeats itself or ignores parts of the interaction: it may be failing to transcribe the audio from Twilio phone calls accurately.
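To give a feel for what "compressed audio" means here, the following is a minimal sketch of mu-law companding, the kind of 8-bit encoding commonly used on phone lines (G.711). It uses the continuous textbook formula rather than the bit-exact G.711 codec, and the function names are illustrative only; nothing here is taken from Vapi, Twilio, or Deepgram code.

```python
import math

MU = 255  # companding parameter used by mu-law telephony


def mulaw_compress(x: float) -> float:
    """Map a sample in [-1, 1] onto the mu-law curve (result is also in [-1, 1])."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)


def quantize_8bit(y: float) -> float:
    """Snap a value in [-1, 1] to one of 256 evenly spaced levels."""
    code = round((y + 1) / 2 * 255)
    return code / 255 * 2 - 1


def telephone_round_trip(samples):
    """Compress, quantize to 8 bits, and expand again, as a phone link roughly would."""
    return [mulaw_expand(quantize_8bit(mulaw_compress(s))) for s in samples]


if __name__ == "__main__":
    # A 440 Hz tone sampled at 8 kHz, the narrowband rate typical of phone audio.
    tone = [0.8 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
    decoded = telephone_round_trip(tone)
    worst = max(abs(a - b) for a, b in zip(tone, decoded))
    print(f"largest round-trip error: {worst:.4f}")
```

The round trip never reproduces the input exactly, and phone audio is also limited to a narrow frequency band, so a model tuned mostly on cleaner, wider-band web audio can plausibly transcribe it less reliably.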
One solution suggested in the forums is to try a different STT provider or model that performs better on phone calls. For instance, switching to a different STT engine such as Whisper may give different results. It's worth testing several configurations on real phone calls to find the one that best suits your needs.
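One way to make that comparison less subjective is to place the same scripted test call with each configuration and score the transcripts against what was actually said. Below is a rough sketch that computes word error rate (WER) with a standard word-level edit distance. The configuration labels and transcripts are made-up placeholders, not real output from any provider.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two texts, divided by reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def rank_by_wer(reference: str, transcripts: dict[str, str]) -> list[tuple[str, float]]:
    """Return (label, WER) pairs sorted from most to least accurate."""
    scores = {label: word_error_rate(reference, text) for label, text in transcripts.items()}
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Placeholder data: what the caller said, and what each setup transcribed.
    said = "i would like to book an appointment for tuesday"
    heard = {
        "config-a": "i would like to book an appointment for tuesday",
        "config-b": "i would like a book appointment tuesday",
    }
    for label, wer in rank_by_wer(said, heard):
        print(f"{label}: WER {wer:.2f}")
```

A lower WER on phone-call recordings is a reasonable signal that a given provider or model copes better with your Twilio audio, although a handful of calls is only a rough guide.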
Also make sure your Twilio integration settings and the way you handle audio streams follow Vapi's best practices. A misconfiguration could cause issues like the ones you're seeing.
If the problem persists, contact Vapi's support directly with the call IDs and a detailed description of the behavior. Given your specific use case, they can provide more tailored assistance.