AI Provider Latency Mysteriously Increased
# support
l
The latency for all AI providers has increased and is affecting my agents. Please see the attachment. The listed 1000ms is not reflected in the breakdown, which shows 2000ms. My agents have gone from 1200ms to 2600ms without any changes on my end. https://cdn.discordapp.com/attachments/1369815265977700473/1369815266527150080/Vapi_1.png?ex=681d3b61&is=681be9e1&hm=a3ee4101edcc9ca0d583ca0590d22c9d19565234df2a3ed2674da99ddaa2f7ca&
v
## Check Network Routing Configuration

Your latency increase might be due to traffic not being routed through the nearest data center. Contact Vapi support to verify your traffic is using the optimal regional endpoint for your location, as this can significantly impact response times.

```
# Example support request
"I'm experiencing increased latency with claude-3-5-sonnet-20241022 (2600ms vs previous 1200ms).
Could you check if my traffic is routing through the closest data center?"
```

Sources:
- [Discord discussion about routing issues](https://discord.com/channels/1211482211119796234/1333747605628391434)
- [Discord thread about Claude model issues](https://discord.com/channels/1211482211119796234/1331340140064407573)
Based on the information available, there are known issues with Anthropic Claude models in Vapi. Try these troubleshooting steps:

1. Check network connectivity between your system and the API endpoints.
2. Review your Vapi configuration, especially settings like:

   ```
   silenceTimeoutSeconds
   responseDelaySeconds
   llmRequestDelaySeconds
   ```

3. Contact Vapi support with your organization ID and detailed error logs if the issue persists.

Sources:
- [Discord Channel](https://discord.com/channels/1211482211119796234/1332286268805087272)
- [Discord Channel](https://discord.com/channels/1211482211119796234/1339232387279622154)
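Before contacting support, it can help to rule out these settings as the cause. A minimal sketch of checking an assistant config for them (the dict shape and example values here are assumptions for illustration, not the official Vapi schema):

```python
# Sketch: scan an assistant config dict for the timing-related settings
# named above, which add delay on top of the model's own latency.
# The config shape below is an assumption, not the official Vapi schema.

TIMING_KEYS = (
    "silenceTimeoutSeconds",
    "responseDelaySeconds",
    "llmRequestDelaySeconds",
)

def timing_settings(config: dict) -> dict:
    """Return any timing-related settings present in the config."""
    return {k: config[k] for k in TIMING_KEYS if k in config}

# Hypothetical assistant config for illustration
config = {
    "model": "claude-3-5-sonnet-20241022",
    "responseDelaySeconds": 0.4,
    "llmRequestDelaySeconds": 0.1,
}
print(timing_settings(config))
# {'responseDelaySeconds': 0.4, 'llmRequestDelaySeconds': 0.1}
```

If none of these are set (or they match your previous values), the extra latency is coming from the provider side rather than your configuration.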
@User I'm experiencing increased latency with claude-3-5-sonnet-20241022 (2600ms vs previous 1200ms). Other models and providers affected as well. Could you check if my traffic is routing through the closest data center? Org ID: 0153f067-55e2-429a-8088-03a4f01ff2bf
Can @User m Bajaj or @User look into this?
u
I have experienced your issue as well. But to clarify: the latency of claude-3.5 is 2000ms.
l
Claude 3.5 used to be 1,000ms. It jumped to 2,000ms overnight in the breakdown without any changes on my end, and it still shows 1,000ms in the dropdown menu.
s
@LongBeard can you share your assistant ID?
l
20bd47a4-e7ef-4357-b0b5-9f4832979b2b. Although it's all my assistants.
k
The discrepancy you're experiencing (1000ms in the dropdown vs 2000ms in the breakdown) is likely due to:

1. The dropdown menu may be displaying the default Anthropic latency (1000ms) instead of the specific model latency.
2. The breakdown view is correctly using the specific model latency (2000ms).

This is a UI inconsistency, but the actual latency value in the codebase for Claude 3.5 Sonnet (Beta Computer Use) is correctly set to 2000ms. The increase from 500ms to 2000ms was a recent change, which appears to reflect actual performance measurements of the model.
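As a rough illustration of why the single LLM entry dominates the total: the dashboard's total latency estimate is just the sum of the per-stage entries, so a change to the LLM value moves the total one-for-one. The stage names and values below are assumptions for illustration, not Vapi's actual breakdown:

```python
# Sketch: a latency estimate as the sum of per-stage entries.
# Stage names and values are assumed for illustration only.

def total_latency_ms(breakdown: dict) -> int:
    """Sum the per-stage latency entries into a total estimate."""
    return sum(breakdown.values())

before = {"transcriber": 100, "llm": 1000, "voice": 100}
after = dict(before, llm=2000)  # LLM entry doubled overnight

print(total_latency_ms(before))  # 1200
print(total_latency_ms(after))   # 2200
```

With these assumed numbers the total jumps from 1200ms to 2200ms purely from the LLM entry; the 2600ms the user reports would imply other stages shifted as well.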