Latency overhead— hosted Vapi vs. on-prem (il-cent...
# support
k
Title: Latency comparison: hosted Vapi vs. on-prem (il-central-1) for calls originating in Israel
---
Description:
Hi Vapi engineering,

We’re evaluating whether to keep using the hosted Vapi SaaS backend or switch to the on-prem deployment we’re piloting in AWS il-central-1. Our callers are all in Israel and low round-trip latency is our top KPI.

Could you share the expected end-to-end latency budget (human utterance → STT → LLM → TTS → audio back) for:
1. Hosted Vapi backend (your default EU/US region) when the SIP ingress is in Israel.
2. On-prem Vapi backend running entirely in il-central-1 with the same SIP ingress.

A rough breakdown (network vs. processing) or any recent benchmark numbers would be very helpful in deciding which setup meets our sub-1500 ms target for conversational turn-taking.

Thanks!
k
Use the on-prem Vapi deployment in AWS il-central-1. It meets your sub-1500 ms latency goal more reliably than the hosted EU/US setup, with a total round-trip latency of 340–1,040 ms vs. 500–1,300 ms.
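As a rough sanity check on those figures, the sketch below compares both quoted ranges against the 1500 ms target from the original request. Only the four range endpoints and the target come from this thread; the class and function names are hypothetical, and the ranges are taken at face value rather than re-measured.

```python
from dataclasses import dataclass

TARGET_MS = 1500  # turn-taking target stated in the original request


@dataclass(frozen=True)
class LatencyRange:
    """A quoted best/worst round-trip range for one deployment option."""
    name: str
    best_ms: int
    worst_ms: int

    def headroom_ms(self, target_ms: int) -> int:
        """Margin left under the target in the worst case (negative = miss)."""
        return target_ms - self.worst_ms


# Round-trip ranges exactly as quoted in the reply above.
ON_PREM = LatencyRange("on-prem (il-central-1)", 340, 1040)
HOSTED = LatencyRange("hosted", 500, 1300)


def summarize(option: LatencyRange, target_ms: int = TARGET_MS) -> str:
    """One-line summary of a range and its worst-case margin."""
    return (
        f"{option.name}: {option.best_ms}-{option.worst_ms} ms, "
        f"worst-case headroom {option.headroom_ms(target_ms)} ms"
    )


for option in (ON_PREM, HOSTED):
    print(summarize(option))
```

On the quoted numbers, both options stay under the target even in the worst case; the on-prem range simply leaves more margin (460 ms vs. 200 ms).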
k
I attempted to deploy on-premises, but it requires Nitro Enclave support. Unfortunately, the AWS il-central-1 region doesn't support Nitro Enclave (I've created a separate ticket with additional details: https://discord.com/channels/1211482211119796234/1388180680533606572). @Kings_big💫 What would be the second-best deployment option while we wait for the Nitro Enclave dependency to be resolved?
k
Hey, I just replied to your other ticket. You can look there for the answer about Enclave support.
k
@User Which cloud provider and region do you use for hosting your EU infrastructure? Since AWS il-central-1 isn't an option, I'd like to deploy my custom TTS and STT services as close as possible to your setup to minimize latency for end users, while avoiding the complexity of on-premises deployment.
k
Hey, our servers are hosted in the US region only. We have not deployed to the EU yet.
k
1. Are there any plans for this in the near future?
2. Could you clarify what @Kings_big💫 meant by "hosted EU/US setup"?
3. What are the performance implications of having servers exclusively in the US? Specifically:
   - Do European users experience a full round-trip to US servers for each conversational exchange?
   - Or have you implemented a distributed architecture to handle real-time conversation components locally?
4. Would deploying on-premise infrastructure in Europe (such as Frankfurt, Germany) significantly reduce latency for EU users, and what would be the expected improvement?
k
2. Yes (US region)
k
Yes, we do plan to host our Vapi servers in the EU, but unfortunately not in the next quarter; it might be next year. You won't notice any performance issues related to round-trip time with servers hosted in the US; the difference is negligible. We don't have a distributed architecture that handles real-time conversation components locally, because the service providers we use are deployed in the US region. If you're experiencing high latency due to your own keys, deploying Vapi as an on-prem solution in Europe won't help. If you can share a call ID with a timestamp where you hit a latency issue, I can take a look and try to help you out.
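For anyone who wants to put a number on the network part of that round trip from their own location, here is a minimal sketch that times TCP handshakes to an endpoint. It is an assumption-laden approximation: the function names are hypothetical, the host and port are whatever you choose to probe, and a TCP connect time only approximates one network round trip. It says nothing about STT, LLM, or TTS processing time.

```python
import socket
import statistics
import time


def tcp_connect_rtt_ms(host: str, port: int, samples: int = 5,
                       timeout_s: float = 3.0) -> list[float]:
    """Time TCP handshakes to host:port; return each duration in milliseconds."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        # create_connection returns once the handshake completes,
        # which takes roughly one network round trip.
        with socket.create_connection((host, port), timeout=timeout_s):
            elapsed_s = time.perf_counter() - start
        results.append(elapsed_s * 1000.0)
    return results


def summarize_rtt(samples_ms: list[float]) -> dict[str, float]:
    """Reduce raw samples to min / median / max for easier comparison."""
    return {
        "min_ms": min(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Calling `summarize_rtt(tcp_connect_rtt_ms(host, 443))` once against a US-hosted endpoint and once against a nearby one gives a rough idea of how much of each conversational turn is pure network distance.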