Connecting VAPI with Azure OpenAI #2
# support
b
I'm experiencing the same issue on eastus2 as @Mario did. I also need swedencentral for my implementation, but I read in the post linked below that some new models aren't available in swedencentral yet, so I decided to test with eastus2 while waiting, and I'm getting the same thing.

I struggled for a day to get Azure to connect under providers, and eventually found that I had to remove the version number from the endpoint URL ("https://.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version"), and if I added the OCP-APIM-Subscription-Key header it would not connect. I had to leave OCP-APIM-Subscription-Key blank and only add the endpoint URL and api-key in the Azure OpenAI provider settings for it to work. Could this be the problem? I haven't used a knowledge base yet, like @Mario has.

I also read that the error "An error occurred in this call: pipeline-error-azure-openai-llm-failed", which I'm getting in the logs, could be related to an incorrect api-version. How does Vapi know what the api-version is if it's not given in the provider settings? I can get the call to work in Postman with api-version=2025-01-01-preview.

I get the same error "Exiting meeting because the room was deleted": when I start the call I get the first message, I respond, but get nothing back from Azure, eventually erroring out after 30-40 seconds.

I'm trying to narrow down the problem, and I'm looking at these three possibilities:
- Does eastus2 also have models that aren't supported, like swedencentral?
- Removing api-version=2025-01-01-preview from the endpoint URL works to connect, but how do I check which api-version Vapi is using when calling Azure?
- The only way I can connect the provider is by leaving OCP-APIM-Subscription-Key blank; is this fine? Adding the subscription key causes 401s, and I have checked that the subscription keys are correct, so I'm not sure why I'm getting that error.

https://discord.com/channels/1211482211119796234/1343704609096601693/1343704609096601693
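To make the two configurations concrete, here is a minimal sketch of the difference between the Postman-style direct call and the provider settings that reportedly worked in Vapi. The resource name and key are placeholders, not real values:

```python
# Hypothetical illustration of the two Azure OpenAI configurations described
# above. RESOURCE and the api-key value are placeholders.
from urllib.parse import urlencode

RESOURCE = "my-resource"  # placeholder Azure OpenAI resource name
BASE = f"https://{RESOURCE}.openai.azure.com/openai/deployments/gpt-4o/chat/completions"

# Direct Postman-style call: api-version in the query string, api-key header.
direct_url = f"{BASE}?{urlencode({'api-version': '2025-01-01-preview'})}"
direct_headers = {"api-key": "<azure-api-key>"}

# Provider settings that reportedly worked in Vapi: endpoint URL with no
# api-version, api-key only, and Ocp-Apim-Subscription-Key left blank.
vapi_endpoint = BASE
vapi_headers = {"api-key": "<azure-api-key>"}  # no Ocp-Apim-Subscription-Key

assert "api-version" not in vapi_endpoint
assert "Ocp-Apim-Subscription-Key" not in vapi_headers
```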
Hi @Mario, have you been successful getting Azure to work?
m
Nope. 😔 I think the only solution is VAPI providing a finished solution for EU data protection restrictions.
k
We use the official Azure OpenAI SDK, which handles API versioning internally. The SDK automatically selects an appropriate API version, which is why you don't need to specify it in the endpoint URL. When you manually configure Azure OpenAI in Postman with api-version=2025-01-01-preview, you're explicitly setting a version; when integrating through Vapi, the SDK manages this for you.

Your issue is that you're trying to use a model that's not yet available in eastus2 or swedencentral:
- gpt-4o-mini-2024-07-18 is explicitly defined for eastus2 in the code
- But newer models like gpt-4o-2024-08-06 are only defined for westus

Yes, it's completely fine to leave OCP-APIM-Subscription-Key blank. For now you can try using the westus or westus3 regions for gpt-4o-2024-08-06.

https://cdn.discordapp.com/attachments/1348382655762268200/1349001104234647612/Screenshot_2025-03-11_at_6.12.23_PM.png?ex=67d182ae&is=67d0312e&hm=56ea1f5b02acfd987f012ec5076305776d78991dc52b35a8fa3d25e0e7601f0d&
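Whichever api-version the SDK pins ends up as a query parameter on every request it sends. A rough stdlib-only sketch of that URL construction (the version string and resource name here are placeholder assumptions, not Vapi's actual values):

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_chat_url(endpoint: str, deployment: str, api_version: str) -> str:
    """Build an Azure OpenAI chat-completions URL the way an SDK would,
    attaching the api-version as a query parameter."""
    path = f"/openai/deployments/{deployment}/chat/completions"
    return f"{endpoint.rstrip('/')}{path}?{urlencode({'api-version': api_version})}"

# Placeholder resource and version for illustration only.
url = build_chat_url("https://my-resource.openai.azure.com", "gpt-4o", "2024-10-21")
assert parse_qs(urlsplit(url).query)["api-version"] == ["2024-10-21"]
```

So even when you leave the version out of the endpoint you configure, the request that actually reaches Azure still carries one.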
@BAS014 and @Mario, our team is at full capacity and may take longer to complete this task. Please use your own key, and it should work for you. Let me know if you need further help or have any questions.
b
Hi, thanks for the reply. Yeah, I got it to work on westus3, thanks. Keep us updated on any progress for the UK or euro region for any new models.
w
thanks dude !!!
b
Hi @Shubham Bajaj, sorry to raise this again, but any progress on having gpt-4o-2024-08-06 in an EU or UK region yet? I'm using westus3 at the moment, but the latency is massive, sometimes even timing out. I hope a closer region that Vapi supports will help this, but I'm not sure. If I test with OpenAI directly, my response is almost instant. Also, we can't go to prod without Vapi support in the EU or UK, unfortunately. @Mario, if you are testing a US region with Azure, what is your latency like? Thanks
o
A feature request was raised here; have a look and feel free to upvote. https://discord.com/channels/1211482211119796234/1343704609096601693/1351439894995800175
w
Yeah, I thought that it was fixed. It helps to accept, but it's definitely unusable when calling.
Did you find an alternative ?
I've been on this for 4-5 days now... completely stuck.
b
Hey @wizzy, sorry for the late reply. I'm going to have to wait for Vapi support in the EU to see if the latency issue improves, but I'm starting to think it can't be this slow just because I'm in the UK and the model is in westus3. It might be that the Azure/Vapi path is not optimal... I can't go to prod anyway until Vapi supports the EU, but time is tight...

The timeout is 20 seconds... I can't see that being a latency issue at the moment. If I call my Azure model directly from Postman, the response time is about 700ms, but with Vapi it times out at 20 seconds. westus3 with Azure on Vapi works, but only about 40% of the time; the rest times out. I've only got westus3 to work.

@Shubham Bajaj, if you speak to your team, could they give an indication of where the bottleneck is? Besides project time pressure, I don't want to wait for EU support and then find we still have slow response times. This is a deciding factor for us... Let me know if I should open a new ticket.
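When comparing the ~700ms direct-call time against the 20s Vapi timeout, it helps to time each hop the same way. A small helper like this (a generic sketch, not Vapi tooling) can wrap the direct Azure request and any other path you can call yourself, so the numbers are comparable:

```python
import time

def measure_latency_ms(fn, *args, **kwargs):
    """Time a single call (e.g. one chat-completions request) in milliseconds.

    Returns (result, elapsed_ms) so the response can still be inspected.
    """
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms
```

For example, `measure_latency_ms(requests.post, url, headers=headers, json=body)` around the direct Azure endpoint would confirm the ~700ms figure, which localizes the remaining 19+ seconds to the path between Vapi and Azure rather than to the model itself.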
s
@BAS014 @wizzy @Oliver @Mario Guys, I suggest you upvote the following ticket, and from there onward I will try to push it as much as possible. https://vapi.canny.io/feature-requests/p/support-for-azure-openai-in-the-eu-gpt-4o-2024-11-20-in-sweden-central-expanded
@BAS014 for your latency issue, can you share the call ID? I'll take a look and see what's going on.
m
Already did. 😊
b
Hi @Shubham Bajaj, sorry for the late reply. This is an example call with Azure OpenAI GPT-4o-2024-08-06 in westus3: 96eaebcd-5481-4261-bc17-88f0725cbb2f. Using OpenAI directly is pretty much instant. Let me know if there's anything I should change or check for you.
Hi @Shubham Bajaj, hope you're well. Did you find anything regarding the latency on that call ID?
k
Sorry I was out of office yesterday, so couldn't reply to you earlier. Now I'm starting work on it.
Hey, BAS014. You are using the GPT-4o latest model, and it doesn't support tool calling. When the assistant's turn comes, the request goes to this model, which eventually fails or errors out, and then it falls back to the 4o variant. With the 4o-latest model, this fallback sometimes adds latency because it fails to generate a response in the required time. That's why you observe the perceived latency in your calls. I would suggest you:
1. Use a different GPT-4o model
2. Try with our OpenAI key
That should resolve your latency issue.
let me know how it goes for you.
b
Hi @Shubham Bajaj, apologies for the very late response, I had to divert my attention. I'm back. Thanks for the reply; yeah, that makes sense. I have changed to the GPT-4o model and it seems to be getting further: my first tool call is made and the assistant relays the result to the user, but after that it times out with "Model request taking too long (attempt #1, started 20000ms ago)". Before, when using GPT-4o latest, it mostly timed out before the first tool call.

I am still on westus3; I'm not sure of the progress on support for GPT-4o 2024-08-06 in Sweden yet...

I see in point two you mention using your OpenAI key. Are you referring to an Azure OpenAI key of yours? I need to use Azure because of GDPR here in EU regions. Using OpenAI directly works well, so I'm still not 100% sure where the issue could be.

Here are two of my call IDs:
7c582d38-8bfb-44ba-a35b-c9d0937d6981
6578f344-d2fa-4b9e-8bbd-375dd18f1834
Thanks @Shubham Bajaj
k
This ticket has been marked as solved, and as a result, you will no longer receive any further responses. Kindly create a new support ticket to escalate your inquiry. 
b
Yeah I would also like to know
s
Hey guys @BAS014 @Oliver, can you open a new feature request and then share it with me? I'll make sure that this gets to prod.