Assistant Still Using gpt-3.5-turbo Despite fallbackModels Set to Empty Array
# support
r
Hi Vapi team, we're experiencing an issue where our assistant is making calls to gpt-3.5-turbo even though it is explicitly configured to use gpt-4o, and we've set fallbackModels to an empty array to prevent fallbacks.

Details:
- We have confirmed the assistant is configured with:
```json
{
  "model": {
    "provider": "openai",
    "model": "gpt-4o",
    "fallbackModels": []
  }
}
```
- This configuration was applied via a PATCH request to the Vapi API.
- Despite this, we still observe gpt-3.5-turbo being used in the call logs.
- The issue occurs whether we use our own OpenAI key or the Vapi-provided one.
- When using our own key, we also see gpt-3.5-turbo usage in our OpenAI usage dashboard, confirming that the fallback model is still being called somewhere in the process.

Could you please help us investigate why this fallback is occurring despite the correct configuration? Let us know if you need any assistant IDs or example timestamps from the logs. Thank you!
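For anyone reproducing this, here is a minimal sketch of the update we apply. The `api.vapi.ai/assistant/{id}` endpoint path and the bearer-token header are assumptions based on typical REST conventions; the assistant ID and key are placeholders, and the network call is commented out so the sketch runs offline:

```python
import json

# Placeholder value; substitute your real assistant ID.
ASSISTANT_ID = "YOUR_ASSISTANT_ID"

# The model block we PATCH onto the assistant.
payload = {
    "model": {
        "provider": "openai",
        "model": "gpt-4o",
        "fallbackModels": [],  # empty list: intended to disable any fallback
    }
}

# The actual request (commented out so this sketch runs offline;
# endpoint path and auth header are assumptions):
# import requests
# resp = requests.patch(
#     f"https://api.vapi.ai/assistant/{ASSISTANT_ID}",
#     headers={"Authorization": "Bearer YOUR_VAPI_PRIVATE_KEY"},
#     json=payload,
# )
# resp.raise_for_status()

print(json.dumps(payload, indent=2))
```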
v
According to Vapi's documentation, even when `fallbackModels` is set to an empty array, the system will still "look for the current model's replicate and get all models of the same type" ([Discord source](https://discord.com/channels/1211482211119796234/1323288623629471866)). This behavior is by design, though it may not be what you expect.

Several factors could trigger this fallback:
- Your OpenAI key might lack access to gpt-4o
- The request might exceed the maximum payload size
- The gpt-4o model might be experiencing instability

Try checking your call logs for specific error messages that might indicate which of these issues is occurring.

Source:
- [Discord source](https://discord.com/channels/1211482211119796234/1323288623629471866)
r
@Vapi Support Bot Could you please elaborate further on "look for the current model's replicate and get all models of the same type"?
v
Based on the information I found, Vapi categorizes models into types like "Large Language Models" (which includes OpenAI models), and may consider different versions or providers as "same type" models.

To better understand your specific situation with gpt-4o falling back to gpt-3.5-turbo: could you share any error messages from your call logs that appear when these fallbacks occur? This would help identify which of the potential triggers (key access issues, payload size, or model instability) might be causing the fallback despite your empty fallbackModels array.

Source:
- [Discord source](https://discord.com/channels/1211482211119796234/1323288623629471866)
s
Hey! To help track down this issue, could you share:
- The call ID

This would really help us figure out what went wrong!
r
@Shubham Bajaj Here's my call ID: 3ec07cc0-f972-441b-9601-3a607ea343df. In the call logs, you can see it reverts to gpt-35-turbo-1106.
I've noticed as well, both in the UI and when I make a GET request for the Vapi assistant ID, that both models are set to llama-3.3-70b-versatile, but the call logs still show: Model request started (attempt #1, gpt-35-turbo-1106, azure-openai, westus)
Another thing I've noticed: if it's a web call, the correct GPT model is used when I have it set to 4.1, but when I call over the phone it always reverts to 3.5-turbo.
@User Can any of you help me with the issue I have described so far in this thread? I feel as though this may be an account-specific bug/issue.
s
@RussianRambo Hey, you have set your model to GPT-3.5 Turbo, not LLaMA 3.3 70B Versatile, so that's why requests are going to the 3.5 Turbo model https://cdn.discordapp.com/attachments/1364845338468548710/1365014471332073494/image.png?ex=680bc44a&is=680a72ca&hm=91cdcee7e2deb0250090ae541a1c23d4e1d655ecb54501f94681ed3cbeb73ed1&
r
@Shubham Bajaj That's strange, because the Vapi UI shows gpt-4.1, and when I did a GET request through Postman it also shows model gpt-4.1. I don't even see gpt-3.5-turbo as a model option in the Vapi UI.
I also noticed that when I do a web call, it uses the correct model I chose in the Vapi UI.
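A quick way to automate the check the Postman GET performs is to fetch the assistant config and compare the reported model against what the dashboard shows. The endpoint path below is an assumption, and the network call is commented out so the sketch runs offline against a stubbed response shaped like the config described above:

```python
# Fetch the assistant and confirm the configured model matches the dashboard.
# The actual request (commented out; endpoint path is an assumption):
# import requests
# config = requests.get(
#     "https://api.vapi.ai/assistant/YOUR_ASSISTANT_ID",
#     headers={"Authorization": "Bearer YOUR_VAPI_PRIVATE_KEY"},
# ).json()

# Stubbed response, shaped like the config reported in this thread:
config = {"model": {"provider": "openai", "model": "gpt-4.1", "fallbackModels": []}}

expected = "gpt-4.1"  # what the Vapi UI shows
actual = config["model"]["model"]
if actual != expected:
    print(f"MISMATCH: dashboard shows {expected}, API returned {actual}")
else:
    print(f"OK: API config matches ({actual})")
```

If the GET result matches the UI but the call logs still show a different model, that points at something downstream of the stored config (e.g. the telephony path), which is consistent with the web-call vs. phone-call difference noted above.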
s
Can you share the most recent call ID?
k
@RussianRambo