Latency Increase
# support
f
I have several assistants based on GPT-4.1 Mini, and I've noticed that since this afternoon the model's latency has increased from 400ms to 770ms. Why is that? Latency has increased significantly for almost all models. What happened? My assistants went from 900ms of total latency to 1370ms!
v
Hey fillipio, I couldn't find enough details to fully answer your question, but I've escalated it to a human agent who can help you out!
k
To reduce GPT-4.1 Mini latency, shorten your prompts and enable streaming if possible.
f
We’re talking about assistants that have been in production for weeks, and no changes have been made to them. I don’t think it’s a prompt issue, but rather an internal one, and as far as I know OpenAI hasn’t modified the latencies of its models.
s
I've noticed the same thing.
p
OpenAI models have never been stable in their response times: the bigger the overall load, the longer the response times. That's why I never used them for voice AI myself. I just ran my test, which sends 50 requests to each model/provider, and OpenAI has the largest stdev: https://cdn.discordapp.com/attachments/1369433726294949888/1369569790603034714/image.png?ex=681c56c3&is=681b0543&hm=7461f991d0d522c246b435bff25d828d8daff80c2202b2b26fa8d61a74b471a0&
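For anyone who wants to run a similar comparison, here is a rough sketch of that kind of test. It is not the script behind the screenshot. The names `measure_latency` and `fake_request` are made up for illustration, and `fake_request` is only a stand-in that sleeps for a millisecond; you would swap in your own function that sends one real request to the model or provider you want to check.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency(send_request: Callable[[], object], runs: int = 50) -> Dict[str, float]:
    """Call send_request `runs` times and summarize wall-clock latency in ms."""
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        send_request()  # one full round trip to the provider
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "mean_ms": statistics.mean(samples_ms),
        # Sample standard deviation; needs at least 2 runs.
        "stdev_ms": statistics.stdev(samples_ms),
        "max_ms": max(samples_ms),
    }


def fake_request() -> None:
    # Hypothetical placeholder: replace with a real API call.
    time.sleep(0.001)


if __name__ == "__main__":
    print(measure_latency(fake_request, runs=50))
```

Running it a few times across the day for each provider gives a mean and a stdev to compare, which should show whether the jump is a one-off or follows overall load.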
k
Hey Fillipio, checking if this is resolved for you.