OpenAI Realtime model voice issues over phone
# support
n
When I set up an assistant using the OpenAI Realtime models, it works fine in a web session, but when I call it the voice comes out at a very low speed and much lower in pitch than it should. Even with speed cranked up to 4, it's still abnormally slow. I've not yet found a setting that fixes it. The voice sounds normal on your own recordings of the call; it's only wrong as it goes over the wire out to the phone. The problem exists both on Vapi numbers and on SIP calls from a Twilio number. With any other model, the OpenAI voices work just fine via telephone. Any suggestions?
v
Hi there! 👋 I understand you're experiencing issues with the OpenAI Realtime model's voice speed and pitch when used over the phone. While I don't have a guaranteed fix for this specific problem, I can share what I know about it and some potential troubleshooting steps.

Based on the information I've found, this is a known issue with the OpenAI Realtime model. Here's what might be happening:

First, it's important to note that the OpenAI Realtime model is still in beta, so some hiccups are expected. According to [user discussions](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1336440478727667844), you're not alone in experiencing these issues!

One key insight: the speed setting may not actually work with the Realtime model. As mentioned in [this thread](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1339276020389253181), "you cannot control the speed of Realtime model and these fields are left over there for backward comparability." This explains why adjusting the speed to 4 doesn't help.

Some potential approaches to try:

1. **Consider using a different model for production**: Since the Realtime model is in beta, you might want to use GPT-4o for production and keep Realtime for development/testing until these issues are resolved.
2. **Check your call logs**: Look for any model switching during calls, which could affect voice characteristics.
3. **Review your configuration settings**: Make sure your assistant's language settings are properly configured, as missing language settings for the voice model have been identified as a potential issue.
4. **Try different voice providers**: Some users have had success switching to different voice providers when experiencing similar issues.

Is there anything specific about these suggestions you'd like me to elaborate on, or do you have other questions about using the OpenAI Realtime model?
n
This is definitely not a problem with the model itself, as it sounds fine in web sessions and call recordings. The problem is somewhere in the path where the audio is passed into the telephony system.
k
Hey! To help track down this issue, could you share the call ID? This would really help us figure out what went wrong!
n
Here's a call ID for a v short call I just placed:
b25ff755-6ed2-4336-8fb1-d0fef5c1a9dc
The call recording will sound normal; the phone call did not. But if you place an actual call to the number pointed at the assistant (phone number 2c6a34c5-9fa5-42f2-9adb-618b0e416bf0), you should hear the problem.
My best guess is that it's a streaming problem. It sounds like the audio is being played out to the PSTN at roughly 2/3 speed.
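Slow speech combined with lower pitch is what you would expect if audio produced at one sample rate were played back as though it had a lower one. As a rough sketch of that arithmetic: the function name `playback_effect` is hypothetical, and the 24 kHz / 16 kHz pair is only one illustrative combination that gives a 2/3 ratio. Nothing in this thread confirms the actual rates or that a rate mismatch is the cause.

```python
import math


def playback_effect(source_rate_hz: int, playback_rate_hz: int) -> tuple[float, float]:
    """Return (speed_factor, pitch_shift_semitones) for audio sampled at
    source_rate_hz but played back as if it were sampled at playback_rate_hz."""
    # Samples are consumed at the playback rate, so speed scales by the ratio.
    speed = playback_rate_hz / source_rate_hz
    # Every frequency scales by the same ratio; 12 semitones per octave.
    semitones = 12 * math.log2(speed)
    return speed, semitones


# Assumed, illustrative rates only: 24 kHz audio played as 16 kHz audio.
speed, semitones = playback_effect(24_000, 16_000)
print(f"speed x{speed:.3f}, pitch {semitones:+.2f} semitones")
# prints: speed x0.667, pitch -7.02 semitones
```

If this were the cause, a speed setting would not fix it, since the pitch drop comes from the same ratio as the slowdown.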
k
Hey Nim, I called your number, and the voice quality is poor. It's a streaming issue through the OpenAI platform. I suggest you use other models for now. The OpenAI Realtime model will improve over time.