OpenAI STT
# support
s
The new OpenAI STT is not usable at all; the transcription is way off. Here is an example: ID: 82723087-c6b1-4dd2-9e46-9e6ed487d66f https://cdn.discordapp.com/attachments/1353832849316515902/1353832849723232346/image.png?ex=67e31698&is=67e1c518&hm=33557842f8e6c9c6d39c1897eca142f7503d248b11a4379ba7e6544ab0c4200a&
v
Hi there! 👋 I understand your frustration with the OpenAI STT transcription accuracy.

I'm unable to look up the specific call ID you've shared, but I can definitely help with the OpenAI STT transcription issues you're experiencing. The example you provided where the AI was transcribed as "I'm your voz fun assistant" is a clear indication of the transcription problems.

According to multiple user reports, the new OpenAI STT has indeed been showing significant transcription inaccuracies. This is a known issue that has been reported by several users who have found that ["the new OpenAI STT is not usable at all, the transcribe is way off"](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1327481613004701779).

Several factors can affect OpenAI STT performance:
- Model issues with specific models like `gpt-4o-realtime-preview`
- Timeout constraints causing fallbacks to less accurate models
- Configuration issues with language settings
- Speech endpointing problems (detecting start/end of speech)

Here are some troubleshooting steps you might try:
1. Consider switching to the GPT-4o cluster instead of the realtime cluster, as [the realtime cluster is considered unstable by OpenAI](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1337465970557452299)
2. Adjust endpointing settings to give more time for speech processing
3. Check your audio quality and environment settings

If you're still experiencing issues, many users have found success by switching to alternative STT providers such as:
- Deepgram, which offers [high accuracy transcription using end-to-end deep learning models](https://docs.vapi.ai/providers/transcriber/deepgram)
- Talkscriber, which boasts [industry-leading transcription accuracy with a Word Error Rate under 4%](https://docs.vapi.ai/providers/transcriber/talkscriber)
- AssemblyAI, offering [up to 95% accuracy and 30% less hallucinations than other leaders in the space](https://docs.vapi.ai/providers/transcriber/assembly-ai)

Would you like more specific information about implementing any of these alternative STT solutions?

Sources:
- [Discord Thread about OpenAI STT issues](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1327481613004701779)
- [Discord Thread about GPT-4o vs realtime cluster](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1337465970557452299)
- [Deepgram Documentation](https://docs.vapi.ai/providers/transcriber/deepgram)
- [Talkscriber Documentation](https://docs.vapi.ai/providers/transcriber/talkscriber)
- [AssemblyAI Documentation](https://docs.vapi.ai/providers/transcriber/assembly-ai)
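
If you want to try the provider switch suggested above, here is a minimal sketch in Python of the transcriber settings you might send when updating an assistant. Treat it as an illustration only: the field names (`transcriber`, `provider`, `model`, `language`) and the default model names are assumptions, not a confirmed schema, and `build_transcriber_update` is a hypothetical helper. Please check the provider documentation linked above before relying on any of it.

```python
import json
from typing import Optional

# Assumed default model per provider; these names are illustrative
# and should be verified against the provider documentation.
ASSUMED_DEFAULT_MODELS = {
    "deepgram": "nova-2",
    "openai": "gpt-4o-transcribe",
}


def build_transcriber_update(
    provider: str,
    language: str = "en",
    model: Optional[str] = None,
) -> dict:
    """Build a request body that swaps the transcriber (assumed schema)."""
    provider = provider.strip().lower()
    if not provider:
        raise ValueError("provider must not be empty")

    transcriber = {"provider": provider, "language": language}

    # Use the explicit model if given, else fall back to the assumed default.
    chosen_model = model or ASSUMED_DEFAULT_MODELS.get(provider)
    if chosen_model:
        transcriber["model"] = chosen_model

    return {"transcriber": transcriber}


if __name__ == "__main__":
    # Print the body you would send when switching to Deepgram.
    print(json.dumps(build_transcriber_update("deepgram"), indent=2))
```

Keeping the payload in one small function makes it easy to flip between `openai` and `deepgram` on test calls and compare the transcripts side by side.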
s
I am referring to gpt-4o-mini-transcribe and gpt-4o-transcribe.
@User do you encounter the same issue as well?
v
Hey! To help track down this issue, could you share:
- When exactly this happened (the timestamp)
- What response you expected to get
- What response you actually got instead

This would really help us figure out what went wrong!
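
If it helps to put a number on the gap between the expected and the actual transcript, here is a minimal sketch in Python that computes the word error rate (WER) between the two. It assumes a simple lowercase, whitespace-based tokenization, which is cruder than what STT vendors use when they publish their own figures, and `word_error_rate` is just a hypothetical helper name. The sample sentences are made up for illustration.

```python
def word_error_rate(expected: str, actual: str) -> float:
    """Word-level edit distance divided by the number of expected words."""
    ref = expected.lower().split()
    hyp = actual.lower().split()
    if not ref:
        raise ValueError("expected transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (one fewer than the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example transcripts, not taken from a real call.
    expected = "please book a table for two"
    actual = "please look a table for you"
    print(f"WER: {word_error_rate(expected, actual):.2f}")
```

Running the same audio through two transcribers and comparing their WER against a hand-written reference gives a more concrete comparison than eyeballing the transcripts.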
s
v
smoothie, you can give another call a try using Deepgram. There is definitely some issue with OpenAI's STT as of now. The Deepgram transcriber should give you the expected output, as they are the best in the market.
s
Yeah, I was just doing some testing on the new models. I'm not so sure Deepgram is the best, though. I was using another tool to transcribe a meeting audio file, and 11labs and gpt-4o-mini were way better than Deepgram.
v
Okay, in case you need further help, please let me know.