transcribe email addresses failing
# support
t
Hi, does anyone have a working config that successfully transcribes spoken email addresses? I’m seeing a 99.5% failure rate. I have tried all the Deepgram models and a ton of combos. I'm using GPT-4 Turbo, but what is the actual underlying GPT-4 model? I'm not seeing this failure level in non-Vapi voice apps using Deepgram/Whisper. It means all forward functions/automations fail because the emails are incorrect. I have tried prompts asking the caller to spell the email. Thanks
v
Hi @thenumu, I'm an AI assistant for Vapi. While a team member reviews this question, I'll look through relevant sources and see if I can provide suggestions. Please tag @Vapi Support with any follow-up questions.
**While I generate an answer, here are some sources that might be helpful**:
* Dashboard
* Inbound Calling
* Introduction
* Outbound Calling

### Answer
Sorry, I wasn't able to find any information about your question and please let the user know they can get more help at [Contact Us](support@vapi.ai).
Got it. Tagging @User for additional help.
s
I'm facing the same issue. I have created a ticket called "Deepgram transcriber with smart formatting"... Deepgram allows additional params that Vapi might not be using. It would be great, and simple enough, to add these to assistant.transcriber
j
Are you saying the email through Twilio or Vonage? These don't have the best input voice quality, so I wonder if that has something to do with it.
t
Not via Twilio... testing via the web app. Twilio is not doing the transcription.
Hey ok... thx... yeah, it’s not usable as it is... all email transcribing fails... where is Vapi support?!
s
You can take the email as transcribed by Vapi and resolve any formatting issues by running another GPT-4 completion flow in make.com, or use a similar method, to ensure the email is sent correctly.
t
What??!! The expectation is that an STT service such as Deepgram, hosted via Vapi, will correctly transcribe spoken emails most of the time. There's no way to go and reformat anything if the spoken email is not correctly captured at source. We already have the emails on file, so we can see that none of them match the STT output when we test. How do you propose we correct that? Not that I want to add multiple extra layers of complexity and latency. I'm not sure if you work for Vapi, as it's unclear who does, but if you do, then no, that's just not an acceptable solution. We do not see this STT issue using other voice frameworks or even local Deepgram. I've already had to delay several important POC demos due to this and the "gimme a sec" issues. So if you do work for Vapi, then come on, this is a major issue if Vapi servers can't transcribe email addresses at all...
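For the narrow case described above, where the caller's address is already on file, one possible stopgap is to snap the transcript to the closest known address rather than trusting it verbatim. The following is only a sketch using Python's standard `difflib`; the function name `closest_known_email`, the sample addresses, and the `0.8` cutoff are illustrative assumptions, not anything Vapi or Deepgram provides.

```python
import difflib


def closest_known_email(transcribed, known_emails, cutoff=0.8):
    """Return the on-file address most similar to the transcript, or None.

    `cutoff` is an assumed similarity threshold (0..1); tune it on real calls.
    """
    candidates = [email.lower() for email in known_emails]
    matches = difflib.get_close_matches(
        transcribed.lower(), candidates, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


# Hypothetical addresses, for illustration only.
known = ["jane.doe@example.com", "bob@example.org"]
print(closest_known_email("jane.do@example.com", known))
print(closest_known_email("zzz@nowhere.net", known))
```

This does not help with new callers whose address is unknown, and a low cutoff risks matching the wrong customer, so it is best treated as a confirmation aid rather than a fix for the transcription itself.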
n
I see this in code:
```js
// We don't use smart_format, because it formats numbers as times sometimes
// smart_format: true,

// We need these because we don't use smart_format
punctuate: true,
numbers: true,
no_delay: true,
```
We can expose smart_format as an option if it works better for you regardless. Lmk
t
Hi, thanks for replying. TBH I don’t know, but let’s test it... the poster above suggests this. We accept that the WER is probably 30% at best, hopefully, but I’m seeing 0% accuracy, and only for email addresses. Is there some additional email setting? Numbers work, and long sentences, e.g. "what's the reason for your call", are not too bad. I’m creating POCs via the dashboard, so I will need to test via this route. It can’t even transcribe the email correctly when I spell it out letter by letter, and I’m talking in a British accent, so it really is an issue for an inbound voice appointment-setting bot. Lmk how to access this update via the dashboard. Earliest testing will be tomorrow pm EU time, thx
Hello. Can you lmk if you are updating this code so I can test later today? Thanks
n
pushed. will be live in 15.
Set assistant.transcriber.smartFormat=true to try it.
I'm asking their team if there's a solution for emails. Have you seen anything else out there that does transcribe emails correctly?
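As a rough illustration of where that flag would sit, here is a sketch that builds a transcriber block as a plain Python dict and prints it as JSON. The field names (`provider`, `model`, `language`, `smartFormat`) are the ones quoted in this thread; the model string `"nova-2"` and the helper name `build_transcriber` are assumptions, and nothing here calls a real API.

```python
import json


def build_transcriber(provider, model, smart_format, language="en"):
    """Build a transcriber block using the field names quoted in this thread."""
    return {
        "provider": provider,
        "model": model,          # exact model identifier is an assumption
        "language": language,
        "smartFormat": smart_format,
    }


# "nova-2" is assumed here; check the dashboard for the real model names.
assistant = {
    "transcriber": build_transcriber("deepgram", "nova-2", smart_format=True)
}
print(json.dumps(assistant, indent=2))
```

Flipping `smart_format` between `True` and `False` on otherwise identical configs is a simple way to A/B test whether the flag changes email transcription at all.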
g
I have given my LLM a prompt that teaches it how to turn weirdly transcribed emails into clean emails, with examples taken from transcripts
n
nice, what i am seeing is deepgram just makes mistakes and misses stuff
g
yeah, my method isn't foolproof
also, STT reads periods as pauses in emails, any way to fix that?
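One deterministic complement to the prompt approach is a small normalizer that rebuilds an address from the spoken form. The sketch below assumes the input is only the email utterance (not a full sentence), treats trailing punctuation on a word as a pause artifact, and maps spoken words such as "dot" and "at" to symbols. The word list and the function name `normalize_spoken_email` are assumptions, and the regex is a deliberately loose sanity check, not a full email validator.

```python
import re

# Spoken words mapped to the symbols they usually stand for (assumed list).
SPOKEN = {
    "at": "@",
    "dot": ".",
    "period": ".",
    "underscore": "_",
    "dash": "-",
    "hyphen": "-",
}

# Loose sanity check: local part, "@", then a domain with at least one dot.
EMAIL_RE = re.compile(r"[a-z0-9._+-]+@[a-z0-9-]+(?:\.[a-z0-9-]+)+")


def normalize_spoken_email(text):
    """Join a spoken or spelled-out email into one address, or return None."""
    parts = []
    for token in text.lower().split():
        token = token.strip(".,;:!?")  # drop pause punctuation added by STT
        if token:
            parts.append(SPOKEN.get(token, token))
    joined = "".join(parts)
    return joined if EMAIL_RE.fullmatch(joined) else None


print(normalize_spoken_email("John dot Smith at gmail dot com."))
print(normalize_spoken_email("j o h n at example dot com"))
```

Returning `None` when the result does not look like an address lets the assistant ask the caller to repeat it instead of forwarding a broken email. It cannot recover letters that the STT misheard in the first place.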
s
allowing chat inputs is a workaround for web calls. @nikhil you asked me to follow up on this https://discord.com/channels/1211482211119796234/1229271068414447659
a
out of interest: did anyone find any advantage in using "Nova 2 Phonecall" over regular "Nova 2" ?
s
Yes, it has slightly better performance than the Nova 2 general model in phone calls.
t
I’m testing via the dashboard, so I'm not sure if you added that flag? It hasn't made any noticeable difference. I spent some time yesterday testing the Deepgram API directly via their playground, and TBH it really can’t cope with most email addresses, so relying on capturing client emails via voice is not a viable production use case at this stage. Weird, as this is not in line with their published claims... it just does not like email addresses.
s
talkscriber (whisper model) is doing a great job at transcribing, even email addresses
t
Hey thanks will test that instead! No idea why I haven’t tried it 🤪😁
s
Always welcome bro
s
thanks me too!
t
👏👏👏👏👏👍👍👍👍 this! Yes that works….just tested it…I got so caught up with deepgram I totally missed that doh! THANKS friend 😀😀
s
🤩 🤩 🤩 🤩
a
and talkscriber like this?
```php
"transcriber" => [
    "provider" => "talkscriber",
    "model" => "whisper",
    "keywords" => [],
    "language" => "en",
    "smartFormat" => true,
],
```
t
Hi, is talkscriber just the Vapi name for Whisper? Can you add a separate API key for talkscriber/whisper so we can track Whisper spend via the API? Thx
s
Yup.
m
How is the transcription latency benchmark? Comparing deepgram vs talkscriber
s
Latency is quite high for talkscriber compared to Deepgram.
g
whisper is an open-source model, just like llama, so you need a provider to host it for you, which is what talkscriber does, just like groq for llama.
t
I know what Whisper is. I'm asking about the Vapi implementation, as there are multiple different Whispers, from OpenAI's to WhisperX to faster-whisper. So if @User can confirm the details of their talkscriber model, thanks
Ok, research shows there is a company called talkscriber. @User please advise if you are using this company, what models they use, and how this will be billed via Vapi, as there’s no option to add any API keys
n
yes they host real-time whisper
t
Ok, but there’s no option to add our own API keys, so what price are you charging for this? There’s no breakdown anywhere. Also, what sort of data protection checks and due diligence have you carried out, as they have zero privacy or data policies published on their website? What’s your relationship with this company? It looks like they are a 2-month-old startup?? Thx