Transcriber works well on web but not with Twilio
# support
I have created a Vapi agent in Spanish. It uses Deepgram for transcription, Azure/11labs for voice, and Groq/GPT for the model. This works perfectly when tested on the web, and the transcripts read perfectly in the "Calls" section. However, when I make the same call with the exact same settings over the phone, the bot acts very strangely: it interrupts the user mid-sentence, randomly stops talking, repeats things, and ignores the user's messages. This is also reflected in the transcripts, which look something like this (note that these are long calls with many messages exchanged):

- USER: Cut-off message from the user
- AI: Predefined first message
- USER: Continuation of the cut-off message, followed by all the other words the user said in the call, all stitched together in the same message

That's it. No other words from the AI are recorded in the logs apart from the predefined first message (even though it does say some things, just cut off, badly, and out of context, and it almost always repeats itself).

Here are the IDs of some calls:
1d32a5e3-2251-4621-af63-e0da4bf98ec7
2de0dd70-d1e9-49bc-87cb-667ed5ef491d
ca54db9d-065a-45ef-8241-fed4798f9d40

Note that I have tried this with 3 different phone numbers, in a quiet setting, and all of them produce the same result. Any idea how this can be fixed?
Hey @flamaso_, thanks for sharing the issue with us. Would you mind showing me 2-3 examples of actual transcripts from a phone call and one example of a normal web call? Just a short sample of the conversation would give us better context.
Sure thing, here is a normal web transcript.

ID: 24ec19ec-2524-4af4-b058-bbcf02efe9db
AI: Hola, panadería Juancito, mi nombre es Maxi. ¿En qué puedo ayudarte?
User: Hola, Maxi. ¿Qué tienen para ofrecer?
AI: Claro, te cuento lo que tenemos en stock y sus precios. Facturas, trescientos cincuenta pesos los cien gramos. Creyelitos, cuarenta pesos los cien gramos. Si tenés alguna otra consulta o necesitas más información, no dudes en preguntar.
User: ¿Y cómo? Bueno, ¿las facturas están recién salidas del horno?
AI: Sí, lamentablemente, no puedo confirmarte Ya que soy una asistente virtual, pero en nuestra panadería siempre procuramos tener productos frescos Si querés saber más detalle,
User: Bueno, puedo ir a buscar la ahora a las cinco de la tarde?
AI: Sí, podéis pasar a buscarlas a las cinco de la tarde. Nuestro horario de cierre es a las seis, así que estás a tiempo. Te esperamos, Sí,
User: Bueno, Maxi, gracias. Nos vemos.
AI: De nada, Sí. Nos vemos a las cinco entonces. Hasta luego.

And here are some examples from phone calls.

Example 1:
ID: 1d32a5e3-2251-4621-af63-e0da4bf98ec7
User: Hola, Maxi, ¿a qué hora cierran aproximadamente?
AI: Hola Panadería Juancito, mi nombre
User: Bueno, me gustaría comprar un kilo de facturas. Sí, me gustaría un kilo de facturas. Me gustaría comprar mil gramos. Me gustaría comprar cien gramos. Me gustaría comprar trescientos gramos. Me gustaría comprar quinientos gramos. Me gustaría llevar trescientos cincuenta pesos.
(All of these were separate messages that the AI attempted to answer; however, those failed answers do not appear in the audio file.)

Example 2:
ID: 3b99fa28-13a9-42cf-95c6-176e0edad7e3
User: Hola, Maxi, ¿cómo estás? Sí, quería averiguar qué
AI: Hola, panadería Juancito.
User: ofrecen en la panadería. Quería pedir dos kilos de criollitos Claro, claro, pero yo quiero llevar dos kilos de críositos. De la panadería. Sí, me parece perfecto,
AI: Dale genial.
User: Aquí, ¿a qué hora puedo retirarlo, criatito?
Additional notes: what seems to happen is that Vapi resends the whole message history on each response. So if I first ask:

user: What is 2+2?
ai: That is 4
(and then I ask)
user: What is the capital of Russia?
ai: 2+2 is 4, the capital of Russia is Moscow

Here is another example of a failed phone call:

ID: 75d6c7d6-8cc2-4cb8-ae26-6daab090621e
AI: Hola, Panadería Juancito, mi nombre.
User: Hola, Max. ¿A qué hora cierran hoy? Quiero comprar cuatro facturas. Sí, quiero comprar, ¿cuánto me cuánto me saldría?
Forwarded → 0351 510-9822
Hi @flamaso_ , I will forward the call IDs to my team. I will let you know once we have a fix or know why this could be happening
ah yup, this is because:
  "model": "nova-2",
  "language": "es",
  "provider": "deepgram"
for english, deepgram has a finetuned model for phone calling that works well
you could try talkscriber whisper and see if that works better
I have just tried with Whisper. Now the pattern seems to be:

user: What is 2+2?
ai: That is 4
(and then I ask)
user: What is the capital of Russia?
ai: That is 4

Any idea if this can be fixed?
or i guess i won't be able to read it
can you see if you can reproduce in english with it
Google translate
This is the best audio I could get, because it kept transferring the call even though I'm using gpt-4, but here you can see the problem persists in English. ID: 82126d6f-a58c-420c-92b4-f45e60ade7f9 Also, here you can clearly see the difference between the call recording on the Vapi dashboard (it's missing all the bot's audio responses except the first one) and the call recorded directly on my phone.
Any updates regarding this?
Hey @flamaso_ Did you import your phone number into twilio by using the import number button?
Let me check
Yes i did
Thanks for confirming that. I think it may be caused by a lack of proper handling of the Twilio audio on our side. We are following up with Twilio to make sure we get to the root of this.
Great! will wait for updates, thank you
Hey @flamaso_ Would you be able to try again now? The problem will probably still be there, but we added logs on our side to print some information about the Twilio audio. Just try a phone call again and let me know.
Sure thing, here is the id: 7cb20c4a-8ea6-4854-ada7-24461b61647b @sagar_68678_33523
also are you using anything like skype in between to answer your phone call? Or is it just a normal phone call that you receive?
I am just using a normal phone
I have tried with various different ones
@sagar_68678_33523 I came across this post precisely because I'm experiencing similar things with Skype. It doesn't handle some calls very well, and sometimes it doesn't follow the prompt. And this only happens with Skype, haha. Any clue what may be happening?
Anyone experiencing some strange stuff with Skype?
I have exactly the same issue here, works great in the web dashboard but all over the place via Twilio
Any ideas or resolutions?
I will tell you a few things that will significantly improve the performance:
1. Use Vonage
2. Use the nova-2-phonecall STT model (only English is supported)
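A minimal sketch of the second suggestion, assuming the transcriber block keeps the same three keys shown earlier in the thread (`model`, `language`, `provider`). The helper name `pick_transcriber` is hypothetical, and the model string `nova-2-phonecall` is my understanding of Deepgram's identifier for the phone-tuned model, so check it against the Deepgram docs before relying on it. It only illustrates the rule stated here: the phone-tuned model for English, the general `nova-2` model otherwise.

```python
def pick_transcriber(language: str) -> dict:
    """Build a Deepgram transcriber config for the given language code.

    Hypothetical helper for illustration; not part of any Vapi SDK.
    """
    # Assumption: any code starting with "en" (e.g. "en", "en-US") is English.
    is_english = language.lower().startswith("en")
    return {
        "provider": "deepgram",
        # The phone-tuned model is English-only per the thread, so other
        # languages (such as "es") fall back to the general nova-2 model.
        "model": "nova-2-phonecall" if is_english else "nova-2",
        "language": language,
    }


if __name__ == "__main__":
    print(pick_transcriber("es"))
    print(pick_transcriber("en"))
```

For a Spanish agent like the one in this thread, this yields the same `nova-2` / `es` config already in use, which is why the phone-tuned model alone would not help here.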
Thanks, unfortunately I need to use Twilio for the region I am in
Will try the nova-2-phonecall transcriber though
@sagar_68678_33523 any thoughts on the above?