Hungarian Agent
# support
k
We have an issue with the Hungarian language, specifically with the pronunciation of the word "vagyok" in the phrase: "Üdvözöljük a GIULIA-nál! Anna vagyok. Ez a hívás rögzítésre kerülhet. Segíthetek Önnek valamiben?" When played in 11Labs, it sounds like Recording - https://drive.google.com/file/d/1tc1f1IX_S-AJFiyWUfstiqar4-a_yt4n/view?usp=sharing, where the "g" in "vagyok" is "swallowed" (not fully pronounced). In VAPI, it sounds like Recording - https://drive.google.com/file/d/1_LHPCMZWzc6-mdo6jFIWPoy2b9yr3_La/view?usp=sharing, where the "g" is pronounced. This is just one example. Overall, the pronunciation differs throughout the entire phrase. Among other issues, the "S" sound is often pronounced like the American "S". The first version (Recording 1) is correct. How can we ensure the correct pronunciation?
Model: Eleven Flash v2.5
Voice ID: TumdjBNWanlT3ysvclWh
Thank you in advance for your help!
k
Hey! To help track down this issue, could you share the call ID? This would really help us figure out what went wrong!
k
Hi! Example here: Vapi id: a81637d9-ea97-4669-b698-51fda650b1e9
k
Hello Karina, can you try using Turbo 2.0 or 2.5? Also let me know if it reoccurs or just happened this one time.
k
Apologies, but I don’t think this will help—it’s a persistent issue that’s easy to reproduce. The same voices sound different on your platform and in 11Labs. I’ve attached all the recordings in the ticket—please take a look.
s
@Karina Krinitskaya looking into it.
k
@madoka6232
update
I'm checking with the 11labs team and will let you know any updates I receive. cc: @mason
@madoka6232 I checked with 11labs, and here is their response (for fair and transparent support, I am sharing their message as is): https://cdn.discordapp.com/attachments/1346436410495729664/1351230842709741600/Screenshot_2025-03-17_at_9.57.12_PM.png?ex=67d99f49&is=67d84dc9&hm=9fc070acdeb92b596fc1395febc4b85843c7f671fc530638d853517473925693&
k
I changed the model, but nothing changed :(
eleven_turbo_v2_5
Vapi call id: 819db5a4-14d1-4c5e-b68e-550b34111504
https://storage.vapi.ai/819db5a4-14d1-4c5e-b68e-550b34111504-1742303734648-8c5a8335-ed5c-47ab-8dc7-791284af7745-stereo.wav
eleven_flash_v2_5
Vapi call id: 429aa4a1-c83b-4119-b2f9-d9d018becf99
https://storage.vapi.ai/429aa4a1-c83b-4119-b2f9-d9d018becf99-1742304031467-0560251e-84fa-462d-8c13-2099597fb2e8-stereo.wav
The issue is that everything sounds correct on 11Labs, but the problem appears after it gets to the Vapi platform.
s
@Karina Krinitskaya For the shared call IDs, I see that the multilingual model is being used (as you can see in the log below). I suggest making more calls with the Turbo and Flash models.
logs
🔵 13:20:19:411 ElevenLabs (WebSocket #1) Connecting... URL: wss://api.elevenlabs.io/v1/text-to-speech/TumdjBNWanlT3ysvclWh/stream-input?output_format=pcm_16000&optimize_streaming_latency=3&model_id=eleven_multilingual_v2&enable_logging=false&enable_ssml_parsing=true
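A quick way to confirm which model a call actually requested is to read the `model_id` query parameter from the logged WebSocket URL. The snippet below is only a minimal sketch using the Python standard library; the helper name `tts_model_from_url` is hypothetical, and the URL is copied from the log line above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def tts_model_from_url(url: str) -> Optional[str]:
    """Return the model_id query parameter of a logged TTS URL, or None if absent."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("model_id")
    return values[0] if values else None


# URL taken verbatim from the log line above, split only for readability.
LOGGED_URL = (
    "wss://api.elevenlabs.io/v1/text-to-speech/TumdjBNWanlT3ysvclWh/stream-input"
    "?output_format=pcm_16000&optimize_streaming_latency=3"
    "&model_id=eleven_multilingual_v2&enable_logging=false&enable_ssml_parsing=true"
)

if __name__ == "__main__":
    print(tts_model_from_url(LOGGED_URL))
```

For the log line above this yields `eleven_multilingual_v2`, which differs from the Flash model shown on the dashboard.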
k
Hi. If, according to your logs, the model is silently changed to Multilingual during the call, why is this not reflected on your dashboard? The dashboard shows the Flash model.
k
After going through the logs again and looking at your model config, I can see that something could be going wrong here. I'll raise your ticket's priority and set its status to under investigation. I'll get back to you with a resolution.
Karina, I looked into both of the shared call IDs. The assistant ID creation time, end reason, and model recorded for these calls all indicate that a multilingual model was in use. The data logs also mention a multilingual model. I then searched the logs for update-assistant requests and found requests made after the calls, on March 18th and 19th, that updated the assistant to the Turbo v2.5 model. At the time these calls were made, however, the assistants were still using a multilingual model.
The screenshots show this in order: first, when the assistant was updated; next, the start times of the two calls in your time zone; then the cost of the two calls, attributed to the multilingual model; and finally the assistant query returning the Turbo model.
So, after going through all of the logs, the conclusion is that when the calls were made the assistant was configured with the multilingual model, not the Turbo or Flash model. I suggest making 2-3 more test calls with the Turbo or Flash model and letting me know how they sound. Happy to help. Your ticket is still open with me, and I believe the same thing is happening with your other ticket as well.
https://cdn.discordapp.com/attachments/1346436410495729664/1353707397650059274/Screenshot_2025-03-24_at_5.56.41_PM.png?ex=67e2a1c2&is=67e15042&hm=7a17e9f28100c4b17b50bcd41df27ee8f1dc9288dbb0dcdabb7f7e5f560b2b7f& https://cdn.discordapp.com/attachments/1346436410495729664/1353707397868290078/Screenshot_2025-03-24_at_5.56.58_PM.png?ex=67e2a1c2&is=67e15042&hm=28477bd2d9eb32ee02b9f04d16a48b97a483e1ef607094ca7bea48d42885b83c& https://cdn.discordapp.com/attachments/1346436410495729664/1353707398069747753/Screenshot_2025-03-24_at_5.57.10_PM.png?ex=67e2a1c2&is=67e15042&hm=292a29e2d302ad22647f3c7be3a937716462f7444db40c00c474b813c57f2f3f& https://cdn.discordapp.com/attachments/1346436410495729664/1353707398375673917/Screenshot_2025-03-24_at_5.57.31_PM.png?ex=67e2a1c2&is=67e15042&hm=235906a085800e9d86820e7f904be27936d91292c500b6e1928ea01f397d49cf& https://cdn.discordapp.com/attachments/1346436410495729664/1353707398661013566/Screenshot_2025-03-24_at_5.57.51_PM.png?ex=67e2a1c2&is=67e15042&hm=2e10dd16d6275c4c523eefa2bc52a839bf3bfcc6ceedbcefaaec0858730b3345&
If you require it, I can share the complete logs in DM with you.
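The timeline reasoning above (which model was configured at the moment a call started, given later assistant updates) can be sketched roughly as follows. This is only an illustration: the function name `model_at`, the data layout, and every timestamp are hypothetical placeholders, not values taken from the actual logs.

```python
from datetime import datetime, timezone
from typing import List, Tuple

Update = Tuple[datetime, str]  # (time the assistant was updated, model set by that update)


def model_at(call_start: datetime, updates: List[Update], initial_model: str) -> str:
    """Return the model in effect at call_start, applying updates in time order."""
    current = initial_model
    for when, model in sorted(updates):
        if when <= call_start:
            current = model
    return current


# Hypothetical timestamps, chosen only to illustrate "update happened after the call".
UTC = timezone.utc
updates = [(datetime(2025, 3, 18, 15, 0, tzinfo=UTC), "eleven_turbo_v2_5")]
call_start = datetime(2025, 3, 18, 13, 0, tzinfo=UTC)

if __name__ == "__main__":
    print(model_at(call_start, updates, initial_model="eleven_multilingual_v2"))
```

With these placeholder values the call predates the update, so the function returns the initial multilingual model; a call started after the update would return the Turbo model instead.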