Problems with the Norwegian accent Vapi AI #support

Karina Krinitskaya

03/19/2025, 12:31 PM

Recently, I wrote about an issue with Hungarian, and now we have a problem with the pronunciation of Norwegian.

When listening to the phrase on 11labs, everything sounds good, with clear pronunciation of all words. Recording: [Google Drive Link](https://drive.google.com/file/d/1bKNgZ_2CADIOdLhKwHnS0DJ2kB7WMPmj/view?usp=sharing)

However, on a Vapi call with the same model (Eleven Flash v2.5), we hear something entirely different, especially at the end of the phrase. Recording: [Vapi Call Recording](https://storage.vapi.ai/f1c068ea-4677-4ec2-9447-d329452af9cb-1742387009899-ebac2a96-3144-4b47-a878-fc2102e9bf03-stereo.wav)

- Vapi call ID: f1c068ea-4677-4ec2-9447-d329452af9cb
- Model: Eleven Flash v2.5 (we also tried Turbo 2.5, no difference)
- Voice ID: b3jcIbyC3BSnaRu8avEk - Bergen
- Test phrase: Hei! Takk for at du ringer Sorrissi, jeg heter Erik. Hvordan kan jeg hjelpe deg i dag?

We need Norwegian without a Danish accent, as heard in the 11labs recording. Why does it sound correct on 11labs but not on the platform? Please help, this is a serious issue.

I previously encountered a similar problem related to the Hungarian language: https://discord.com/channels/1211482211119796234/1346436410495729664

Shubham Bajaj

03/20/2025, 1:20 PM

@Karina Krinitskaya Sorry for your experience. I have shared your Google Drive recording, the VAPI call recording, and the other details with the 11labs team. FYI, for this call you were using a multilingual model, not the Flash model, and your assistant's current 11labs model is Turbo.

Sarthak

03/20/2025, 1:35 PM

Sharing the response from the Inlama Labs team here as is, for fair consideration and transparency. https://cdn.discordapp.com/attachments/1351895867648507954/1352274345929605231/Screenshot_2025-03-20_at_7.04.50_PM.png?ex=67dd6b20&is=67dc19a0&hm=875d964814b026faa9b5567e8cb847fca29a9bb324557941dccca1a9885b591c&

Shubham Bajaj

03/20/2025, 1:35 PM

Eleven Labs*

Karina Krinitskaya

03/21/2025, 1:07 PM

Thank you for your response, but I am using Eleven Flash v2.5, as clearly shown here:

Assistant ID: 7de2b370-01c3-4bd7-9af5-cfc7e92c338b

Here is the result of the GET request, where we can clearly see that the model used is **Eleven Flash v2.5**:

- Screenshot: https://drive.google.com/file/d/1DSdx4JnzcFsd2hqRoaHi33Yx8Vazme7M/view?usp=sharing
- VAPI Call ID: d5a4c538-161f-4df3-8d37-bc345690c78f
- Recording: https://drive.google.com/file/d/1aXNrqTzNN3fTPdo1y5whuwRywdjxmJ8I/view?usp=sharing

Another issue has been noticed: when creating an assistant on the VAPI platform with the same voice settings, it sounds different from the one we use with the custom LLM (as heard in the recording above).

Assistant ID: 40c478f8-5596-4b85-9472-c521494af84a

Here is the voice recording of the assistant I created from scratch directly on the VAPI platform: https://drive.google.com/file/d/19w4okw55tIKOMhfCfjC2Omxcb-Dn_eIX/view?usp=sharing

The second recording is better, but it still does not match the reference recording (attached to the ticket).

Sarthak

03/23/2025, 7:20 AM

Hey Karina, for the shared call ID, the model that was used is a multilingual model. Below are the logs of what is being sent to the 11labs stream, so you can take a look. It's not the Flash model, and they have also confirmed the same.

🔵 12:42:31:876 ElevenLabs (WebSocket #0) Connecting... URL: wss://api.elevenlabs.io/v1/text-to-speech/b3jcIbyC3BSnaRu8avEk/stream-input?output_format=pcm_16000&optimize_streaming_latency=3&model_id=eleven_multilingual_v2&enable_logging=false&enable_ssml_parsing=true

🔵 12:42:32:096 Getting Speech For 11labs:b3jcIbyC3BSnaRu8avEk:0.5:0.75:undefined:undefined:undefined:eleven_multilingual_v2:16000:c3273a0022fe23fb110bd32a214f97b391b12bfbeeed1a6676b36ea774e0199d, "88" Text: ""Hei! Takk for at du ringer Sorrissi, jeg heter Erik. Hvordan kan jeg hjelpe deg i dag?""
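The model that was actually requested can be read from the `model_id` query parameter of the WebSocket URL in the log above. A minimal Python sketch of that check, using only the standard library (the helper name `requested_model` is made up for illustration):

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def requested_model(ws_url: str) -> Optional[str]:
    """Return the model_id query parameter of a TTS WebSocket URL, if present."""
    query = parse_qs(urlsplit(ws_url).query)
    values = query.get("model_id")
    return values[0] if values else None


# URL copied from the log line in this thread.
LOGGED_URL = (
    "wss://api.elevenlabs.io/v1/text-to-speech/b3jcIbyC3BSnaRu8avEk/stream-input"
    "?output_format=pcm_16000&optimize_streaming_latency=3"
    "&model_id=eleven_multilingual_v2&enable_logging=false&enable_ssml_parsing=true"
)

print(requested_model(LOGGED_URL))  # eleven_multilingual_v2
```

Comparing this value with the `voice.model` field of the assistant config shows whether the configured model and the model used on the call agree.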

Sarthak

03/23/2025, 7:25 AM

For fair consideration, I'm keeping your ticket open under high priority for now to see whether the issue is on our side, on the 11labs side, or on your side. I am considering all three options. https://cdn.discordapp.com/attachments/1351895867648507954/1353268266318954566/Screenshot_2025-03-23_at_12.53.15_PM.png?ex=67e108c9&is=67dfb749&hm=80022e019f9bdf2b8c9e8e6fe607214649c0993b750869a2a76cfa62daee7e3a&

Sarthak

03/24/2025, 12:57 PM

Hey Karina, As we discussed in your other ticket, at the time the call was placed, the model being used was a multilingual model. Do you want me to continue the process over here as well? Or would you prefer me to share the complete logs? I'm happy to share it over here as well if required.

FYI, try making 2-3 test calls. It should use the updated 11labs voice model.

august

03/25/2025, 12:55 PM

@Shubham Bajaj Hey, let me jump into the conversation. Karina and I double-checked this case. A call with call_id `e4eefb33-aeca-4cee-ad2f-6299aeb12a30` was made using the `7de2b370-01c3-4bd7-9af5-cfc7e92c338b` assistant directly on VAPI. The assistant's settings explicitly specify the `Eleven Flash V2.5` model, and I can also retrieve this model via the request (we're also taking into account that information on your platform is often displayed incorrectly):

GET https://api.vapi.ai/assistant/7de2b370-01c3-4bd7-9af5-cfc7e92c338b

200 OK
Response body: {
    "id": "7de2b370-01c3-4bd7-9af5-cfc7e92c338b",
    ...
    "name": "vapi_caller",
    "voice": {
        "model": "eleven_flash_v2_5",
        "voiceId": "b3jcIbyC3BSnaRu8avEk",
        "provider": "11labs",
        "stability": 0.5,
        "similarityBoost": 0.75,
        "enableSsmlParsing": true,
        "inputReformattingEnabled": false,
        "inputPreprocessingEnabled": true
    },
   ...
    "firstMessage": "Hei! Takk for at du ringer Sorrissi, jeg heter Erik. Hvordan kan jeg hjelpe deg i dag?",
    ...
}

I understand that the logs on your side show something different, but we’re also unable to get the expected result through your platform. So clearly, there’s some issue on your end — please look into this matter as soon as possible.
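For anyone who wants to reproduce this comparison, here is a rough Python sketch. It assumes the `GET https://api.vapi.ai/assistant/{id}` endpoint shown above accepts a Bearer token in the `Authorization` header (an assumption, check the Vapi API reference), and the helper names are made up for illustration. The sample dict mirrors only the `voice` fields from the response above.

```python
import json
import urllib.request
from typing import Optional

VAPI_BASE = "https://api.vapi.ai"


def fetch_assistant(assistant_id: str, api_key: str) -> dict:
    """Fetch the assistant config. Bearer auth is an assumption, not confirmed in this thread."""
    req = urllib.request.Request(
        f"{VAPI_BASE}/assistant/{assistant_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def voice_model_mismatch(assistant: dict, model_in_logs: str) -> Optional[str]:
    """Return a message if the configured voice model differs from the one seen in call logs."""
    configured = assistant.get("voice", {}).get("model")
    if configured != model_in_logs:
        return f"configured {configured!r} but call used {model_in_logs!r}"
    return None


# Offline example using the voice fields from the response in this thread.
sample = {
    "voice": {
        "model": "eleven_flash_v2_5",
        "voiceId": "b3jcIbyC3BSnaRu8avEk",
        "provider": "11labs",
    }
}
print(voice_model_mismatch(sample, "eleven_multilingual_v2"))
```

With a real API key, `fetch_assistant("7de2b370-01c3-4bd7-9af5-cfc7e92c338b", api_key)` would replace the `sample` dict.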

Shubham Bajaj

03/26/2025, 9:50 PM

@august @Karina Krinitskaya I just reviewed the shared call ID. Your current model config is the Flash model, but it's falling back to Multilingual, which is why Multilingual is being used during the call. I'm checking with the team why it's changing to Multilingual.

Shubham Bajaj

03/26/2025, 9:51 PM

My guess is that it's because of the language: the Flash model may not support it. I'm still waiting for the team to respond.

Karina Krinitskaya

03/31/2025, 5:58 PM

Hi! Sorry, we are still waiting for your response. :)🐣

Sarthak

03/31/2025, 7:54 PM

Hey Karina, sorry for taking so long to get back to you. The issue has been fixed. There was a bug: because of legacy code, the call was falling back to the multilingual model, which is why the logs showed you using the multilingual model even though the configured model was the Flash model. Earlier, when the Flash model was released, it only worked with English, so we added a check to make sure users never ran into call failures. We have now removed that check, so you can start using the Flash model.
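To make the described behaviour concrete, here is a purely illustrative Python sketch of what such a legacy fallback could look like. This is not Vapi's code: the function, the constants, and the language-based condition are all assumptions based on the explanation above.

```python
# Hypothetical: models the legacy check treated as English-only.
ENGLISH_ONLY_MODELS = {"eleven_flash_v2_5"}
# Model seen in the call logs when the fallback kicked in.
FALLBACK_MODEL = "eleven_multilingual_v2"


def resolve_tts_model(configured: str, language: str, legacy_check: bool) -> str:
    """Pick the model sent to the TTS provider (illustrative only)."""
    is_english = language.lower().startswith("en")
    if legacy_check and configured in ENGLISH_ONLY_MODELS and not is_english:
        # Legacy behaviour: silently swap the model to avoid a failed call.
        return FALLBACK_MODEL
    return configured


# With the check in place, a Norwegian call is silently moved off Flash.
print(resolve_tts_model("eleven_flash_v2_5", "no", legacy_check=True))
# With the check removed, the configured model is used as-is.
print(resolve_tts_model("eleven_flash_v2_5", "no", legacy_check=False))
```

A silent swap like this explains why the assistant config and the call logs disagreed: the config was never changed, only the model chosen at call time.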

august

04/01/2025, 9:57 AM

@Shubham Bajaj Double-checked — it’s working now, thank you!

Karina Krinitskaya

04/01/2025, 10:18 AM

@Shubham Bajaj Thank you so much! \[T]/

Sarthak

04/01/2025, 8:38 PM

Marking this ticket as Solved ✅
