Pronunciation of alphanumeric IDs Vapi AI #support

kev3522

01/16/2025, 8:19 PM

Using 11labs for TTS, I'm not able to reliably get the assistant to pronounce alphanumerics slowly. For example, if I have an alphanumeric like A12B34C56, I want it to spell it out slowly, ideally with a brief pause every 3 characters, so the recipient can understand. On 11labs' dashboard, I'm able to get the speech to do exactly what I need by formatting A12B34C56 as A-1-2...B-3-4...C-5-6. With a Vapi assistant set up with the same 11labs voice and the same config, the assistant behaves completely differently. It'll say A-1-2...B-3-4...C-5-6 as "A minus 1 minus 2 B minus 3 minus..." What's going on here? Any recommendations on how to fix this? I'd also like to know the best way to debug the text being passed on to 11labs. I can see logs of the transcript generated after the speech, but I don't see any logs of the text passed from the LLM to 11labs.

Vapi

01/16/2025, 8:21 PM

@kev3522 It sounds like you're encountering a discrepancy between the way the 11labs TTS engine processes text when used directly through their dashboard versus when it's integrated with the Vapi assistant. This could be due to differences in the way text is formatted or processed before being sent to the TTS engine. To address the pronunciation issue, you might need to adjust the text formatting specifically for the Vapi assistant's use case. Here are a few suggestions:

1. Experiment with different punctuation or spacing to achieve the desired pause and pronunciation without implying subtraction, such as using commas or periods: A, 1, 2. B, 3, 4. C, 5, 6.
2. Use SSML (Speech Synthesis Markup Language) tags if supported by 11labs to control the speech output more granularly, including pauses and phonetic pronunciation.
3. If the Vapi assistant allows for pre-processing of text before sending it to the TTS engine, you could implement a custom formatting function to ensure the text is passed in the desired format.

Regarding debugging, if you're not seeing logs of the text passed from the LLM to 11labs, it's possible that this logging feature is not enabled or not available in the current setup. You may need to:

- Check if there's an option to enable more detailed logging within the Vapi assistant's configuration.
- If detailed logging is not available, consider implementing a logging mechanism that captures the text before it's sent to the TTS engine.

If these suggestions don't resolve the issue, or if you're unsure how to implement them, please reach out to support for more detailed assistance. They may be able to provide specific guidance based on the Vapi assistant's capabilities and configuration options.
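The "custom formatting function" suggested above could look roughly like the following. This is only a minimal sketch: the function name `spell_out_id` and its parameters are hypothetical, and how either output format is actually pronounced depends on the TTS provider and on any formatting applied upstream.

```python
def spell_out_id(identifier: str, group_size: int = 3,
                 char_sep: str = ", ", group_sep: str = ". ",
                 end: str = ".") -> str:
    """Spell an alphanumeric ID one character at a time, in small groups.

    Hypothetical helper: the separators are guesses at what a TTS engine
    will read as short pauses, not a documented 11labs or Vapi convention.
    """
    # Keep only letters and digits so existing dashes or spaces are ignored.
    chars = [c for c in identifier if c.isalnum()]
    # Split the characters into consecutive groups of `group_size`.
    groups = [chars[i:i + group_size] for i in range(0, len(chars), group_size)]
    spoken = group_sep.join(char_sep.join(group) for group in groups)
    # Append the terminator only when there is something to say.
    return spoken + end if spoken else spoken


# Comma/period style from the suggestion above.
print(spell_out_id("A12B34C56"))
# Dash/ellipsis style that worked on the 11labs dashboard.
print(spell_out_id("A12B34C56", char_sep="-", group_sep="...", end=""))
```

The first call produces the comma and period form, the second the dash and ellipsis form from the original question.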

Vapi

01/16/2025, 8:21 PM

You can continue this conversation by mentioning me in the message.

Vapi

01/16/2025, 8:21 PM

Got it. Tagging @User for additional help.

kev3522

01/17/2025, 7:25 PM

I was able to fix this - disabled formatPlan: https://docs.vapi.ai/api-reference/assistants/create#request.body.voice.11labs.chunkPlan.formatPlan
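For anyone landing here later, a rough sketch of what disabling formatPlan might look like in the assistant's voice config. The nesting follows the path in the linked docs (voice, 11labs, chunkPlan, formatPlan); the `enabled` flag and the `voiceId` field are assumptions that should be checked against the API reference, and `build_voice_config` is a hypothetical helper name. The snippet only builds and prints the JSON body; it does not call the API.

```python
import json


def build_voice_config(voice_id: str) -> dict:
    """Build a voice config fragment with formatPlan turned off.

    Assumption: formatPlan accepts an `enabled` boolean. Verify the exact
    field names in the Vapi API reference before using this.
    """
    return {
        "voice": {
            "provider": "11labs",
            "voiceId": voice_id,  # placeholder, use your own voice ID
            "chunkPlan": {
                # With formatting off, text should reach the TTS provider
                # without Vapi rewriting numbers or punctuation first.
                "formatPlan": {"enabled": False},
            },
        }
    }


payload = build_voice_config("YOUR_VOICE_ID")
print(json.dumps(payload, indent=2))
```

The printed JSON would be merged into the body of the create or update assistant request.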
