My number is ...
# support
k
I've probably spent more than 10 hours in vain trying to figure out how to reliably capture phone numbers during a phone call, for example when the customer asks for a callback and provides a phone number. The problem is that the number provided by the customer is usually transcribed correctly by speech-to-text, but is then often reproduced incorrectly, with digits swapped, omitted or invented. For example:

System Prompt: "Your number is 0790990. Please provide it digit by digit."
Bot Message: "My number is zero seven nine zero nine zero."

A nine is missing! I tried out different models, languages and voice providers. It seems to work better in English, but in other languages it is even more prone to errors - especially when two identical or similar digits follow one another. Cartesia seems to me to be particularly susceptible to this, but ElevenLabs is not free from it either. ExampleID: 7b7a621e-0a83-4478-8c0e-9745e887b1e9

I hope there is a trick, a clever prompt suggestion or a setting that makes requesting and recording numbers and/or telephone numbers more reliable - otherwise my assistant will produce too many dissatisfied customers who wait in vain for a callback 😦 ... Please help!

Side thought: I saw that there is the Predefined Function "Dial Keypad". Is there also a "Listen keypad" function?
v
@KarlKing I understand the challenges you're facing with capturing phone numbers accurately during a call. It's crucial to ensure that the numbers are recorded correctly to maintain customer satisfaction. Here are a few suggestions that might help improve the accuracy of capturing phone numbers:

1. **Clear Instructions**: Provide clear instructions to the caller to speak each digit of their phone number slowly and separately. For example, "Please say each digit of your phone number one at a time, with a pause in between."
2. **Confirmation**: After capturing the number, repeat it back to the caller for confirmation. If there's a mistake, ask them to provide the number again.
3. **Digit-by-Digit Confirmation**: Instead of asking for the entire number at once, you could prompt the user to provide one digit at a time and confirm each digit before moving on to the next.
4. **Error Handling**: Implement error handling where the bot asks for a repeat of any digits it is less confident about.
5. **Multi-Language Support**: Ensure that your speech-to-text service is optimized for the languages you're using. Some services may require additional training or configuration for specific languages.

Regarding your side thought, while there isn't a "Listen Keypad" function in the same sense as "Dial Keypad," you can design your voice bot to listen for DTMF tones (the sounds made by pressing the keys on a telephone) if your telephony provider supports it. This can be a more reliable way to capture input, as it doesn't rely on speech recognition.

If you're interested in exploring the DTMF approach or need further assistance with improving the speech-to-text accuracy for phone numbers, please let me know, and I can provide more detailed guidance or check if there are any specific settings or functions available in VAPI that could assist with this issue.
You can continue this conversation by mentioning me in the message.
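For the confirmation step suggested above, one possible sketch is to reduce both the captured number and the bot's read-back to plain digit strings and compare them. The function names `spoken_to_digits` and `read_back_matches` are made up for this example and are not part of Vapi; the word list covers English only.

```python
import re

# English digit words only; other languages would need their own table.
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}

def spoken_to_digits(utterance: str) -> str:
    """Extract the digit sequence from a transcript, skipping other words."""
    digits = []
    # Tokens are either a run of letters or a single numeric character.
    for token in re.findall(r"[a-z]+|\d", utterance.lower()):
        if token.isdigit():
            digits.append(token)
        elif token in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[token])
    return "".join(digits)

def read_back_matches(captured: str, read_back: str) -> bool:
    """True if the read-back contains exactly the captured digit sequence."""
    return spoken_to_digits(captured) == spoken_to_digits(read_back)
```

With the example from the first post, `read_back_matches("0790990", "My number is zero seven nine zero nine zero.")` is `False`, because the read-back reduces to `079090`. This only checks text, so it catches a wrong bot message but not a voice provider that speaks the right text incorrectly.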
t
You need to set up a server URL to receive status-update messages ... In the status-update payload, you will find the calling number in the call section. You can return that info as a system message using the controlURL (which is also in the same payload) .. This way you don't need to ask the customer for the phone number anymore
k
Thank you for the tip! But the problem still exists if, for example, the customer calls from the office and asks to be called back later on a private phone. Or on an outbound call, when the secretary says you can reach the boss on his mobile phone, his number is ... It's not just about the phone number; it could also be a customer number or something similar that is reproduced incorrectly.
t
Well .. in that case, you should focus on finding the right prompts to ensure that the number is correct with a confirmation process ..
k
If the customer input is correct (in my tests it was even stored in the System Prompt), but the output is reproduced in a modified form, then no verification process makes sense.
m
Try turning on Numerals in Deepgram (Vapi's STT): https://developers.deepgram.com/docs/numerals Also, English is preferred for this setting.
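As a rough illustration of what the setting looks like when calling Deepgram directly, the sketch below builds a request URL with the `numerals` query parameter from the linked page. How Vapi exposes this option in its own transcriber settings is not shown in this thread, so treat the helper `build_listen_url` as a hypothetical example rather than Vapi configuration.

```python
from urllib.parse import urlencode

# Deepgram's pre-recorded/streaming transcription endpoint.
DEEPGRAM_LISTEN_URL = "https://api.deepgram.com/v1/listen"

def build_listen_url(language: str = "en", numerals: bool = True) -> str:
    """Build a Deepgram listen URL with the numerals feature toggled."""
    params = {
        "language": language,
        # Query parameters are strings, so send "true"/"false".
        "numerals": str(numerals).lower(),
    }
    return f"{DEEPGRAM_LISTEN_URL}?{urlencode(params)}"

print(build_listen_url())
```

The request itself would still need an API key header and audio data, which are left out here.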
k
Thanks for the tip! Numerals, I didn't know that yet. However, this relates more to speech-to-text optimization - and I suspect the error is more likely to be in the speech output. ElevenLabs tends to speak one digit too few, Cartesia often reverses the last digits or adds digits - although the customer's input was previously understood correctly according to the transcript. This can also be reproduced directly on the Cartesia website - I have attached my settings for this. Have Cartesia say the numbers several times and you will notice, even if you don't speak German, that Cartesia doesn't always speak the same digits. This probably only affects German or other non-English voices, otherwise the problem would have cropped up here a lot more often - or have I just overlooked it so far? https://cdn.discordapp.com/attachments/1291828380295102506/1291957066277326848/grafik.png?ex=6701fc48&is=6700aac8&hm=d81af64797a7f23a8ece004ae5b0e1e162bfc06c8ac411c8f3e7746add0b2607&
s
@KarlKing you can join the Discord OH and get your ticket resolved there.
@KarlKing you can either use `formatPlan.replacement` to replace numbers with words or add hyphens (-) between numbers to get them voiced correctly.
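If you would rather do the replacement yourself before the text reaches the voice provider, a minimal sketch could look like this. It does not use `formatPlan.replacement` (its exact schema is not shown in this thread); `spell_out_digits` is a made-up helper that spells out English digits and joins them with hyphens.

```python
import re

DIGIT_NAMES = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]

def spell_out_digits(text: str, separator: str = " - ") -> str:
    """Replace every run of digits with its digits spelled out one by one."""
    def _spell(match: re.Match) -> str:
        # int(ch) indexes into DIGIT_NAMES, e.g. "7" -> "seven".
        return separator.join(DIGIT_NAMES[int(ch)] for ch in match.group())
    # [0-9] keeps this to ASCII digits so the index lookup is always valid.
    return re.sub(r"[0-9]+", _spell, text)

print(spell_out_digits("Your number is 0790990."))
```

The separator is a parameter so you can try a comma or plain space instead of the hyphen, depending on how the voice handles pauses.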
k
Thanks for the pointers. I have already replaced the numbers with words, and hyphens provide a short pause in speech, but they do not prevent Cartesia from occasionally inventing or twisting digits. I just tested this again and found that this bug also exists with the English-language Cartesia voices. Go to the Cartesia Playground, have Cartesia say the same digit twice, e.g. "3 3" or "three three" or "three - three", and count how many times Cartesia actually says "three". Sometimes twice, as instructed. But sometimes (very often) Cartesia speaks "three" only once or more than twice, so that the input "three three" often turns into a spoken "3" or a "3 3 3" or "3 3 3 3". So the problem is not solved by the suggestions above, and it is more likely to be caused by Cartesia than by Vapi. However, all Vapi/Cartesia conversations are probably affected by this.
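To count this systematically instead of by ear, one option is to transcribe the generated audio again and diff the heard digits against the intended ones. The sketch below assumes you already have both as digit strings; `digit_diff` is a hypothetical helper built on Python's standard `difflib`, and the re-transcription step is not shown.

```python
from difflib import SequenceMatcher

def digit_diff(expected: str, heard: str) -> list[tuple[str, str, str]]:
    """Return (operation, expected_part, heard_part) for each mismatch.

    operation is "delete" (digit dropped), "insert" (digit added)
    or "replace" (digit changed); an empty list means a perfect match.
    """
    # autojunk=False keeps repeated digits from being treated as noise.
    matcher = SequenceMatcher(a=expected, b=heard, autojunk=False)
    return [
        (tag, expected[i1:i2], heard[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

print(digit_diff("0790990", "079090"))
```

Running this over many generations of the same input would give a rough error rate per voice, which may be useful evidence when reporting the issue. Note that a second speech-to-text pass can introduce its own errors, so single mismatches should be checked by listening.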
s
Can you share the call ID along with a timestamp? I can try contacting them.