# general-english
j
What transcription and voice models are you all using? We used the default PlayHT voice, jennifer, for a while and loved it, until we found we had no way to improve her pronunciation (dollar amounts, years, acronyms). We've since moved to OpenAI voices (echo, shimmer), but they feel super robotic and slow, even when using the `instructions` attribute available in the Vapi API for OpenAI voices. We've been considering making the jump from Vapi to 11Labs, since pronunciation control and latency seem to be much better there (yes, I know it's because they can use an STS model instead of orchestrating all three parts the way Vapi does).