Tedros
11/16/2024, 9:40 AM

transcriber: {
  model: "whisper",
  mode: "transcribe", // set mode to transcribe rather than the default of translate
}
I also tried the configuration below, which uses hidden parameters, but it behaved exactly the same:
transcriber: {
  model: "whisper",
  mode: "transcribe", // Keeps the transcript in the original language
  language: null, // Ensures auto-detection
  multilingual: true // Enables multiple languages
}
I found the following documentation for TalkScriber’s Whisper model that confirms the purpose of the mode parameter:
parser.add_argument(
    "--task", type=str, default="transcribe",
    choices=["transcribe", "translate"],
    help="whether to perform X->X speech recognition ('transcribe') or X->English translation ('translate')",
)
parser.add_argument(
    "--language", type=str, default=None,
    choices=sorted(LANGUAGES.keys()) + sorted([k.title() for k in TO_LANGUAGE_CODE.keys()]),
    help="language spoken in the audio, specify None to perform language detection",
)
This documentation suggests:
• Transcribe: Transcribes audio in the detected or selected language.
• Translate: Translates audio from the detected or selected language to English.
Based on this, “transcribe” mode should preserve the spoken language, but in practice:
• When I speak in Spanish without explicitly setting a language, the transcript starts in Spanish for a couple of seconds and then switches to English.
• Explicitly setting the language parameter does keep the transcript in the spoken language, but that defeats the purpose of auto-detection and of "transcribe" mode.
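For reference, the workaround I described (explicitly pinning the language) looks like the sketch below. This is a hedged example, not an official recommendation: "es" is an assumed Spanish language code, and the surrounding assistantConfig wrapper is illustrative.

```javascript
// Hypothetical workaround sketch: pin the language explicitly so the
// transcript cannot drift into English. Sacrifices auto-detection.
const assistantConfig = {
  transcriber: {
    model: "whisper",
    mode: "transcribe", // keep transcript in the original language
    language: "es", // assumed Spanish code; replace per deployment
  },
};

console.log(assistantConfig.transcriber.language);
```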
It seems the issue could be caused by one of the following:
1. A default language (possibly English) in VAPI’s implementation overriding “transcribe” mode.
2. The mode parameter being ignored or incorrectly configured.
Would it be possible to:
1. Remove any default language for “transcribe” mode or make language a required parameter to ensure accurate behavior?
2. Confirm that the mode parameter is functioning as intended to preserve the spoken language without auto-translating to English?
This issue is critical for us, as preserving the original language is essential for our multilingual applications. Let me know if you need any additional details, and thank you so much for your support!

Vapi
11/16/2024, 9:43 AM

Vapi
11/16/2024, 9:43 AM

Shubham Bajaj
11/18/2024, 1:12 PM

They haven't provided any specific details about the issue or an ETA yet, but I'll do my best to get an update from them. After reading through your emails, I can truly understand the frustration you're facing. While it may not be as ideal as your current suggestions, would it be possible to resolve the issue at the telephony level? By this, I mean routing calls to language-specific assistants based on the caller's country of origin. For this, only the telephony system can expose the necessary APIs.
Let me know what you think.
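The telephony-level routing idea above could be sketched roughly as follows. Everything here is hypothetical: the assistant IDs, the country-to-assistant mapping, and the pickAssistant helper are illustrative stand-ins, not real Vapi or telephony API values.

```javascript
// Hypothetical sketch: choose a language-specific assistant from the
// caller's ISO country code, falling back to a default assistant.
const COUNTRY_TO_ASSISTANT = {
  ES: "assistant-spanish", // Spain -> Spanish-language assistant
  MX: "assistant-spanish", // Mexico -> Spanish-language assistant
  US: "assistant-english",
};
const DEFAULT_ASSISTANT = "assistant-english";

function pickAssistant(callerCountry) {
  // Normalize the code and fall back when the country is unmapped.
  return COUNTRY_TO_ASSISTANT[callerCountry.toUpperCase()] ?? DEFAULT_ASSISTANT;
}

console.log(pickAssistant("mx")); // assistant-spanish
```

The routing decision itself would have to happen in the telephony system, which is the only layer that sees the caller's origin before the call reaches an assistant.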
Tedros
11/18/2024, 1:28 PM

Tedros
11/18/2024, 1:32 PM

Tedros
11/18/2024, 5:21 PM

Shubham Bajaj
11/18/2024, 10:17 PM

Shubham Bajaj
11/18/2024, 10:17 PM

Shubham Bajaj
11/18/2024, 10:19 PM

Tedros
11/18/2024, 10:20 PM

Shubham Bajaj
11/18/2024, 10:22 PM