audio to text switch? Vapi AI #support

audio to text switch?

Andy da Costa

10/04/2024, 9:19 PM

while we are aware of all the logs produced by vapi but wanted to hear of there is a way to switch a session from audio in/out mode into a text in/out mode. We are building an app that needs to communicate sensitive user information and the ability to switch to a text mode while on the bus for instance would be valuable to us. Currently while the LLM's support the text we are not aware of a way get text in our out of the vapi conduit. please let us know how we can use the api to temporarily disable the speech to text and text to speech modules to directly input text and get text responses of our vapi AI assistance.

Vapi

10/04/2024, 9:21 PM

@Andy da Costa Absolutely, I can help with that! VAPI is designed to be flexible and can accommodate different modes of interaction, including text-based exchanges. To switch from audio to text mode, you would typically adjust the settings in your application to bypass the speech-to-text and text-to-speech modules and directly handle text input and output. Here's a general approach to achieve this: 1. **Disable Speech-to-Text**: Instead of sending audio to VAPI for transcription, you would capture the user's input as text within your application's interface. 2. **Process Text**: Send the captured text directly to the VAPI service you are using for processing. This could be a chatbot service, a command processing module, or any other text-based service. 3. **Disable Text-to-Speech**: Instead of sending the VAPI's text response to a text-to-speech service, you would display the text response directly in your application's interface for the user to read. 4. **User Interface Adjustments**: Ensure your application has a user interface that allows users to easily switch between audio and text modes, and that it can display text inputs and outputs when in text mode. If you need specific commands or API endpoints to adjust these settings, you would typically find them in the documentation provided by the text-to-speech and speech-to-text service providers you are using. Since VAPI supports multiple providers, you would need to refer to the specific documentation for the services you have integrated with (PlayHT, 11labs, Rime-ai, Deepgram, OpenAI, Azure, Lmnt, Neets). If you require further assistance with the API calls or implementation details, please let me know which providers you are using, and I can provide more specific guidance. If this information is not sufficient, please ask support for more info on this.

Vapi

10/04/2024, 9:21 PM

You can continue this conversation by mentioning me in the message.

Shubham Bajaj

10/07/2024, 6:11 PM

@Andy da Costa you can join discord OH and get your ticket resolved over there.

chris.happy.dad

10/07/2024, 6:48 PM

Can you walk me through how this works from the api. If we have an assistant that is using audio and the user wants to type in confidential information, how do we switch this on the backend so the user can input text and press send it to vapi all in the same conversation? Once the confidential info is input, switch back to audio speech.

chris.happy.dad

10/07/2024, 6:49 PM

We already have the GUI working. This is a backend question (vapi api)

chris.happy.dad

10/07/2024, 6:50 PM

@Shubham Bajaj I tried your OH today and unfortunately there were audio difficulties from your side. Can you help on this? Thank you

Shubham Bajaj

10/07/2024, 7:33 PM

@chris.happy.dad create a new support ticket for your issue.

Shubham Bajaj

10/07/2024, 7:41 PM

for this you don't require Vapi BTS we are using raw llm models with your prompts.

7 Views

Previous Next