4o-mini Realtime -> Reducing cost, providing KB Vapi AI #support

4o-mini Realtime -> Reducing cost, providing KB

tom

01/15/2025, 10:49 AM

Hi, I'm curious on how to optimize usage of the mini-realtime model. Using transient assistant via Make.com. I'm trying to minimize cost and find a solution for KBs. 1) Since with a longer prompt it gets expensive quite quickly.. Can we "outsource" the prompt? For example that it pulls the prompt via Make.com from a Google Doc? Would that reduce cost? 2) What do you guys recommend as a solution for KBs? Same approach as in point 1, that it just pulls the KB from a Google Doc/Airtable/etc? Thx guys appreciate any input

Vapi

01/15/2025, 10:51 AM

@tom Hello! I'm glad to hear you're looking into optimizing your usage of the mini-realtime model. Let's address your questions: 1) **Outsourcing the Prompt**: Yes, you can indeed "outsource" the prompt to an external service like Google Docs using Make.com (formerly Integromat). By storing the prompt externally and pulling it in when needed, you can reduce the size of the request you send to the API, which can help manage costs. However, the cost savings would depend on the pricing structure of the API you're using. If the API charges by the amount of data processed or by the number of requests, this could be beneficial. 2) **Solution for Knowledge Bases (KBs)**: Using an external service like Google Docs or Airtable to store your KBs is a smart approach. You can structure your KBs in these services and then use Make.com to retrieve the necessary information when needed. This way, you can keep your API requests lean and focused on processing the user's query rather than handling large amounts of static data. For both points, you would need to set up a workflow in Make.com that: - Triggers when a request is made. - Retrieves the necessary data from your external storage (Google Docs, Airtable, etc.). - Sends the data to the mini-realtime model. - Processes the model's response. - Returns the response to the user. This setup allows you to manage your prompts and KBs more efficiently and can help in minimizing costs by reducing the payload size for each API request. Keep in mind that you'll need to ensure that the retrieval of data from external sources is done quickly to maintain a good user experience, as any delays could impact the perceived performance of your voice application. If you need further assistance with setting this up or have more questions, please feel free to ask!

Vapi

01/15/2025, 10:51 AM

You can continue this conversation by mentioning me in the message.

tom

01/15/2025, 11:06 AM

@User @User Thoughts from you guys?

Shubham Bajaj

01/15/2025, 11:04 PM

@tom 1) You cannot outsource your prompt instead you can instead try reducing your prompt token length. 2) You can use our KB solution, which you can learn more about it from here https://docs.vapi.ai/knowledgebase Do let me know if you require further help.

Previous Next