e.g. by not sending the whole conversation after each caller's answer?
Vapi
11/19/2024, 5:22 PM
@Manuel Saelices
Absolutely, we can optimize the process by not sending the entire conversation after each caller's response. Instead, we can send incremental updates or wait until the end of the call to send a complete summary. This approach can save bandwidth and reduce processing overhead. If you need assistance with implementing this or have any specific requirements, please let me know, and I'll be glad to help!
Vapi
11/19/2024, 5:22 PM
You can continue this conversation by mentioning me in the message.
Vapi
11/19/2024, 5:23 PM
Got it. Tagging @User for additional help.
Manuel Saelices
11/19/2024, 5:25 PM
That's not how this works. On each interaction with the caller, after the caller says anything, VAPI sends the whole conversation, including the system messages, past function calls, etc., to the LLM to get the bot's next message or the next function call to invoke (see the sketch below).
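A minimal Python sketch of the per-turn behavior Manuel is describing, assuming an OpenAI-style chat-completions client. The client, model name, and `on_caller_utterance` helper are illustrative stand-ins, not Vapi internals:

```python
from openai import OpenAI  # illustrative: any chat-completions-style client works

client = OpenAI()  # assumes OPENAI_API_KEY is set; stand-in for the voice platform's LLM call

messages = [
    {"role": "system", "content": "You are a phone assistant."},
]

def on_caller_utterance(text: str) -> str:
    """Called every time the caller says something: the WHOLE history,
    including the system prompt and all past turns, is resent to the LLM,
    so input-token cost grows with every exchange."""
    messages.append({"role": "user", "content": text})
    response = client.chat.completions.create(
        model="gpt-4o",        # placeholder model name
        messages=messages,     # the entire conversation, every turn
    )
    reply = response.choices[0].message.content
    messages.append({"role": "assistant", "content": reply})
    return reply
```

This is why the bot's earlier suggestion about "incremental updates" doesn't apply: chat-completions APIs are stateless, so the full history has to be resent each turn and input tokens scale with conversation length.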
The second question is about other mechanisms we could use to save LLM costs, e.g. a context window: not sending the first messages to the LLM once the conversation gets long, or something like that (a sliding-window sketch follows below).
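One common cost-saving pattern along these lines is a sliding context window that always keeps the system prompt but drops the oldest turns. This is a general technique, not a documented Vapi feature, and `max_turns` below is an arbitrary message cutoff (a production version would usually budget by token count instead):

```python
def truncate_history(messages: list[dict], max_turns: int = 10) -> list[dict]:
    """Sliding-window sketch: keep every system message, but send only the
    most recent max_turns of the remaining user/assistant/tool messages,
    silently dropping the oldest turns once the conversation grows."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]
```

Dropping turns loses context, so a common refinement is to replace the dropped turns with a short running summary message rather than discarding them outright.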