@btrachez
Hello!
It's great to hear that you're considering VAPI for your generative conversational agents. While VAPI doesn't use WebSockets for Custom LLM message exchange, it is designed for low latency and supports real-time interactions efficiently. Our API is optimized for quick responses to keep conversations feeling seamless.
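For a concrete picture, here is a minimal sketch of what a Custom LLM endpoint could look like, assuming the integration exchanges messages over plain HTTP using an OpenAI-style chat-completions payload streamed back as server-sent events. The endpoint path, payload shape, and the `fastapi`/`uvicorn` stack are illustrative assumptions for this sketch, not VAPI's documented contract, so please verify against the docs before building on it:

```python
# Sketch of an OpenAI-compatible streaming endpoint a Custom LLM
# integration could point at. Assumption: the caller sends an
# OpenAI-style chat-completions payload and reads SSE chunks back.
import json
import time

from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()


@app.post("/chat/completions")
async def chat_completions(request: Request) -> StreamingResponse:
    body = await request.json()  # expected: {"model": ..., "messages": [...]}
    reply = "Hello from the custom LLM!"  # stand-in for real model output

    def sse_stream():
        # Emit the reply token by token as OpenAI-style SSE chunks so the
        # caller can start speaking before the full response is ready.
        for token in reply.split():
            chunk = {
                "id": "chatcmpl-demo",
                "object": "chat.completion.chunk",
                "created": int(time.time()),
                "model": body.get("model", "custom-llm"),
                "choices": [
                    {
                        "index": 0,
                        "delta": {"content": token + " "},
                        "finish_reason": None,
                    }
                ],
            }
            yield f"data: {json.dumps(chunk)}\n\n"
        yield "data: [DONE]\n\n"  # conventional end-of-stream sentinel

    return StreamingResponse(sse_stream(), media_type="text/event-stream")
```

Run it with `uvicorn server:app` and point your Custom LLM URL at it; streaming each delta as soon as it's available is what keeps perceived latency low in a voice conversation.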
Regarding your high volume of simultaneous calls and monthly minutes, VAPI is built to scale and can handle a large number of concurrent interactions. We understand how important reliability and performance are to your operations.
As for on-prem deployment, VAPI offers options that can be tailored to your security and performance requirements, giving you the control and customization your specific use case needs.
To address your concerns about latency, interactivity, and scalability in more detail, we can arrange a technical deep-dive session with our engineers. They can walk you through how VAPI integrates with your existing infrastructure and how it can meet your specific requirements.
Would you like to proceed with scheduling a technical session, or is there anything else I can assist you with right now?