Custom LLM Integration and On-Prem Deployment for ...
# support
b
Hi VAPI team, We are currently evaluating VAPI as a potential solution for our generative conversational agents. At the moment, we use Custom LLM to integrate with speech engines through Websockets to **achieve low latency and support for real-time interruptions**. We're interested in understanding how VAPI handles similar requirements since it doesn't seem to use Websockets for Custom LLM message exchange. Additionally, we have high demands for simultaneous calls and monthly minutes. We're particularly interested in the on-prem deployment option to improve security and performance. Could you clarify if your solution can meet these needs, especially regarding latency, interactivity, and scalability? Looking forward to your feedback. Thanks in advance!
v
@btrachez Hello! It's great to hear that you're considering VAPI for your generative conversational agents. While VAPI doesn't use Websockets for Custom LLM message exchange, it is designed to handle low latency and support real-time interactions efficiently. Our API is optimized for quick responses to ensure seamless conversations.

Regarding your high demands for simultaneous calls and monthly minutes, VAPI is built to scale and can handle a large number of concurrent interactions. We understand the importance of reliability and performance in your operations.

For the on-prem deployment option, VAPI does offer solutions that can be tailored to your security and performance requirements. On-prem deployment can give you the control and customization you need for your specific use case.

To address your concerns about latency, interactivity, and scalability in more detail, we can arrange a technical deep-dive session with our engineers. They can walk you through how VAPI can be integrated into your existing infrastructure and how it can meet your specific requirements. Would you like to proceed with scheduling a technical session, or is there anything else I can assist you with right now?
s
1. With Custom LLM, we expect you to stream responses over HTTP connections.
2. For on-prem deployment and simultaneous calls, check your DM.
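For reference, here is a minimal sketch of what a Custom LLM endpoint that streams over HTTP could look like, assuming an OpenAI-compatible `/chat/completions` route that returns a server-sent-event stream of `chat.completion.chunk` objects (the route, port, model name, and canned reply below are illustrative placeholders, not a confirmed VAPI contract):

```python
# Minimal sketch of an OpenAI-style streaming chat-completions endpoint.
# Assumes the caller POSTs a chat request and reads SSE "data:" chunks.
import json
import time
import uuid

from flask import Flask, Response, request

app = Flask(__name__)


@app.post("/chat/completions")
def chat_completions():
    body = request.get_json(force=True)
    # Canned reply streamed token by token; a real server would call its model here.
    reply_tokens = ["Hello", ",", " how", " can", " I", " help", "?"]

    def sse_stream():
        completion_id = f"chatcmpl-{uuid.uuid4().hex}"
        for token in reply_tokens:
            chunk = {
                "id": completion_id,
                "object": "chat.completion.chunk",
                "created": int(time.time()),
                "model": body.get("model", "custom-llm"),
                "choices": [
                    {"index": 0, "delta": {"content": token}, "finish_reason": None}
                ],
            }
            # Each SSE event is a single "data:" line followed by a blank line.
            yield f"data: {json.dumps(chunk)}\n\n"
        # End-of-stream marker, mirroring the OpenAI convention.
        yield "data: [DONE]\n\n"

    return Response(sse_stream(), mimetype="text/event-stream")


if __name__ == "__main__":
    app.run(port=8000)
```

In principle, chunked HTTP streaming like this can cover the same latency and interruption needs as a WebSocket push: synthesis can begin on the first tokens, and the caller can stop reading the stream mid-response when the user barges in.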