call.in-progress.error-vapifault-custom-llm-llm-failed
# support
s
47163e4b-1af6-4552-bb79-7eb4151cde40 fe8618f1-bfb3-44f5-b02f-00b3ed0f06a0 d981ce4f-84d0-45d4-a64a-78346774f0b5 d9e0270d-9e46-4fe8-91d2-d0b13e0ce798 @Shubham Bajaj can you help us debug what went wrong with these calls? They all had the same error, and we see this issue fairly often.
v
Hi there! 👋 I see you're encountering the `call.in-progress.error-vapifault-custom-llm-llm-failed` error across multiple calls. While I can't debug those specific call IDs without more information, I can definitely help explain what this error means and provide some general debugging steps that might help resolve the issue.

## What this error means

The error `call.in-progress.error-vapifault-custom-llm-llm-failed` indicates a failure related to your custom LLM integration with Vapi. According to [information from Discord](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1335181348285054996), this error may be related to:

- LLM failures
- Issues with custom LLM usage
- Token limits being exceeded
- Errors from services like 11Labs

## Possible causes

There are several potential causes for this error:

1. **LLM Failures**: Service outages, network issues, or problems with specific requests being made to the LLM.
2. **Custom LLM Configuration Issues**: Your LLM key might not be correctly configured or might have exceeded usage limits set by the LLM provider.
3. **Token Limits**: The error can be triggered if token limits are being exceeded or if tokens are being calculated incorrectly.
4. **Incompatible Response Format**: Your custom LLM response doesn't match the expected OpenAI-compatible format, particularly if it's missing the "choices" property or if that property is not an array (see the sketch after this message).
5. **Tool Use and Result Formatting**: Incorrectly formatted or sequenced tool use and tool result blocks.
6. **Model-Specific Issues**: If you're using 11Labs' flash_model, [one Discord thread](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1335277468805758986) suggests that changing to the turbo model might help.

## Debugging steps

Here are some steps you can take to debug this issue:

1. **Review configurations and logs**:
   - Check your server logs for more detailed error messages
   - Review any recent changes made to your custom LLM integration
   - Verify the assistant's logic for providing feedback after tool calls
2. **Check your custom LLM setup**:
   - Ensure your custom LLM is correctly integrated with Vapi
   - Verify your API keys are correct
   - Check LLM key configuration, usage limits, and token limits
   - If using a local server, verify the Ngrok URL and test the connection
3. **Verify tool calling and function execution**:
   - Ensure that tool use and tool result blocks are correctly formatted and sequenced
   - Double-check that responses from tool uses are being processed correctly
   - If transferring calls, ensure the `transferCall` tool is correctly defined and triggered
4. **Consider LLM provider specifics**:
   - If using 11Labs, try changing from flash_model to the turbo model
   - If using Azure OpenAI, check for interference between Azure OpenAI deployment parameters and Vapi assistant settings

Is there any specific part of your custom LLM integration that you'd like me to provide more information about?

Sources:
- [Discord thread about the error's meaning](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1335181348285054996)
- [Discord thread about the 11Labs model issue](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1335277468805758986)
- [Discord thread about token limits](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1323288623629471866)
- [Discord thread about tool formatting](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1331340140064407573)
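
For reference, the expected shape is the standard OpenAI chat-completion format. Below is a minimal TypeScript sketch of that shape plus a hypothetical guard for the `choices` issue in cause 4; `assertOpenAICompatible` is an illustrative name, not part of any Vapi or OpenAI SDK.

```typescript
// Minimal sketch of the OpenAI-compatible, non-streaming response shape a
// custom LLM endpoint is expected to return. Field names follow the public
// OpenAI chat-completion format; the guard below is illustrative only.
interface ChatCompletionResponse {
  id: string;
  object: "chat.completion";
  created: number;
  model: string;
  choices: Array<{
    index: number;
    message: { role: "assistant"; content: string };
    finish_reason: string;
  }>;
}

// Hypothetical helper: fail fast when the "choices" array is missing or
// empty, one of the causes of the vapifault error listed above.
function assertOpenAICompatible(body: unknown): asserts body is ChatCompletionResponse {
  const choices = (body as { choices?: unknown })?.choices;
  if (!Array.isArray(choices) || choices.length === 0) {
    throw new Error('Custom LLM response is missing a non-empty "choices" array');
  }
}
```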
s
@malik
s
```
🔵 21:38:35:644 Completion Request Timeout (#1, provider: custom-llm, model: gpt-4o, region: undefined, credential: true)
🔵 21:38:35:646 Couldn't Complete Model Sequence. Request Timed Out And Sequence Empty (provider: custom-llm, model: gpt-4o, region: undefined, credential: true)
🔵 21:38:15:644 CustomLLMRequest Messages:
[
  { "role": "assistant", "content": "Hola, soy mía. ¿En qué puedo ayudarte hoy?" },
  { "role": "user", "content": "Quería hacer una cita." },
  { "role": "assistant", "content": "Claro, ¿estás buscando ver a algún proveedor en específico?" },
  { "role": "user", "content": "No." },
  { "role": "assistant", "content": "Perfecto. ¿Eres un paciente nuevo o ya has visitado la clínica antes?" },
  { "role": "user", "content": "No, no soy nuevo." },
  { "role": "assistant", "content": "Gracias. ¿Tienes algún día o horario en mente que te funcione mejor para la cita?" },
  { "role": "user", "content": "Pues, si habría hoy o el día que sea." },
  { "role": "assistant", "content": "Hoy no hay disponibilidad, pero el miércoles dieciséis de abril doctor Patrick Pack tiene espacio a las tres quince pm. Si prefieres algo más temprano, jueves diecisiete de abril hay un horario disponible a las once quince am. ¿Te interesa alguno de estos?" },
  { "role": "user", "content": "De temprano." },
  { "role": "assistant", "content": "Perfecto, antes de confirmar la cita, ¿me podrías proporcionar tu nombre completo, número de teléfono y fecha de nacimiento?" },
  { "role": "user", "content": "Sí, es siete catorce tres seis tres veinte seis nueve. Fecha de nacimiento es enero diecisiete dos mil doce." },
  { "role": "assistant", "content": "Gracias. ¿Me podrías confirmar tu nombre completo, por favor?" },
  { "role": "user", "content": "Ross." },
  { "role": "assistant", "content": "Gracias, Ross. Para confirmar, ¿estás agendando una cita con el doctor Patrick Bach el jueves diecisiete de abril a las once quince AM. ¿Es correcto?" },
  { "role": "user", "content": "Correcto." },
  { "role": "assistant", "content": "Dame un momento para agendar tu cita. Necesito cita. Necesito tu apellido para completar la cita. ¿Me lo podrías proporcionar, por favor?" },
  { "role": "user", "content": "Ok. It's wings." }
]
```
@slickVik As you can see, after a number of requests your server takes too long to respond, which eventually results in a timeout error, and the call is then marked as a custom LLM failure.
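
A common server-side mitigation (a sketch under assumptions, not a confirmed fix from this thread) is to bound each upstream model call with its own timeout and retry once, so a slow attempt fails well inside the roughly 20-second window visible in the log above. The endpoint URL, `OPENAI_API_KEY` variable, and 10-second budget below are assumptions for illustration.

```typescript
// Sketch: bound the upstream model call with a 10s timeout and one retry so a
// slow attempt fails before the platform's completion-request timeout fires.
// The endpoint URL and OPENAI_API_KEY env var are assumptions.
async function completeWithRetry(payload: unknown, attempts = 2): Promise<Response> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fetch("https://api.openai.com/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
        },
        body: JSON.stringify(payload),
        signal: AbortSignal.timeout(10_000), // abort this attempt after 10s
      });
    } catch (err) {
      lastError = err; // timed out or network error; retry if attempts remain
    }
  }
  throw lastError;
}
```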
s
I see
This was my suspicion as well
I think I’ll have to try some sort of retry after 10 seconds. How do you guys achieve your feature where you say a filler phrase if the LLM takes too long? Is it a timer that runs in the background?
@Shubham Bajaj
s
Yes, we create timers before requests. If the timer expires without a response, we notify the user that the response is delayed.
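
As a rough sketch of that timer pattern (illustrative only, not Vapi's actual internals; `completeWithFiller`, `speak`, and the 3-second threshold are all hypothetical):

```typescript
// Start a timer alongside the LLM request and speak a filler phrase if the
// response is late; cancel the timer as soon as the real response arrives.
async function completeWithFiller<T>(
  request: Promise<T>,
  speak: (text: string) => void,
  fillerAfterMs = 3_000,
): Promise<T> {
  const timer = setTimeout(() => speak("One moment, please..."), fillerAfterMs);
  try {
    return await request; // resolve the real completion
  } finally {
    clearTimeout(timer); // skip the filler if the response arrived in time
  }
}
```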
a
Documentation examples are not up to date on this topic. I got it working on a Cloudflare Worker. Only the `openai-advanced` file was successful when copy-pasting. https://github.com/VapiAI/example-server-serverless-cloudflare/blob/main/src/api/custom-llm/openai-advanced.ts
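
For context, a pared-down version of what such a Worker does might look like the following sketch (assumptions: an `OPENAI_API_KEY` binding and a plain pass-through with no streaming extras; see the linked `openai-advanced.ts` for the complete example):

```typescript
// Minimal Cloudflare Worker acting as a custom LLM endpoint: Vapi POSTs an
// OpenAI-style chat request, and the Worker relays an OpenAI-compatible
// response back (including the "choices" array). Simplified sketch only.
export default {
  async fetch(request: Request, env: { OPENAI_API_KEY: string }): Promise<Response> {
    if (request.method !== "POST") {
      return new Response("Method Not Allowed", { status: 405 });
    }
    const body = await request.json();
    // Forward to the upstream model and relay the response body unchanged.
    const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${env.OPENAI_API_KEY}`,
      },
      body: JSON.stringify(body),
    });
    return new Response(upstream.body, {
      status: upstream.status,
      headers: { "Content-Type": upstream.headers.get("Content-Type") ?? "application/json" },
    });
  },
};
```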
s
This ticket has been marked as solved, and as a result, you will no longer receive any further responses. Kindly create a new support ticket to escalate your inquiry. 