Ben — 01/17/2025, 1:48 PM
The `messages` show differently to the client messages I received and attached. It shows all of these "please wait" messages as a single message:
```json
{
  "role": "bot",
  "time": 1737250366597,
  "source": "",
  "endTime": 1737104336499,
  "message": "Please wait. Please wait. Please wait. Please wait. Please wait. The deadline for expense reports is noon on the 5th of the following month. However, if that falls on a weekend or public holiday, it becomes the next business day. Specifically, the schedule is as follows. Covered period: the 1st to the last day of the current month. Submission deadline: noon on the 5th of the following month. Payment date: the 10th of the following month. If you have any other questions, please don't hesitate to ask.",
  "duration": 5341100000,
  "secondsFromStart": 146100
}
```
If you see the `time` timestamp here, it is in the future, and 2 days AFTER the `endTime`, so it seems clear there is a bug here.
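For reference, converting the two epoch-millisecond values from the JSON above confirms the ordering; a minimal sketch, assuming only that both fields are Unix timestamps in milliseconds:
```python
from datetime import datetime, timezone

# Epoch-millisecond values copied from the message JSON above.
time_ms = 1737250366597
end_time_ms = 1737104336499

print(datetime.fromtimestamp(time_ms / 1000, tz=timezone.utc))
# 2025-01-19 01:32:46.597000+00:00  (in the future at the time of the call)
print(datetime.fromtimestamp(end_time_ms / 1000, tz=timezone.utc))
# 2025-01-17 08:58:56.499000+00:00

# "time" lands roughly 1.7 days after "endTime".
print((time_ms - end_time_ms) / 86_400_000)  # ~1.69 days
```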
I can see there are warnings after the tool calls like this:
```
[LOG] Model request started (attempt #1, gpt-35-turbo-1106, azure-openai, westus) 08:58:22:152
[WARN] Model request taking too long (attempt #1, started 445.14ms ago) 08:58:22:152
[LOG] Model request started (attempt #2, gpt-35-turbo-1106, azure-openai, canada)
```
So I suspect the duplicated messages are related to this issue: https://discord.com/channels/1211482211119796234/1323288623629471866/1324798624525451367
The LLM is giving a slow response, so VAPI falls back to other regions, maybe? And maybe it is keeping the responses of the "slow" attempts rather than correctly timing them out and discarding them, resulting in the model repeating itself several times?
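To illustrate the suspected failure mode only (this is not Vapi's actual implementation), a retry layer that races a fallback region has to cancel or ignore the losing attempt; if it doesn't, the late completion can still arrive and get appended, duplicating the message:
```python
import asyncio

async def call_llm(region: str, delay: float) -> str:
    # Stand-in for a model request; "delay" simulates a slow region.
    await asyncio.sleep(delay)
    return f"completion from {region}"

async def first_completion_wins() -> str:
    # Race the primary region against a fallback region.
    tasks = [
        asyncio.create_task(call_llm("westus", delay=5.0)),
        asyncio.create_task(call_llm("canada", delay=0.5)),
    ]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # The crucial step: cancel the slow attempt so its late result is
    # discarded. Skipping this is the kind of bug suspected above, where
    # both completions end up in the transcript.
    for task in pending:
        task.cancel()
    return done.pop().result()

print(asyncio.run(first_completion_wins()))  # completion from canada
```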
In summary, it looks like two issues:
- A bug in the timestamps (reported to be 2 days in the future)
- An issue, maybe, where VAPI retries due to timeout but does not discard the late response from the timed-out call?
https://cdn.discordapp.com/attachments/1329809513812987958/1329809514085482537/call_id_6c3d83d1-af38-4862-959c-3f01d4bdcc48.json?ex=678bb11e&is=678a5f9e&hm=8cb1ba93fa7f4b752a41f267bb0b5fb7d63b9ec4818f513eb026174ad1d6187d&

Vapi — 01/17/2025, 1:50 PM
`timestamp` should not be in the future, nor should it be after the `endTime`. This discrepancy definitely needs to be investigated by our technical team.
Secondly, the issue with the duplicated messages and the warnings in the logs suggest that there might be a problem with how VAPI is handling slow responses from the LLM. It appears that VAPI is not correctly timing out and discarding the late responses, which could be causing the model to repeat itself.
To address these issues, I would recommend reporting them to our technical support team so they can investigate and resolve the bugs. In the meantime, if there's anything else I can assist you with, please let me know.

Shubham Bajaj — 01/20/2025, 2:13 PM
Expected:
```json
{
  "results": [
    {
      "toolCallId": "X",
      "result": "Y"
    }
  ]
}
```
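For concreteness, a minimal sketch of a tool-call webhook that returns this shape. The endpoint path and the incoming payload fields (`message.toolCallList`, each entry with an `id`) are assumptions here, and Flask is used purely for illustration:
```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical endpoint; path and incoming field names are assumptions.
@app.post("/tool-calls")
def handle_tool_call():
    body = request.get_json()
    tool_calls = body["message"]["toolCallList"]  # assumed incoming shape
    results = [
        {
            "toolCallId": call["id"],
            "result": "Y",  # stand-in for the real tool output string
        }
        for call in tool_calls
    ]
    return jsonify({"results": results})
```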

Ben — 01/20/2025, 2:33 PM
`1737250366597` was reported as `time` (Sunday, 19 January 2025 01:32:46.597).
I believe in the above case it was gpt-3.5-turbo in use - would you like me to re-run it with gpt-4?
This particular issue I've only seen occasionally, so I'm not sure if I can reproduce it easily. The temperature in use was the default - so it was already set to 0 to my knowledge.
The response at the time was formatted like this:
response = {
"results": [
{
"toolCallId": parsed_response["tool_call_id"],
"result": {"status": "success",
"message": result,
"tool_call_arguments": arguments}
}
]
}
i.e. I had an object as the response, not a string. I saw today [here](https://discord.com/channels/1211482211119796234/1329542939160084624/1329910047081500712) that at least for the Realtime API it must be a pure string. I'm a little unclear what the suggested practice is if we wish to return structured data to the LLM to be put into its context without speaking it directly, as is often the case in a RAG setting?
Would it be something like this?
"results": [
{
"toolCallId": "<insert-your-tool-call-id-here>",
"result": "<JSON stringified response data here>",
"message": {
"type": "request-complete",
"role": "assistant",
"content": ""
}
}
]
}
If there is some documentation explaining (or a code reference showing) what OpenAI API call input the VAPI middleware uses to insert the tool call result into the LLM, then that would likely answer my query above.
I see the docs you linked say "It could be a string, a number, an object, an array, or any other data structure that is relevant to the tool’s purpose.", so it's clear the result can be an object in some cases at least.

Shubham Bajaj — 01/20/2025, 3:06 PM
```json
results": [
{
"toolCallId": "<insert-your-tool-call-id-here>",
"result": "<insert-tool-call-result-here>",
"message": {
"type": "request-complete",
"role": "assistant",
"content": "<insert-the-content-here>"
}
}
]
}
The message content will be voiced or spoken out, while the result will be used in the LLM context. We expect the result to be in string form without any line breaks, as an object might break it and cause "no_results_found" to appear in the message history.
> I see the docs you linked say "It could be a string, a number, an object, an array, or any other data structure that is relevant to the tool’s purpose."; so it's clear the result can be an object in some cases at least
Yes, but it has to be encapsulated inside the string, which you can verify in the screenshot above.
> If there is some documentation explaining (or a code reference) what OpenAI API call input the VAPI middleware uses to insert the tool call result into the LLM, then that would likely answer my query above
Yes, it's the tool call result being used. The result will be fed to the LLM, and if there is message content, then the content is voiced out and the result will still be used in the LLM messages or completion history.
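Putting that guidance together, a minimal sketch of returning structured data as a single-line string in `result` while keeping the spoken `content` separate (the structured payload itself is a hypothetical example):
```python
import json

# Hypothetical structured output, e.g. from a RAG lookup.
structured = {
    "status": "success",
    "message": "Expense reports are due at noon on the 5th of the following month.",
    "tool_call_arguments": {"topic": "expense_deadline"},
}

response = {
    "results": [
        {
            "toolCallId": "<insert-your-tool-call-id-here>",
            # json.dumps never emits line breaks by default, and compact
            # separators keep the string short; this satisfies the
            # "string form without any line breaks" requirement above.
            "result": json.dumps(structured, separators=(",", ":")),
            "message": {
                "type": "request-complete",
                "role": "assistant",
                # Empty content: nothing is voiced; the result still
                # enters the LLM context.
                "content": "",
            },
        }
    ]
}

print(response["results"][0]["result"])
```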
https://cdn.discordapp.com/attachments/1329809513812987958/1330916291296628736/Screenshot_2025-01-20_at_20.32.21.png?ex=678fb7e3&is=678e6663&hm=727df948c89e203f8acf6c26e94e41f3ba040cf61ff5a58484779822c72e7b7a&