udAIan
01/07/2025, 11:23 PM
[
  {
    role: "assistant",
    content: "Hi! am i speaking with Donald Brasco?",
  },
  {
    role: "user",
    content: "Yup. You are.",
  },
  {
    role: "tool_calls",
    content: "Create_Cal_Meeting: {\n \"start\": \"2025-01-10T19:00:00Z\",\n \"attendeeName\": \"Donald Brasco\"}",
  },
  {
    role: "assistant",
    content: "One moment",
  },
  {
    role: "tool_call_result",
    content: "\"New Quote Review: Meet Organizer and Donald Brasco\"",
  },
]
And the API responds with the next message that the assistant will produce?
I need this to program some evals for my assistants.
Vapi
01/07/2025, 11:25 PM
Vapi
01/07/2025, 11:25 PM
Shubham Bajaj
01/08/2025, 11:36 AM
udAIan
01/08/2025, 4:32 PM
{
  role: "user" | "bot" | "system" | "tool_calls" | "tool_call_result",
  message: string,
  // some other fields - skipping for brevity
}[]
While the schema for messages for OpenAI is:
{
  role: "user" | "assistant" | "system" | "tool",
  content: string,
  // some other fields - skipping for brevity
}[]
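The gap between the two shapes above can be bridged with a small mapping. This is a minimal sketch, not Vapi's own conversion: `toOpenAIMessages` and `ROLE_MAP` are hypothetical names, and mapping `tool_calls` to `assistant` and `tool_call_result` to `tool` is an assumption. A real Chat Completions request also needs more than this for tool messages (for example a `tool_call_id`), so treat the output as a starting point for evals only.

```typescript
// Shapes copied from the two schemas above; other fields are omitted.
type VapiRole = "user" | "bot" | "system" | "tool_calls" | "tool_call_result";
type OpenAIRole = "user" | "assistant" | "system" | "tool";

interface VapiMessage {
  role: VapiRole;
  message: string;
}

interface OpenAIMessage {
  role: OpenAIRole;
  content: string;
}

// ASSUMPTION: illustrative role mapping, not documented Vapi behaviour.
const ROLE_MAP: Record<VapiRole, OpenAIRole> = {
  user: "user",
  bot: "assistant",
  system: "system",
  tool_calls: "assistant",
  tool_call_result: "tool",
};

// Hypothetical helper: renames `message` to `content` and remaps the role.
function toOpenAIMessages(messages: VapiMessage[]): OpenAIMessage[] {
  return messages.map(({ role, message }) => ({
    role: ROLE_MAP[role],
    content: message,
  }));
}
```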
There is a difference in the role types, and the key content (OpenAI) is different from message (Vapi).
udAIan
01/08/2025, 4:36 PM
udAIan
01/09/2025, 4:11 PM
Shubham Bajaj
01/09/2025, 6:13 PM
messagesOpenAIFormatted: you can find it in any server-request body under call.artifactPlan.
https://cdn.discordapp.com/attachments/1326330347960533012/1326977105048637450/Screenshot_2025-01-09_at_23.39.11.png?ex=6781633b&is=678011bb&hm=95b4d67884af9f34214ae0e3e09b1b077c3da3a262cd545583ff0db038c31026&
Shubham Bajaj
01/09/2025, 6:13 PM
udAIan
01/09/2025, 6:20 PM
Shubham Bajaj
01/09/2025, 11:02 PM
udAIan
01/09/2025, 11:36 PM
messagesOpenAIFormatted are like:
[
  {
    "content": "you are intelligent",
    "role": "system"
  },
  {
    "content": "Hello. Ask me the knowledge.",
    "role": "assistant"
  },
  {
    "content": "How does temporal work?",
    "role": "user"
  },
  {
    "content": "Temporal is an open source durable execution platform designed to simplify the creation of scalable distributed systems. It automates the saga pattern, which is essential for ensuring state in complex, multistep business transactions. Temporal abstracts away the complexity by autosaving application state for long running processes and handling failure and retry logic. Developers only need to focus on programming the business rules and compensation behavior within temporal workflows. While temporal takes care of scheduling. Retry behavior, and tracking progress. This makes implementing the saga pattern easy, safe, and reliable.",
    "role": "assistant"
  }
]
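For evals, an array like the one above can be split into the context and the final assistant reply, so the reply can be compared against whatever a model produces for the same context. This is only a sketch under assumptions: `toEvalCase`, `FormattedMessage`, and `EvalCase` are hypothetical names, and it only looks at the `role` and `content` fields shown in the sample.

```typescript
// Minimal shape of one entry, based on the fields visible in the sample.
interface FormattedMessage {
  role: "user" | "assistant" | "system" | "tool";
  content: string;
}

// Hypothetical eval record: the context sent to the model and the reply to compare against.
interface EvalCase {
  input: FormattedMessage[];
  expected: string;
}

// Splits a transcript at its last assistant message.
// Returns null when the transcript has no assistant message at all.
function toEvalCase(messages: FormattedMessage[]): EvalCase | null {
  let last = -1;
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "assistant") {
      last = i;
      break;
    }
  }
  if (last < 0) return null;
  return {
    input: messages.slice(0, last),
    expected: messages[last].content,
  };
}
```

Note that this works only on what the transcript contains; it cannot recover knowledge base chunks that are absent from the array.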
The last assistant message in the array above leverages a knowledge base (646be61c-5b1e-43f3-ab3a-493bee4d06d9 - created using the Vapi Trieve integration). I used the API
GET https://api.vapi.ai/call/763a2965-d7bd-4284-bf5f-c9174d92c5fd to get the call data, but I do not see the chunks that were sent to OpenAI anywhere.
udAIan
01/09/2025, 11:40 PM
Alejo
01/10/2025, 3:19 PM
Alejo
01/10/2025, 3:19 PM
udAIan
01/10/2025, 9:29 PM
Shubham Bajaj
01/11/2025, 11:47 AM
arash
01/12/2025, 6:08 PM
Shubham Bajaj
01/13/2025, 4:02 PM
arash
01/13/2025, 4:42 PM
messagesOpenAIFormatted. Are you guys then removing the modified user message with the RAG template when you show it to the user? If you guys can provide this raw user/assistant query with the chunks injected, or the chunks themselves, that would be very helpful for our evals (this is ultimately what @udAIan is looking for).
2. When you say "the LLM decides when to invoke the Knowledge Base (KB)", what is the assistant or user message you are injecting to make this determination? Or are you calling the KB provider search for every user message and, if there are relevant chunks, applying the RAG template?
Shubham Bajaj
01/13/2025, 4:50 PM
messagesOpenAIFormatted. I'll check with the team about the ETA for implementing this functionality.
2. Regarding KB invocation:
The LLM first checks if relevant information exists in the system prompt. If it finds the necessary information there, it generates the response directly. If not, only when the LLM decides to use the KB do we proceed with retrieving chunks and applying the RAG template.
arash
01/13/2025, 4:52 PM
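The KB invocation flow described in the 4:50 PM reply can be sketched roughly as follows. This is an illustration under stated assumptions, not Vapi's implementation: every name here (`KbDecision`, `applyRagTemplate`, `buildModelInput`) is hypothetical, the template wording is invented, and the LLM's decision is passed in as a plain value rather than produced by a model.

```typescript
// Hypothetical outcome of the LLM's check: either the system prompt already
// has the answer, or the knowledge base should be queried.
type KbDecision = "answer_from_prompt" | "query_kb";

interface Chunk {
  text: string;
}

// Invented RAG template: numbers the chunks and prepends them to the question.
function applyRagTemplate(question: string, chunks: Chunk[]): string {
  const context = chunks.map((c, i) => `[${i + 1}] ${c.text}`).join("\n");
  return `Context:\n${context}\n\nQuestion: ${question}`;
}

// Retrieval and templating happen only when the decision is "query_kb";
// otherwise the question is passed through unchanged.
function buildModelInput(
  question: string,
  decision: KbDecision,
  search: (query: string) => Chunk[],
): string {
  if (decision === "answer_from_prompt") {
    return question;
  }
  return applyRagTemplate(question, search(question));
}
```

In this sketch the templated string from `buildModelInput` is the kind of raw, chunk-injected input that was requested earlier in the thread for evals.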