Client-side, is it possible to get the full model ...
# support
b
I'm using a custom LLM from a client-side assistant. The LLM response includes some extra data like a URL or userId. Ideally I'd like to show that link to the user. The client-side `model-output` message is designed to stream just the spoken content. Is there any way to get the full LLM output client-side? Or other ideas on how to get data from the custom LLM back to the client?
```javascript
// Client-side
const vapi = new Vapi("...");
vapi.start({
  ...
  clientMessages: ['model-output'],
  model: { provider: 'custom-llm', ... },
});
vapi.on("message", (message) => {
  if (message.type === 'model-output') {
    console.log(message); // ?
  }
});
```

```javascript
// Server-side (Custom LLM)
app.post('/agent/vapi-custom-llm/chat/completions', async (req, res, next) => { 
  res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}\n\n`);
  res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"content":"Here is the link."},"index":0}]}\n\n`);
  // additional url param added here
  res.write(`data: {"url":"https://example.com/abc","id":"1","object":"chat.completion.chunk","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}\n\n`);
  res.end(`data: [DONE]\n\n`);
});
```
v
Use `clientMessages` with tool calls to send extra data to the client:
```javascript
// Client-side
const vapi = new Vapi("...");
vapi.start({
  clientMessages: ['model-output', 'tool-calls'],
  model: { provider: 'custom-llm', ... },
});
vapi.on("message", (message) => {
  if (message.type === 'tool-calls') {
    console.log(message.data.url); // Access your extra data
  }
});
```

On the server side, format the response as a tool call with your extra data in the arguments.

Source: [Discord discussion on client-side events](https://discord.com/channels/1211482211119796234/1330803502615625738)
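For instance, a minimal sketch of that server-side shape, reusing the Express-style endpoint from the question (the tool name `send_link` and the example URL are illustrative, not fixed Vapi names):

```javascript
// Sketch only: stream one OpenAI-style chunk whose tool call carries the extra data.
// The function arguments must themselves be a JSON string, hence the nested stringify.
app.post('/agent/vapi-custom-llm/chat/completions', async (req, res) => {
  const args = JSON.stringify(JSON.stringify({ url: 'https://example.com/abc' }));
  res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"tool_calls":[{"id":"1","type":"function","index":0,"function":{"name":"send_link","arguments":${args}}}]},"index":0}]}\n\n`);
  res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}\n\n`);
  res.end(`data: [DONE]\n\n`);
});
```

With the tool registered on the assistant, that chunk is what should surface client-side as the `tool-calls` message handled above.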
v
Message marked as helpful by @bendytree! 🎉
b
If the LLM does a tool call, then I'd need a tool_call endpoint to echo the spoken text back to the LLM (as the tool result), and the LLM then echoes that back to Vapi. I probably don't want to add that overhead/delay. It turns out an LLM can theoretically do a tool call AND a text completion at the same time, but OpenAI only supports it if streaming is disabled. With a custom LLM, streaming both the text and the tool call looks like this:
```javascript
res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant","content":""},"index":0}]}\n\n`);
res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"content":"Here is the link."},"index":0}]}\n\n`);
res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{"tool_calls":[{"id":"2134","type":"function","index":0,"function":{"name":"parse_data","arguments":${JSON.stringify(JSON.stringify({url: "foobar"}))}}}]},"index":0}]}\n\n`);
res.write(`data: {"id":"1","object":"chat.completion.chunk","choices":[{"delta":{},"index":0,"finish_reason":"stop"}]}\n\n`);
res.end(`data: [DONE]\n\n`);
```
If the called tool is registered on the Vapi assistant, then the client event `tool-calls` is triggered. Unfortunately this creates a rapid loop of the agent responding over the top of itself; even if the LLM returns an empty string it still cuts itself off. So the only option is to not register the tool in your assistant, and actually that works much better. The LLM streams the text and the tool call, and client-side you will not get a tool-call event, but instead it comes in as an additional `model-output` message. So I get these two messages client-side:
```
model-output {"type":"model-output","output":"Here is the link."}
model-output {"type":"model-output","output":[{"id":"2134","type":"function","function":{"name":"parse_data","arguments":"{\"url\":\"foobar\"}"},"isPrecededByText":true}]}
v
Checking in: is this resolved for you?
b
It is, thank you Shubham.
v
Marking this ticket as Solved ✅