Pigeon
03/25/2025, 7:30 AM
Vapi Support Bot
03/25/2025, 7:31 AM
### 1. Using the triggerResponseEnabled Flag
You can use the `triggerResponseEnabled` flag to control whether a message triggers an LLM response:
```javascript
vapi.send({
  type: "add-message",
  message: {
    role: "assistant",
    content: "One moment, I'm thinking...",
  },
  triggerResponseEnabled: false, // prevents this message from triggering an LLM response
});

// Then, after your agentic processing is complete:
vapi.send({
  type: "add-message",
  message: {
    role: "assistant",
    content: "Here's your answer...",
  },
});
```
### 2. Using the .send() Method
During a call, you can send intermediate messages using the `.send()` method:
```javascript
vapi.send({
  type: "add-message",
  message: {
    role: "system",
    content: "The assistant should say 'One moment' and then continue without waiting for user input",
  },
});
```
### 3. Sending Silent System Messages Server-Side
According to [information I found](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1339634926248525935), you can "send silent system messages server side" using a "control url" to inject messages from the backend.
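For illustration, a minimal server-side sketch of that approach, assuming the call's control URL is exposed under the Call API's `monitor.controlUrl` field and accepts the same add-message payload shape as the client SDK (verify both against the live call control docs):
```python
import requests

VAPI_API_KEY = "your-private-api-key"   # assumption: private API key with call access
CALL_ID = "your-in-progress-call-id"    # the live call to inject a message into

# Look up the call to read its control URL (assumed to be under monitor.controlUrl).
call = requests.get(
    f"https://api.vapi.ai/call/{CALL_ID}",
    headers={"Authorization": f"Bearer {VAPI_API_KEY}"},
).json()
control_url = call["monitor"]["controlUrl"]

# Inject a silent system message from the backend; payload shape assumed to
# mirror the client SDK's add-message format shown above.
requests.post(
    control_url,
    json={
        "type": "add-message",
        "message": {
            "role": "system",
            "content": "Tell the caller you need a moment, then continue.",
        },
        "triggerResponseEnabled": False,
    },
)
```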
### 4. Using Webhooks as a Workaround
Some users have reported success [using webhooks instead of the Vapi module](https://discord.com/channels/1211482211119796234/1211483291191083018/threads/1334661059298066514) to send multiple messages.
### 5. Building Workflows with Multiple "Say" Tasks
If you're using Vapi's workflow builder, you can add multiple "say" tasks to send multiple messages in sequence according to the [Vapi workflows documentation](https://docs.vapi.ai/workflows).
### 6. Adjusting Conversation Flow in Your Script
Vapi Support Bot
03/25/2025, 7:31 AM
Pigeon
03/25/2025, 9:56 AM
Pigeon
03/26/2025, 6:37 AM
Shubham Bajaj
03/26/2025, 6:55 AM
Kings Big
03/26/2025, 10:53 PM
Shubham Bajaj
03/26/2025, 10:53 PM
Pigeon
03/27/2025, 7:35 AM
`content_only = f"data:{json.dumps(first_chunk.model_dump())}\n\n"`
Which doesn't have any end markers. Then the LLM workflow generated another message with `stop` and `[DONE]`.
Result: after the initial Vapi message there is a long wait as before, and then "One moment" + the LLM message arrive at the same time. So no "One moment" straight away with the LLM message a few seconds later; everything is TTSed after a few seconds, probably when `[DONE]` and `stop` are sent.
If I send a custom message like:
`content_only = f"data: {json.dumps(first_chunk.model_dump())}\n\ndata: [DONE]\n\n"`
which sends only `[DONE]` but not `stop`: nothing initially, and after the LLM runs I only hear "One moment".
And if I send the custom message with a full `stop` and `[DONE]`, I get "One moment" only after the LLM workflow as well.
I don't think VAPI starts TTS without `[DONE]` and/or `stop`?
Can you confirm: once VAPI starts getting the SSE stream, does it send received chunks straight to TTS, or does it only send after receiving `[DONE]` and/or `stop`?
Maybe I'm sending "One moment" the wrong way?
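For reference, a complete OpenAI-compatible chat-completions SSE stream normally carries the content in delta chunks, then a chunk with `finish_reason: "stop"`, then a terminating `data: [DONE]` line; the experiments above vary which of those end markers are sent and when. A minimal sketch of the full shape (illustrative values only):
```python
import json

def sse_stream(completion_id: str, model: str, text: str):
    """Yield a minimal OpenAI-style chat.completion.chunk stream."""
    content_chunk = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{
            "index": 0,
            "delta": {"role": "assistant", "content": text},
            "finish_reason": None,
        }],
    }
    stop_chunk = {
        "id": completion_id,
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
    }
    yield f"data: {json.dumps(content_chunk)}\n\n"  # the text itself
    yield f"data: {json.dumps(stop_chunk)}\n\n"     # finish_reason="stop"
    yield "data: [DONE]\n\n"                        # end-of-stream marker
```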
Kings Big
03/30/2025, 11:30 AM
Kings Big
03/30/2025, 11:30 AM
Pigeon
03/31/2025, 6:23 AM
Here I send the "One moment please." chunk before running the LLM workflow, which sends its own chunk after; no interference between them, I think.
Call ID: `71b7e0d2-6fb8-4380-b20c-b43b7e9637d8`
In my terminal I see:
```
-----------Message chunk:
data: {"id": "x", "object": "chat.completion.chunk", "created": x, "model": "llama-3.3-70b-versatile", "choices": [{"index": 0, "delta": {"role": "assistant", "content": "One moment please."}, "finish_reason": null}]}
```
From the code:
```python
async for chunk in self._send_message(
    "One moment please.",
    completion_id,
    vapi_payload,
    send_stop=False,
    send_end_only=False,
):
    print("-----------Message chunk:")
    print(chunk)
    yield chunk
```
So it created:
```python
first_chunk = ChatCompletionChunk(
    id=completion_id,
    model=model_name,
    choices=[Choice(delta=DeltaContent(role="assistant", content=message))],
)
```
And it is sent/streamed like this:
```python
content_only = f"data: {json.dumps(first_chunk.model_dump())}\n\n"
yield content_only
```
So there is no `[DONE]` and no final chunk with `stop`. All of this happens before running the LLM workflow, which takes 3-5 seconds.
Then the LLM workflow generates a response, and it is sent like this:
```python
final_chunk = ChatCompletionChunk(
    id=completion_id,
    model=model_name,
    choices=[Choice(delta=DeltaContent(), finish_reason="stop")],
)
combined = f"data: {json.dumps(first_chunk.model_dump())}\n\ndata: {json.dumps(final_chunk.model_dump())}\n\ndata: [DONE]\n\n"
yield combined
```
So the LLM response is sent with a final chunk that has `[DONE]` and `finish_reason="stop"`.
Result: I hear "One moment please." only AFTER the LLM response is sent, together with the LLM response.
I could be doing something wrong. It would be amazing if this worked; I'd have everything I need to build AI agents for a huge logistics company with VAPI.
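Putting the snippets above together, the overall generator appears to be shaped roughly like this (a sketch only: `ChatCompletionChunk`, `Choice`, `DeltaContent`, and `_send_message` come from the code quoted above, while `run_llm_workflow` is a hypothetical stand-in for the 3-5 second step):
```python
import json

async def stream_response(self, completion_id, model_name, vapi_payload):
    # 1. Early chunk: "One moment please." with no finish_reason and no [DONE].
    async for chunk in self._send_message(
        "One moment please.",
        completion_id,
        vapi_payload,
        send_stop=False,
        send_end_only=False,
    ):
        yield chunk

    # 2. Slow step (3-5 s): run the LLM workflow (hypothetical helper name).
    answer = await run_llm_workflow(vapi_payload)

    # 3. Final payload: answer chunk + stop chunk + [DONE] in one SSE write.
    answer_chunk = ChatCompletionChunk(
        id=completion_id,
        model=model_name,
        choices=[Choice(delta=DeltaContent(role="assistant", content=answer))],
    )
    final_chunk = ChatCompletionChunk(
        id=completion_id,
        model=model_name,
        choices=[Choice(delta=DeltaContent(), finish_reason="stop")],
    )
    yield (
        f"data: {json.dumps(answer_chunk.model_dump())}\n\n"
        f"data: {json.dumps(final_chunk.model_dump())}\n\n"
        "data: [DONE]\n\n"
    )
```
The reported behavior is that the step-1 text is only spoken once the step-3 terminators arrive.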
Pigeon
03/31/2025, 7:18 AM
Pigeon
03/31/2025, 7:19 AM
`[INTERUPT]` marker or something in that case
Kings Big
03/31/2025, 9:22 PM
When VAPI detects a <flush /> tag, it immediately sends all text before the tag to TTS. For example:
```
Hello, I'm your virtual assistant.<flush />
Let me check that information for you.<flush />
I've found what you're looking for.
```
This will cause "Hello, I'm your virtual assistant." to be immediately sent to TTS, followed by "Let me check that information for you." as soon as it's generated, without waiting for the rest of the response.
Give it a try and let me know how it goes for you.
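For a custom LLM server like the one discussed above, a minimal sketch of combining this with the streamed chunks (reusing `ChatCompletionChunk`, `Choice`, and `DeltaContent` from the earlier code; the `<flush />` tag usage comes from Kings Big's example, everything else is illustrative):
```python
import json

def one_moment_chunk(completion_id: str, model_name: str) -> str:
    # The <flush /> tag should make everything before it go to TTS right away,
    # even though no finish_reason or [DONE] has been sent yet.
    chunk = ChatCompletionChunk(
        id=completion_id,
        model=model_name,
        choices=[Choice(delta=DeltaContent(
            role="assistant",
            content="One moment please.<flush />",
        ))],
    )
    return f"data: {json.dumps(chunk.model_dump())}\n\n"
```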
Pigeon
04/01/2025, 5:56 AM
Kings Big
04/01/2025, 8:37 PM