Can Vapi store the LLM output instead of the transcriber output?
# support
m
When using a custom model, and inspecting the object that Vapi sends to my endpoint, I notice that the messages object contains the transcriber's output, not the LLM output. For example, if the LLM generated "One Two Three", the messages object will contain { "role": "assistant", "content": "1 2 3" }
m
Hey again bro, in the advanced settings of your agent there's server messages, check the model output box and you should now see exactly what the model outputs
same w client messages
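If you're setting it through the API instead of the dashboard, it's roughly this (a quick sketch, not the exact payload; double-check the message-type strings against the Vapi API reference, and the assistant id here is a placeholder):

```typescript
// Sketch: enable model output in server/client messages by patching the assistant.
// Assumes VAPI_API_KEY is set; ASSISTANT_ID and the other message types are placeholders.
const VAPI_API_KEY = process.env.VAPI_API_KEY!;
const ASSISTANT_ID = "your-assistant-id";

async function enableModelOutput(): Promise<void> {
  const res = await fetch(`https://api.vapi.ai/assistant/${ASSISTANT_ID}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${VAPI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      // "model-output" asks Vapi to forward what the LLM generated,
      // alongside whatever other message types you already receive.
      serverMessages: ["model-output", "transcript", "end-of-call-report"],
      clientMessages: ["model-output", "transcript"],
    }),
  });
  console.log(await res.json());
}

enableModelOutput();
```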
m
Hey! Thanks. Yeah, we've done this already, and what we see is the transcriber output. The example noted above is from inspecting those messages. Wanted to check if there's any way we can configure Vapi to store the LLM output instead.
The concrete use case we have is improving number pronunciation. We found that when the LLM generates numbers as words, they are pronounced better by the voice model. So we want the LLM to always produce numbers as words (e.g. One Two Three, not 1 2 3), and we have some instructions for that in the system prompt. But since Vapi stores the transcriber output in the history, what we see in the history is "1 2 3", and as the conversation goes on, the effect of the digits in the history overrides our system prompt and the LLM starts producing digits in its output. This is also relevant to [this question](https://discord.com/channels/1211482211119796234/1345118993937076347). If we add the suffix "Always respond in words" to the user message, the LLM follows the instruction despite the history containing digits rather than words. But again, we'd like to implement this without a custom model due to latency. And I think if we can get the conversation history to include the LLM output, the effect of the system prompt will persist throughout the conversation.
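For context, here's roughly what that suffix workaround looks like when done through a custom LLM endpoint in front of OpenAI (a hypothetical sketch, non-streaming, with made-up route and model names), which works but adds the extra hop we're trying to avoid:

```typescript
// Hypothetical custom-llm endpoint: append a reminder to the latest user turn
// before forwarding the thread to the model, so digits in the history don't
// drag the assistant back into producing digits.
import express from "express";
import OpenAI from "openai";

const app = express();
app.use(express.json());
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

app.post("/chat/completions", async (req, res) => {
  const messages = [...req.body.messages];
  const last = messages[messages.length - 1];
  if (last?.role === "user") {
    last.content = `${last.content}\n\nAlways respond with numbers written out as words.`;
  }
  const completion = await openai.chat.completions.create({
    model: req.body.model ?? "gpt-4o-mini", // placeholder model name
    messages,
  });
  res.json(completion);
});

app.listen(3000);
```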
m
You have a great use case for just fine-tuning an OpenAI model to produce this behavior natively. Would solve both of your problems. Have you tried that? -- What you're talking about is a product of your transcriber transcribing it as "1, 2, 3", those messages adding up in the user-role messages of the thread, and then that affecting the assistant role's output in the thread, yeah?
Storing the LLM output instead of the transcriber output isn't the fix, because the root issue is your user messages from the transcriber causing your assistant to not follow its system prompt. The root of it is: { role: 'user', content: '1 2 3' }, which is filled by the transcriber. So your two options are to find a transcriber that outputs "1 2 3" as "one two three", or to fine-tune a model so that, no matter what the transcriber outputs, the model never strays from outputting { role: 'assistant', content: 'One, Two, Three' }
You get me?
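The dataset only needs to pair digit-style user turns (what the transcriber produces) with word-style assistant turns. Something like this, written as a small script that emits OpenAI's chat fine-tuning JSONL format (the rows here are hypothetical, just to show the shape):

```typescript
// Hypothetical fine-tuning rows: digit-heavy user turns, word-only assistant turns.
import { writeFileSync } from "node:fs";

const system = "Always write numbers as words, never as digits.";

const examples = [
  { user: "my order number is 1 2 3", assistant: "Got it, order number one two three." },
  { user: "call me back at 555 0199", assistant: "Sure, I'll call you back at five five five, zero one nine nine." },
  { user: "the total was $42", assistant: "The total was forty-two dollars." },
];

// One JSON object per line, each with a full messages array.
const jsonl = examples
  .map((ex) =>
    JSON.stringify({
      messages: [
        { role: "system", content: system },
        { role: "user", content: ex.user },
        { role: "assistant", content: ex.assistant },
      ],
    })
  )
  .join("\n");

writeFileSync("numbers-as-words.jsonl", jsonl);
```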
m
Fine-tuning is on our roadmap, but we are still in the data collection process, since it will require some data. Yes, this is the root cause, as you highlighted:
```
The root of it is: { role: 'user', content: '1 2 3' }, which is filled by the transcriber
```
And that is why I don't want to fill the conversation history with the transcriber's output, but with the LLM output. It seems like this is not something I can configure Vapi to do, though. I will try looking for a transcriber that produces words instead of digits. Thanks!
m
the "conversation history" the model sees is the assistant thread, meaning the aggregation of user messages and assistant messages, you'll never be able to fill the user messages with the LLMs output because that will always go into the assistant messages the root here is definitely the transcriber. Even with 50 examples and 10 minutes you could fix the behavior by fine tuning it would be super quick. Let me know if you need help with anything.
I'm going to make the dataset for you, give me 10 minutes
@Mohab
That will 100% fix this issue
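For reference, once the JSONL file exists, kicking off the job is just a couple of calls (a sketch using the OpenAI Node SDK; the file path and base model are placeholders, pick whichever model you're actually running):

```typescript
// Sketch: upload the JSONL dataset and start a fine-tuning job with the OpenAI SDK.
import fs from "node:fs";
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function startFineTune(): Promise<void> {
  const file = await openai.files.create({
    file: fs.createReadStream("numbers-as-words.jsonl"), // dataset from the sketch above
    purpose: "fine-tune",
  });

  const job = await openai.fineTuning.jobs.create({
    training_file: file.id,
    model: "gpt-4o-mini-2024-07-18", // placeholder base model
  });

  // Poll this job until it finishes, then point your Vapi assistant at the resulting model id.
  console.log("fine-tune job:", job.id);
}

startFineTune();
```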
m
Will give it a try, thanks!
@Mason | Building KOI It is possible in Deepgram to produce numbers as text by setting the [numerals toggle to False](https://developers.deepgram.com/docs/numerals). Is this something I can customize in Vapi?
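Outside Vapi, that toggle is just a query parameter on the Deepgram request, roughly like this (a rough sketch against Deepgram's REST API; the audio URL is a placeholder, and parameter availability per model/endpoint is whatever the linked doc says):

```typescript
// Sketch: direct Deepgram request with numerals=false (and smart_format=false)
// so numbers come back as words rather than digits.
const DEEPGRAM_API_KEY = process.env.DEEPGRAM_API_KEY!;

async function transcribeWithoutDigits(): Promise<void> {
  const res = await fetch(
    "https://api.deepgram.com/v1/listen?model=nova-2&numerals=false&smart_format=false",
    {
      method: "POST",
      headers: {
        Authorization: `Token ${DEEPGRAM_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: "https://example.com/sample-call.wav" }), // placeholder audio
    }
  );
  const data = await res.json();
  console.log(data.results.channels[0].alternatives[0].transcript);
}

transcribeWithoutDigits();
```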
m
No, unfortunately we don't have control over the API calls between providers; we trade that off for their low-latency improvements and infra. If we did, you could also use Deepgram's find and replace to solve it as well. 100% though, Vapi wants to give us more options when it comes to the provider API calls, but it just takes time
m
Are you referring to [custom keywords](https://docs.vapi.ai/customization/custom-keywords)? Or how exactly would I use find and replace with Vapi?
m
no no, you can't, that's what I'm saying: it's a Deepgram feature but we don't have control of it through Vapi yet: https://developers.deepgram.com/docs/find-and-replace
ah wait
lightbulb
use Deepgram as your custom transcriber and then you can set whatever find and replace or numerals toggle you want
it's a lot more setup than fine tuning though
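The rough shape of that setup is a websocket server sitting between Vapi and Deepgram, so the Deepgram query params are yours to set. This is only a sketch under assumptions: the port, the replace values, and the "transcriber-response" message shape sent back to Vapi are all placeholders, so follow Vapi's custom transcriber docs for the exact protocol.

```typescript
// Sketch: bridge Vapi's custom transcriber websocket to Deepgram live, with
// the numerals/find-and-replace knobs set on the Deepgram side.
import { WebSocketServer, WebSocket } from "ws";

const DEEPGRAM_API_KEY = process.env.DEEPGRAM_API_KEY!;

// Deepgram live endpoint with the knobs we care about:
// numerals=false keeps numbers as words, replace handles any stragglers.
const dgUrl =
  "wss://api.deepgram.com/v1/listen" +
  "?model=nova-2&smart_format=false&numerals=false" +
  "&replace=" + encodeURIComponent("1:one") +
  "&replace=" + encodeURIComponent("2:two");

const wss = new WebSocketServer({ port: 8080 });

wss.on("connection", (vapiSocket) => {
  const dgSocket = new WebSocket(dgUrl, {
    headers: { Authorization: `Token ${DEEPGRAM_API_KEY}` },
  });

  // Forward Vapi's audio frames to Deepgram once the Deepgram socket is open.
  vapiSocket.on("message", (data, isBinary) => {
    if (isBinary && dgSocket.readyState === WebSocket.OPEN) dgSocket.send(data);
  });

  // Forward Deepgram transcripts back to Vapi. The message shape below is an
  // assumption; check Vapi's custom transcriber docs for the exact format.
  dgSocket.on("message", (data) => {
    const result = JSON.parse(data.toString());
    const text = result?.channel?.alternatives?.[0]?.transcript;
    if (text) {
      vapiSocket.send(JSON.stringify({ type: "transcriber-response", transcription: text }));
    }
  });

  vapiSocket.on("close", () => dgSocket.close());
});
```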
m
ok thanks! will take a look
k
Hey @Mohab, checking if this is resolved for you?
m
yes, thank you.
k
Marking this ticket as Solved ✅