Vapi agents are more prone to hallucinations
# support
g
I have seen this time and time again: the GPT-4 Turbo model is relatively dumber. It is way easier to break and make it hallucinate
v
Hi @GeneralKugelBlitz, I'm an AI assistant for Vapi. While a team member reviews this question, I'll look through relevant sources and see if I can provide suggestions. Please tag @Vapi Support with any follow-up questions.
-----------------------------
**While I generate an answer, here are some sources that might be helpful**:
* [Set up your Server URL]()
* [Call Functions]()
* [Billing Examples]()
* [Inbound Support Example]()
-----------------------------
### Answer
I understand your concerns about the performance of models, including instances of hallucinations. Vapi integrates with OpenAI's models, offering flexibility in model usage and function calling to enhance performance and reliability in various scenarios [(1)](). For specific issues or optimization advice, it might be beneficial to explore Vapi's documentation or contact support directly for tailored assistance.
g
the model in the ChatGPT UI is smart enough to know there are no availabilities on Saturday, but the gpt-4-1106-preview model just makes availabilities up
using the same exact prompt on both
Azure OpenAI is just dumber
"AI: Great. Let's look at Tuesday. For Tuesday, April 20 third, we have a wide range of times available. We're open from 9 8 to 4 acid. Have a preferred time in the morning or afternoon?
User: 9 what to for what?
AI: My apologies for the confusion. On Tuesday, April 20 third, we have openings from 9 at to 4 Aje. You prefer a morning or an afternoon appointment?
User: Uh, afternoon?"
There is hallucination, and then there is this.
"from 9 8 to 4 acid" ... it seems like it's on acid
@User GPT-4 would be too expensive. Could you suggest any model that doesn't trip like this and is reliable?
m
I've had good results with Anthropic, if you get the prompting right. Together with the low latency, it's a great provider, plus Haiku and Sonnet are pretty cheap.
have you given any thoughts on this?
g
thanks
I was thinking the same
m
yeah man, give it a try. But it's interesting how long the sentences are, so I would recommend you keep the temperature and the tokens very low
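To make that concrete, here is a minimal sketch of what a low-temperature, token-capped model config could look like. The field names (`provider`, `model`, `temperature`, `maxTokens`) are assumptions about Vapi's assistant config shape, so verify them against the current API docs; the Haiku model id is Anthropic's dated snapshot name.

```python
import json

# Hypothetical sketch of a Vapi-style assistant model config.
# Field names are assumed, not confirmed against Vapi's API reference.
model_config = {
    "provider": "anthropic",
    "model": "claude-3-haiku-20240307",
    "temperature": 0.2,   # low temperature -> less creative drift
    "maxTokens": 150,     # tight cap keeps spoken replies short
}

print(json.dumps(model_config, indent=2))
```

The low cap matters for voice specifically: every extra token is extra seconds of the agent talking.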
j
Yeah I'd say the same as above ^ opus or haiku
g
opus is expensive and slow
what about sonnet?
and why haiku, isn't it the smallest model?
and do function calls work as well with all of these?
I will test all of them but would appreciate some pointers from people who have already tested
m
well for me Opus is out of the equation. But I feel like the difference between the other two models performance-wise is subtle but perceivable. The reason I chose Haiku is the latency, which makes up for the lower processing capabilities. Regarding function calling, I haven't noticed much difference, but you definitely have to get the prompting right
from what you have shown, I believe you should make the instructions shorter. The descriptions for the functions on the functions panel already do a lot by themselves
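As an illustration of letting the function description carry the instruction: below is an OpenAI-style tool definition where the anti-hallucination rules live in the description rather than in a long system prompt. The function name and parameters are made up for this example, not anything from Vapi's panel.

```python
# Illustrative only: a tool schema in the common OpenAI function-calling
# format. The name "check_availability" and its parameters are invented.
check_availability_tool = {
    "type": "function",
    "function": {
        "name": "check_availability",
        "description": (
            "Look up open appointment slots for a given date. "
            "Only offer slots returned by this function; never invent times. "
            "If no slots are returned, say the day is fully booked."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "date": {
                    "type": "string",
                    "description": "Requested date in YYYY-MM-DD format",
                }
            },
            "required": ["date"],
        },
    },
}
```

With the guardrails in the description, the system prompt can shrink to tone and persona, which is what the shorter-instructions advice above is getting at.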
g
I don't follow which instructions need to be shorter?
if you are talking about the system prompt: when I gave it general instructions it was always messing up because it's too dumb, so I had to give it a specific example for each scenario. my prompt is like 1400 tokens
no matter how many times I say don't just dump all the slots as a list, it would still do it
it's prone to printing out information, then explaining and summarizing it
so examples are needed to change its behaviour
"User: Hey, Melissa. How's it going?
AI: Speaking in a friendly professional tone, hello? This is Melissa from California Dental. I'm doing well. Thank you for asking. How can I assist you today? Do you have a dental concern or would you like to schedule an appointment.
User: Right. What? No. The first thing is tell me how are you doing.
AI: Chuckles warmly, oh, I'm doing splendidly. Thank you for asking. As an AI assistant, I don't have personal feelings, but I'm here and ready to help you with anything related to scheduling a dental appointment. How can I assist you today?
User: Yeah. Well, I have a tooth issue. Like, uh, pain?"
Haiku is all sorts of wrong
also didn't call the function and hallucinated that it did
j
Hm yeah, one more thing you can try is specifying an exact model in the model field. The 0125 models are less prone to function hallucinations
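A quick sketch of what that pinning looks like: use a dated snapshot id instead of a floating alias. The model ids below are real OpenAI snapshot names; the `{"provider", "model"}` config shape is an assumption about how Vapi forwards the model field, so check the docs.

```python
# Pin an exact, dated snapshot rather than an alias that can silently
# change snapshots underneath you. Config shape is assumed, not confirmed.
pinned = {"provider": "openai", "model": "gpt-4-0125-preview"}
floating = {"provider": "openai", "model": "gpt-4-turbo"}  # alias, may move

print(pinned["model"])
```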
s
You can try other models, but make sure to follow the provider's LLM prompting guide, because you can see a significant difference from it.
g
I know there are a bunch. which one would be best for less hallucination and good function calling?