Margin for OpenAI tokens
# support
p
Hey do you add any margin for OpenAI tokens? I thought I heard somewhere in your marketing materials that you don't, but I see that I've been charged much more for the LLM than what OpenAI charges for GPT-3.5
v
Hi @piechos, I'm an AI assistant for Vapi. While a team member reviews this question, I'll look through relevant sources and see if I can provide suggestions. Please tag @Vapi Support with any follow-up questions.
**While I generate an answer, here are some sources that might be helpful**:
* [Dashboard]()
* [Quickstart]()
* [Pricing]()
* [Custom LLMs]()
### Answer
Vapi bills you for the Transcription and Voice services at cost. If you choose an LLM we offer through the platform, you will be billed for the Model at cost. Additionally, Vapi charges a base fee of $0.05 per minute on top of what you are billed for providers [(1)](). If you've noticed a discrepancy in pricing, it might be worth checking the detailed breakdown of charges on your Vapi dashboard or contacting support for clarification.
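For illustration, here is roughly what that breakdown works out to in code (a minimal sketch; only the $0.05/min Vapi fee comes from the answer above, the transcriber, voice, and LLM per-minute rates are hypothetical placeholders):

```typescript
// Vapi per-minute cost = provider costs passed through at cost + $0.05/min Vapi fee.
// The provider rates used below are hypothetical placeholders, not real prices.
const VAPI_FEE_PER_MIN = 0.05;

function estimatePerMinute(
  transcriberPerMin: number,
  voicePerMin: number,
  llmPerMin: number,
): number {
  return transcriberPerMin + voicePerMin + llmPerMin + VAPI_FEE_PER_MIN;
}

// e.g. a call where the providers cost roughly $0.04/min in total:
console.log(estimatePerMinute(0.01, 0.02, 0.01)); // ≈ 0.09 per minute
```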
Got it. Tagging @User for additional help.
p
$2.71 seems too much for the amount of tokens used. gpt3.5-turbo charges less
s
600k prompt tokens? That's too long! which llm is this?
p
GPT-3.5 Turbo. What do you mean too long? Even with 600k it's too much. OpenAI charges $0.5 per million prompt tokens
s
https://yourgpt.ai/tools/openai-and-other-llm-api-pricing-calculator this calculator should help double check. i think it adds up for this amount of token usage.
actually I'm not sure, better have the @User check this..
p
Thanks. The calculator confirms my calculations. I should be charged $0.304 instead of $2.71
n
mind sending me a callid. we likely haven't updated the values
p
fa801d80-25ab-406f-8151-4807164f0405
Which values?
Now I inserted my OpenAI API keys and you don't charge me for the LLM, but OpenAI also doesn't charge me, so you probably still pay for the tokens. Seems like you have some issue here
n
yup that issue i did fix recently on staging-api.vapi.ai and will be released to prod soon. for now, enjoy the free tokens:)
the bug regarding us mischarging you, let me take a look
p
Ok, thanks
Today OpenAI started charging me for GPT-4-0125-preview when using the voice agent. But in my agent I've chosen GPT-3.5. Please check this. My assistant id is 76d28212-ea18-4705-9140-edd6ff9bd953
And now on one call I've been charged for GPT-3.5 and then on another for GPT-4 again. The number of tokens I've been charged for is smaller than shown in the Vapi dashboard. Really weird things are happening with your billing
Please fix it, because when I get charged at the GPT-4 price it's 20 times more expensive, which is a big problem
s
@piechos Could you please provide the call_id for the call where you were charged a high amount?
p
If you are talking about the previous issue where I was charged for the LLM by Vapi, it's this one - fa801d80-25ab-406f-8151-4807164f0405
If you are talking about today's issue where I've been charged for GPT-4 by OpenAI when using GPT-3.5, it's one of these: f22a49f1-7de0-4cc9-9f17-a9624bcbc776 e6347236-1d22-40eb-b30d-e417495912b8
s
...this one
p
The problem is that OpenAI charges me for GPT-4 (and once for GPT-3.5) but the agent had GPT-3.5 chosen the whole time
@Sahil did you find the issue?
Today another call to the OpenAI API (GPT-4) was issued using my API keys that I created for Vapi. Why? Sometimes Vapi uses GPT-4, sometimes GPT-3.5, and sometimes doesn't call any model (I don't get charged by the OpenAI API). The chosen model is GPT-3.5 Turbo and I haven't touched this setting for a few days. Do you at least know what's going on there?
s
Hey Piechos, I will provide you the answer by tonight. If anything went wrong with the pricing we will refund you. Don’t worry about it.
s
And if this is happening to him, shouldn't it be happening to everyone? 😅
p
I'm not worried about the refund. I just wanna make sure that you solve it so I can start fully using my agent reliably without being afraid of being charged 20x more for the LLM
n
Could you DM me screenshot of OpenAI charging you for GPT4? I don't see GPT4 getting used anywhere in the call logs for the ids you shared
It might be easier to hop on a quick call, DM'ing you
p
Today it happened again. I've been charged for GPT-4 for this call... fa49e72a-5591-44cd-bb6c-2011a5ba16be
So I removed my OpenAI API keys to check how Vapi charges (because OpenAI kept charging me for GPT-4 even though I used GPT-3.5 in my assistant) and I see this. @nikhil could you please explain the calculation behind the LLM cost?
If you are using gpt-3.5-turbo-0125 then it should be (162647 / 1000000 * 0.5) + (1359 / 1000000 * 1.5) = 0.083362
If you are using gpt-3.5-turbo-instruct then it should be (162647 / 1000000 * 1.5) + (1359 / 1000000 * 2) = 0.2466885
But you charged me $0.83. How did you come up with this number? I know these amounts are small and you say that you will reimburse me, but this issue is blocking me from scaling up and from using it in production. It's been 10 days since I reported it and nothing has changed. Please help.
Call id - 009b27bf-dd28-45bb-a80a-f0e1ab31e290
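For reference, the same arithmetic as code (a minimal sketch; the per-million-token prices are the public OpenAI list prices quoted in this thread, so verify them against the current pricing page):

```typescript
// Per-million-token prices quoted in the thread (USD); check OpenAI's
// current pricing page before relying on them.
const PRICES: Record<string, { inputPerM: number; outputPerM: number }> = {
  "gpt-3.5-turbo-0125": { inputPerM: 0.5, outputPerM: 1.5 },
  "gpt-3.5-turbo-instruct": { inputPerM: 1.5, outputPerM: 2.0 },
};

// Cost = promptTokens/1M * inputPrice + completionTokens/1M * outputPrice
function llmCost(model: string, promptTokens: number, completionTokens: number): number {
  const p = PRICES[model];
  if (!p) throw new Error(`unknown model: ${model}`);
  return (promptTokens / 1_000_000) * p.inputPerM + (completionTokens / 1_000_000) * p.outputPerM;
}

// The call in question: 162,647 prompt tokens and 1,359 completion tokens
console.log(llmCost("gpt-3.5-turbo-0125", 162_647, 1_359));     // ≈ 0.0834
console.log(llmCost("gpt-3.5-turbo-instruct", 162_647, 1_359)); // ≈ 0.2467
```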
n
Taking another look. Gimme 10
(Still Checking)
Found the "bug":
```
🔵 08:27:51:795 Getting Sequence For `gpt-3.5-turbo` (26774 tokens, functions: true, hipaa: false)...
🔵 08:27:51:798 Selected Models: gpt-4-0125-preview, gpt-4-turbo, gpt-4-turbo-preview, gpt-4-turbo-2024-04-09, gpt-4-0125-eastus1, gpt-4-1106-preview...
```
when your token length exceeds the context window of your requested model, we automagically find a model that works. this is why, with your key, you were seeing gpt-4 requests too. we could give up and end the call but not sure if that's any better. for now, i'll just leave this as is. on your end, try either less context or use haiku
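A minimal sketch of what that fallback behavior amounts to (the model list and context-window sizes here are illustrative assumptions, not Vapi's actual implementation):

```typescript
// Illustrative only: pick the first model whose context window fits the prompt,
// starting from the requested model and falling back to larger ones.
// Window sizes and fallback order are assumptions, not Vapi internals.
const CONTEXT_WINDOWS: Record<string, number> = {
  "gpt-3.5-turbo": 16_385,
  "gpt-4-0125-preview": 128_000,
  "gpt-4-turbo": 128_000,
};

function selectModel(requested: string, promptTokens: number): string | null {
  const candidates = [requested, ...Object.keys(CONTEXT_WINDOWS).filter(m => m !== requested)];
  const fit = candidates.find(m => promptTokens < (CONTEXT_WINDOWS[m] ?? 0));
  return fit ?? null; // null would mean "give up and end the call"
}

// 26,774 prompt tokens overflow gpt-3.5-turbo's 16k window, so a GPT-4-class
// model gets picked instead — which is why GPT-4 showed up on the OpenAI bill.
console.log(selectModel("gpt-3.5-turbo", 26_774)); // "gpt-4-0125-preview"
```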
a
Why isn't it the case that only the last 5 messages get remembered each time?
To make the context smaller
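A minimal sketch of that idea, truncating the conversation to the last few messages before each LLM request (purely illustrative, not how Vapi manages context):

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keep the system prompt plus only the most recent N turns. This bounds the
// prompt size at the cost of forgetting older parts of the conversation.
function truncateHistory(messages: ChatMessage[], keepLast = 5): ChatMessage[] {
  const system = messages.filter(m => m.role === "system");
  const rest = messages.filter(m => m.role !== "system");
  return [...system, ...rest.slice(-keepLast)];
}
```

The trade-off is that the assistant loses anything said before the retained window, which is presumably why it isn't the default.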
p
So for the context, you are using the whole history of the conversation?
If you are switching models and using a different one than the one chosen, you should at least indicate this somehow in the call history. Otherwise, how could I know that the context was exceeded and I should amend my assistant? I don't want to use GPT-4 for sure. It's 20 times more expensive than GPT-3.5 and its cost adds up pretty quickly
I tried with Haiku but the assistant hangs whenever I tell it to call a function. Call ids 95742ff0-6897-475e-8f32-46b95b1cdf61 bd028ccf-07bd-49d6-9a19-6a419beafcbf
@nikhil are you going to do anything about this? I can't use the agent if I can't control the model and the costs. How many last messages do you use for context?
v
Got it. Tagging @User for additional help.
m
In the API can't you choose your own fallback model?
p
Unfortunately, there is no such option. You can't choose a fallback model and you don't even see how many tokens were processed by the main vs. the fallback model
m
There is
`model.fallbackModels` enum[]
**`"fallbackModels": ["gpt-4-0125-preview", "gpt-4-0613"]`**
p
I'm building the assistant via the web UI and there is no such option there
m
Gotcha well there is on the API
p
Thanks, good to know, but I would still need it in the UI. Using the API wouldn't be very convenient in my case
m
and then edit it further in the UI
p
Yea, that could work, but workarounds like this are far from a good user experience with the tool, don't you agree?
But this is a good idea, I will probably do it while waiting for this issue to be addressed
m
Not really, I think it's targeted at developers so most of us use the API. If they add too much to the UI of the web app it will be way too much
p
If it was like that then they wouldn't have created a web app at all
Anyway, thanks for your help. Hopefully @nikhil responds soon
m
no problem boss