@08anas
Hello!
The maximum number of tokens an assistant can handle effectively depends on the specific model. For GPT-4, the model you're using, the safe limit is the model's context window, which covers both the prompt and the response. The base GPT-4 model has a context window of 8,192 tokens (the gpt-4-32k variant extends this to 32,768).
If your prompt is approximately 3,200 tokens (as measured with the OpenAI tokenizer), you are well within the 8,192-token window, leaving roughly 5,000 tokens for the response. Staying inside the window avoids truncated prompts and cut-off responses; note, however, that it does not by itself guarantee the model won't hallucinate.
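As a quick sanity check, the budget arithmetic can be sketched in a few lines of Python (the 8,192-token window is the base GPT-4 figure; substitute your model's actual limit):

```python
# Context-window budget check: prompt tokens plus the response budget
# (the max_tokens parameter) must fit inside the model's window.
GPT4_CONTEXT_WINDOW = 8192  # base GPT-4; gpt-4-32k allows 32768

def fits_context(prompt_tokens: int, max_response_tokens: int,
                 window: int = GPT4_CONTEXT_WINDOW) -> bool:
    """Return True if the request fits in the model's context window."""
    return prompt_tokens + max_response_tokens <= window

print(fits_context(3200, 4000))  # True: 7200 <= 8192
print(fits_context(3200, 6000))  # False: 9200 > 8192
```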
However, response quality also depends on the complexity of the conversation and the nature of the tasks being performed. It's good practice to monitor performance and adjust the token limit if you notice any decline in response quality.
If you need to set or check the maximum token limit for your assistants, please provide the desired maximum token count, and I can assist you with that.
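If you're calling the API directly, the response budget is set with the `max_tokens` parameter on the request. A minimal sketch of a Chat Completions request body (the field names follow the OpenAI API; the message content and the 800-token value are illustrative):

```python
# Illustrative Chat Completions request body. max_tokens caps only the
# response length; the prompt tokens count against the window separately.
payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "user", "content": "Summarize our conversation so far."},
    ],
    "max_tokens": 800,  # response budget; keep prompt + 800 <= 8192
}
print(payload["max_tokens"])  # 800
```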