@08anas
Hello!
The maximum number of tokens an assistant can handle effectively depends on the specific model. For GPT-4, the model you're using, the safe limit is the model's context window, which covers both the prompt and the response. The base GPT-4 model has a context window of 8,192 tokens (the gpt-4-32k variant extends this to 32,768).
If your prompt is approximately 3,200 tokens (as measured with the OpenAI tokenizer), you are well within the 8,192-token window, leaving roughly 5,000 tokens for the response. Staying inside the window avoids truncated prompts and cut-off responses; note, however, that it does not by itself guarantee the model won't hallucinate.
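As a quick sanity check, the budget arithmetic can be sketched in a few lines of Python (the 8,192-token window is the base GPT-4 figure; substitute your model's actual limit):

```python
# Context-window budget check: prompt tokens plus the response budget
# (the max_tokens parameter) must fit inside the model's window.
GPT4_CONTEXT_WINDOW = 8192  # base GPT-4; gpt-4-32k allows 32768

def fits_context(prompt_tokens: int, max_response_tokens: int,
                 window: int = GPT4_CONTEXT_WINDOW) -> bool:
    """Return True if the request fits in the model's context window."""
    return prompt_tokens + max_response_tokens <= window

print(fits_context(3200, 4000))  # True: 7200 <= 8192
print(fits_context(3200, 6000))  # False: 9200 > 8192
```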
However, response quality also depends on the complexity of the conversation and the nature of the tasks being performed. It's good practice to monitor performance and adjust the token limit if you notice any decline in response quality.
If you need to set or check the maximum token limit for your assistants, please provide the desired maximum token count, and I can assist you with that.
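If you're calling the API directly, the response budget is set with the `max_tokens` parameter on the request. A minimal sketch of a Chat Completions request body (the field names follow the OpenAI API; the message content and the 800-token value are illustrative):

```python
# Illustrative Chat Completions request body. max_tokens caps only the
# response length; the prompt tokens count against the window separately.
payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "user", "content": "Summarize our conversation so far."},
    ],
    "max_tokens": 800,  # response budget; keep prompt + 800 <= 8192
}
print(payload["max_tokens"])  # 800
```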