Any Suggestions to Reduce Cost and Limit Message Length of the GPT-4 Turbo Model in the Assistants API?

Hello everyone.

I am working on a project in which I try to answer our students' questions using the Assistants API. I provide the information in a JSON file via the retrieval tool.
When I use the GPT-3.5 model for the assistant, the chatbot does not give very consistent answers to the questions. But when I use the GPT-4 Turbo model, I get decent, consistent answers.

However, under the current pricing policy, the GPT-4 Turbo model costs about 20 times more than GPT-3.5. For this reason, I am trying to reduce the number of tokens the assistant uses with the GPT-4 Turbo model. Unfortunately, the assistant ignores commands such as “do not write answers longer than 100 words” in the instructions section.
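Since the instruction alone is not reliably followed, one workaround is to enforce the limit client-side after the reply comes back: clip the assistant's answer to the word budget before showing it to students. This does not reduce the tokens billed for generation, but it does keep the displayed answers short. A minimal sketch (the function name and the 100-word budget are my own, mirroring the instruction above):

```python
def truncate_to_words(text: str, max_words: int = 100) -> str:
    """Clip an assistant reply to at most max_words words.

    Appends an ellipsis when something was cut, so students can
    tell the answer was shortened.
    """
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " …"


# Example: a 5-word budget on a 7-word reply.
reply = "The enrollment deadline is Friday at noon"
print(truncate_to_words(reply, max_words=5))
```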

From my research, I have read that the Assistants API has no message-length restriction feature.
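For comparison, the plain Chat Completions endpoint does let you cap output length with its `max_tokens` parameter, so if retrieval can be handled another way, routing student questions through it is one option. A sketch of how the request could be assembled, assuming the official `openai` Python SDK (the helper name, model name, and token budget here are illustrative, not part of any API):

```python
# Hypothetical helper: build the Chat Completions request arguments
# once, so the same token cap is applied to every student question.
def build_request(question: str, max_tokens: int = 150) -> dict:
    return {
        "model": "gpt-4-turbo-preview",
        "messages": [
            {"role": "system",
             "content": "Answer student questions in at most 100 words."},
            {"role": "user", "content": question},
        ],
        # Hard cap on generated tokens; ~100 words is very roughly
        # 130-150 tokens in English.
        "max_tokens": max_tokens,
    }


# Usage (requires OPENAI_API_KEY in the environment):
#   from openai import OpenAI
#   client = OpenAI()
#   response = client.chat.completions.create(**build_request("When is the enrollment deadline?"))
#   print(response.choices[0].message.content)
```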

Do you have any suggestions for making the GPT-4 Turbo model cost less? Or, if an OpenAI official or the CEO is reading this message, could prices be reduced?


Try instructing it so that its answers are consistent.

Let me know if this does not work.