GPT3.5 with chain of thought vs. GPT4

Which would you expect to perform better on tasks that require reasoning: GPT3.5 with chain of thought, or GPT4?
I have a limited time budget until the user needs a response, so I have to choose between the two.

I would recommend testing the results in the playground to see which one you like better for your use case.



Can you tell us a bit more about your use case?
What type of conversations are you expecting?

For example, sentiment analysis can be done quite well with 3.5, while argumentation is more the domain of the newer model.

My use case involves code creation in a domain-specific coding language.

Time-critical code creation sounds interesting.
But unless the language is part of the training set and the questions are relatively easy, the newer model will likely be better.

But there is a catch.
With these lesser-known languages, the model often produces the same type of error over and over again. This is usually handled via prompting once you have identified the error cases. Whether the replies will then still be fast enough for you needs to be tested.
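
To make that concrete, here is a rough sketch of how one might fold known error cases into the prompt and then check whether the response time still fits the budget. Everything in it is an assumption for illustration: `build_prompt`, `median_latency`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever function you use to call your model.

```python
import time
from statistics import median
from typing import Callable, Sequence


def build_prompt(task: str, known_errors: Sequence[str]) -> str:
    """Prepend previously observed mistakes to the task (hypothetical helper)."""
    if not known_errors:
        return task
    rules = "\n".join(f"- {rule}" for rule in known_errors)
    return f"Avoid these known mistakes:\n{rules}\n\nTask: {task}"


def median_latency(generate: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Return the median wall-clock seconds over `runs` calls to `generate`.

    `generate` is an assumption: any function that sends the prompt to a
    model and returns its reply as a string.
    """
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        generate(prompt)
        timings.append(time.perf_counter() - start)
    return median(timings)
```

You would call `median_latency` once per candidate model with the same prompt and compare the numbers against your time budget. Keep in mind that every rule you add makes the prompt longer, so it is worth re-measuring after each change.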