The stable model is not switched yet. As I replied four days ago:
gpt-3.5-turbo as an alias has never stopped pointing to gpt-3.5-turbo-0613 as the stable model. According to the blog post, the schedule for re-pointing it to -0125 is "two weeks after release."

The API response tells you which model produced the reply, and the error message tells you which context length you exceeded:

```
openai.BadRequestError: Error code: 400 - {'error': {'message': "This model's maximum context length is 16385 tokens. However, your messages resulted in 18565 tokens ...
```
A call to gpt-3.5-turbo just now:

```
{'id': 'chatcmpl-xxx', 'choices': [{'finish_reason': 'stop', 'index': 0, 'logprobs': None, 'message': {'content': 'Hello! How can I assist you today?', 'role': 'assistant', 'function_call': None, 'tool_calls': None}}], 'created': 1707289999, 'model': 'gpt-3.5-turbo-0613', 'object': 'chat.completion', 'system_fingerprint': None, 'usage': {'completion_tokens': 9, 'prompt_tokens': 8, 'total_tokens': 17}}
```
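If you want to check this programmatically, a minimal sketch: read the `model` field of the chat completion response to see which snapshot the alias actually resolved to. The dict below is trimmed from the response shown above; with the openai Python SDK you would read `response.model` on the returned object instead.

```python
# Trimmed copy of the response dict shown above.
sample_response = {
    "id": "chatcmpl-xxx",
    "model": "gpt-3.5-turbo-0613",
    "object": "chat.completion",
    "usage": {"completion_tokens": 9, "prompt_tokens": 8, "total_tokens": 17},
}

def served_by(response: dict) -> str:
    """Return the snapshot name the API actually routed the alias to."""
    return response["model"]

print(served_by(sample_response))  # prints "gpt-3.5-turbo-0613"
```

Logging this field on every call is an easy way to notice the moment the alias is re-pointed to -0125.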
As for why gpt-3.5 is not as good as gpt-4, which costs 10x as much: I suspect you can answer that on your own. For less common languages, where the AI is less certain, you can significantly lower the API temperature parameter and test the results.
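A minimal sketch of that idea, assuming you pick the temperature per request based on the input language. The `temperature` key is the standard chat-completions sampling parameter; the language list and the specific values are illustrative assumptions, not an official recommendation.

```python
def pick_temperature(language: str) -> float:
    # Hypothetical heuristic: use a lower temperature for languages the
    # model sees less training data for, so sampling stays conservative.
    common = {"english", "spanish", "french", "german", "chinese"}
    return 0.7 if language.lower() in common else 0.2

request_kwargs = {
    "model": "gpt-3.5-turbo",
    "temperature": pick_temperature("icelandic"),
    "messages": [{"role": "user", "content": "Halló!"}],
}
# Pass to the SDK: client.chat.completions.create(**request_kwargs)
```

Test both settings on your actual prompts; the right values depend on the task, not just the language.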