Is the /v1/chat/completions endpoint a legacy endpoint?

Hi OpenAI Community,

I am currently working on a project integrating GPT-3.5 models into my application. However, I have run into an issue where only gpt-3.5-turbo-instruct seems to work with the /v1/completions endpoint. When I try other GPT-3.5 and GPT-4 models, I receive the following error:
Error code: 404 - {'error': {'message': 'This is a chat model and not supported in the v1/completions endpoint. Did you mean to use v1/chat/completions?', 'type': 'invalid_request_error', 'param': 'model', 'code': None}}

Does this mean that the /v1/completions endpoint is deprecated for the other GPT-3.5 and GPT-4 models? Or am I missing something in my setup?

For context, I have already added $5 to my OpenAI account, so I believe I should have access to the models. I would appreciate any guidance on whether this issue is due to endpoint deprecation or if there's a different recommended approach to using other GPT-3.5 and GPT-4 models with the /v1/chat/completions endpoint.


gpt-3.5-turbo-instruct should only work with v1/completions; the other GPT-3.5 models and all GPT-4 models use v1/chat/completions.
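To make that mapping concrete, here is a minimal sketch of routing a model name to the endpoint it supports. The helper name is my own, and the model set covers only the models mentioned in this thread:

```python
# Models from this thread that use the legacy completions endpoint.
COMPLETIONS_MODELS = {"gpt-3.5-turbo-instruct", "babbage-002", "davinci-002"}

def endpoint_for(model: str) -> str:
    """Return the API path a given model should be called on."""
    if model in COMPLETIONS_MODELS:
        return "/v1/completions"
    # All non-instruct GPT-3.5 models and all GPT-4 models are chat models.
    return "/v1/chat/completions"
```

Calling a chat model on the wrong path is exactly what produces the 404 above.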

I am assuming you are correctly calling the v1/completions endpoint with gpt-3.5-turbo-instruct but neglected to change the endpoint to v1/chat/completions when you switched models: the error you are getting is exactly what you would expect from calling v1/completions with a non-instruct GPT-3.5 model or a GPT-4 model.

The solution is to change the endpoint to v1/chat/completions and update your API call structure to use the chat format, and you'll be back in business.
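As a sketch of what that change looks like at the request level (request bodies only; authentication and the HTTP call itself are omitted, and the prompt text is a placeholder), the two endpoints take differently shaped payloads:

```python
# The legacy completions endpoint takes a flat `prompt` string:
completions_payload = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say hello.",
    "max_tokens": 50,
}

# The chat endpoint takes a `messages` list of role-tagged turns instead:
chat_payload = {
    "model": "gpt-3.5-turbo",
    "messages": [
        {"role": "user", "content": "Say hello."},
    ],
    "max_tokens": 50,
}
```

Sending a `prompt`-style body to a chat model (or vice versa) is the structural change you need to make alongside switching the URL.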


The chat models were deployed in March 2023 to be used only on their own endpoint.

They are trained to use the messages format of that chat endpoint and to assign priority and trust based on the role of the message sender.
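For illustration, here is a typical messages list showing the three standard roles (the content strings are my own placeholders):

```python
# The chat format tags each message with a role; chat models are trained
# to treat `system` instructions as higher-priority than `user` input.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Which endpoint do chat models use?"},
    {"role": "assistant", "content": "They use /v1/chat/completions."},
]

roles = [m["role"] for m in messages]
```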

The completions endpoint has indeed had its models deprioritized and rate-limited: gpt-3.5-turbo-instruct, for example, is now reduced to 90k TPM versus 2,000k TPM for chat's gpt-3.5-turbo at tier 5. That prevents wide deployment to a user base.

babbage-002 and davinci-002 are base completion models that can be fine-tuned, but they are obviously 1/10th of the original davinci GPT-3 model.

They can do what you want in unique ways, even revealing what powers the AI's output. The endpoint is perhaps disliked because you can have the AI produce text in any context you want. davinci-002, multi-shot: