Thank you for getting back to me! I’m wondering about the versioning of the GPT-4 model. Does it have multiple versions like 1.1.X, and are there updates that occur without notification?
For instance, GPT-4 on the OpenAI chat website was initially released in March 2023, but it seems to be updated regularly.
According to their release notes, there was an update on Aug 3.
Based on developer feedback, we are extending support for gpt-3.5-turbo-0301 and gpt-4-0314 models in the OpenAI API until at least June 13, 2024. We’ve updated our June 13 blog post with more details.
With the release of gpt-3.5-turbo, some of our models are now being continually updated. gpt-3.5-turbo, gpt-4, and gpt-4-32k point to the latest model version. You can verify this by looking at the response object after sending a ChatCompletion request. The response will include the specific model version used (e.g. gpt-3.5-turbo-0613).
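For example, here is a minimal sketch (using the pre-1.0 `openai` Python SDK) of checking which snapshot actually served a request; the `model` field of the response carries the resolved version:

```python
import openai

openai.api_key = "sk-..."  # your API key

# Request the floating alias...
response = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)

# ...and inspect which dated snapshot actually handled it,
# e.g. "gpt-4-0613".
print(response["model"])
```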
We also offer static model versions that developers can continue using for at least three months after an updated model has been introduced. With the new cadence of model updates, we are also giving people the ability to contribute evals to help us improve the model for different use cases. If you are interested, check out the OpenAI Evals repository.
The following models are temporary snapshots; we will announce their deprecation dates once updated versions are available. If you want to use the latest model version, use the standard model names like gpt-4 or gpt-3.5-turbo.
OpenAI also has an unwritten (and unwelcome) policy of continuing to update the current model and its tuning behavior beyond the published date, as a reactive response to undesired demonstrations.
This thread got a bit weird, but the answer to your question is that if you specify “gpt-4”, you’ll call whatever is currently considered the latest version of gpt-4, which is “gpt-4-0613”. You can call the previous model by using its four-digit month/day release date (gpt-4-0314).
If you have deployed production code set to call the latest alias, be aware that it will roll over to each new model release as it occurs. If you pin a specific snapshot instead, your code may stop working on the official deprecation date of that model.
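To make that trade-off explicit, here is a hedged sketch (pre-1.0 `openai` SDK; the pinned constant and the fallback choice are illustrative assumptions, not official guidance):

```python
import openai

openai.api_key = "sk-..."  # your API key

# Pinning a dated snapshot gives stable behavior until its
# deprecation date; the floating alias rolls over silently.
PINNED_MODEL = "gpt-4-0314"   # frozen snapshot (assumed still served)
LATEST_ALIAS = "gpt-4"        # resolves to the newest snapshot

def chat(prompt: str, model: str = PINNED_MODEL) -> str:
    try:
        response = openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
    except openai.error.InvalidRequestError:
        # The pinned snapshot was retired; fall back to the alias
        # (and log it, since model behavior may have changed).
        response = openai.ChatCompletion.create(
            model=LATEST_ALIAS,
            messages=[{"role": "user", "content": prompt}],
        )
    return response["choices"][0]["message"]["content"]
```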
I’ve been trying to access the gpt-4-32k endpoint for a few days now and still get the “model does not exist” response. I’ve called the OpenAI API for a list of available models and it’s not in there.
It’s frustrating to build a dev pipeline around that context window because the OpenAI documentation presents it as an option, only to have to go back and rework everything because it isn’t actually available.
Is there a specific flag or something I’m missing, or are others having the same issue?
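For anyone checking the same thing, here is a minimal sketch with the pre-1.0 `openai` SDK that lists the models your key can actually see (the gpt-4-32k filter is just for this example):

```python
import openai

openai.api_key = "sk-..."  # your API key

# List every model this API key has access to and check
# whether any 32k variant is among them.
models = [m["id"] for m in openai.Model.list()["data"]]
print("gpt-4-32k available:", any("gpt-4-32k" in mid for mid in models))
```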
Yeah, I’ve figured that out. Thank you for the reply. That’s a little frustrating, especially considering I’d engineered around that context window after viewing the API docs.
It is what it is; I’ve adjusted.
Does anyone have tips on how to actually gain access to it? What do I need to do? I’m having a hard time even getting my API spending limits increased; I was going to request a rate limit increase, too. I just figured I’d take it one step at a time. It’s crazy how long it takes to get a response that isn’t a boilerplate GPT-created message.
They’ve used like 25,000 A100s and 3100 HGX servers to make this thing work… and we can’t get emails back regarding our issues. I could have built an automated response system that used the API to at least provide a queue, some non-generic response, etc. in a week. LOL
You can verify which version served you by looking at the response object after sending a request. The response will include the specific model version used (e.g. gpt-3.5-turbo-0613).
Also, when you deploy, you can create a custom deployment name and refer to that deployment as the model in API calls.
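Assuming this refers to Azure OpenAI deployments, here is a sketch with the pre-1.0 `openai` SDK (the resource endpoint, API version, and deployment name below are placeholders):

```python
import openai

# Azure OpenAI routes requests by deployment name, not model name.
openai.api_type = "azure"
openai.api_base = "https://my-resource.openai.azure.com"  # placeholder
openai.api_version = "2023-07-01-preview"                 # placeholder
openai.api_key = "..."

response = openai.ChatCompletion.create(
    engine="my-gpt4-deployment",  # the custom deployment name
    messages=[{"role": "user", "content": "Which model am I?"}],
)
print(response["model"])  # still reports the underlying model version
```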
Finally, you can distinguish models by max_tokens. Set max_tokens above 4,096 and the request will be rejected on gpt-4-turbo models, which cap completion output at 4,096 tokens (or set it very large but below your rate limit to be rejected everywhere). The error message gives more details.
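A sketch of that probe (pre-1.0 `openai` SDK; treat the 4,096-token output cap as an assumption that held for gpt-4-turbo models at the time):

```python
import openai

openai.api_key = "sk-..."  # your API key

def probe_output_cap(model: str) -> None:
    """Ask for more completion tokens than turbo models allow and
    read the error to learn which limit the endpoint enforces."""
    try:
        openai.ChatCompletion.create(
            model=model,
            messages=[{"role": "user", "content": "hi"}],
            max_tokens=5000,  # above the assumed 4,096 turbo output cap
        )
        print(model, "accepted max_tokens=5000")
    except openai.error.InvalidRequestError as err:
        # The error message states the enforced limit.
        print(model, "rejected:", err)

probe_output_cap("gpt-4-1106-preview")
```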