New gpt-3.5-turbo-1106 model constantly times out, anyone else?

My call

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-1106",
    messages=messages,
)

Problem details
I try to call it on a folder containing text files to summarize them – it just times out, every time. This issue is not present with the older gpt-3.5-turbo, and I even tried GPT-4 Turbo, which also worked fine.

Am I doing something wrong?


Hi and welcome to the Developer Forum!

Where are you specifying the text files to read?

This is the file where I call the function

import os

for filename in os.listdir('positions'):
    with open(os.path.join('positions', filename)) as f:
        text = f.read()

    cleaned_text = clean_up_positions(text)

    with open(os.path.join('cleaned_positions', filename), 'w') as f:
        f.write(cleaned_text)

This is my full function for AI call:

def clean_up_positions(jad):
    # Creating the messages as required by the API
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Extract all the relevant info from the following text related "
                                    "to a job ad that should be present there. Follow the structure of the job ad."
                                    f" Here is the text: {jad}."},
    ]

    # Calling the ChatCompletion API
    print("API has been called")
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo-1106",
        messages=messages,
    )
    return response["choices"][0]["message"]["content"]

I’m experiencing similar issues with gpt-3.5-turbo-1106. It gets “stuck” after 4-6 calls.

However, the old gpt-3.5-turbo is working fine. So I don’t think it’s a matter of going over the usage limits.

Aren't those models basically the same, actually? I will just work with the old model, since the pricing seems to be the same, at least as far as I can tell.

How big is the text inside {jad}? Is it less than the context limit?

It must be, as it works with the classic gpt-3.5-turbo. On average the file size was about 9k characters when I ran a script to count it.
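A quick way to sanity-check that character count against a context window is a rough character-to-token estimate. This is only a sketch: ~4 characters per token is a rule of thumb for English text, and OpenAI's tiktoken library gives exact counts if you need them.

```python
def estimate_tokens(text: str) -> int:
    # Rough rule of thumb for English text: ~4 characters per token.
    # For an exact count, use OpenAI's tiktoken library instead.
    return max(1, len(text) // 4)

ad_text = "x" * 9000  # roughly the average file size mentioned above
print(estimate_tokens(ad_text))  # 2250 – compare against the model's context window
```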

I’m running into the same issue as well. gpt-3.5-turbo-1106 and gpt-4-1106-preview seem to time out randomly on some API calls. I used gpt-3.5-turbo-16k and had no problem.

Using the non-16k model would suggest that your instructions prior to the {jad} placeholder are getting lost, along with some of the text at the start of the {jad} variable. You need to ensure that your data fits within the model’s context limits.

Consider this: GPT-3.5-turbo has the same context size as GPT-3.5-turbo-1106. So if the first one works as it should and the second one does not, while they both have the same context window, that can hardly be the cause of the issue, correct?

It’s totally possible that your issues are related to the current load on the new models as they get tested by everyone; it could also be usage limits, or a number of other causes.

One thing you could try is enabling streaming and then cancelling the current API call with a .close() if more than, let’s say, 10 seconds have passed. That way you could build a retry system that fails fast. Keep track of your retries so you can send them to support for billing purposes.
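The retry idea above can be sketched like this. It is a minimal sketch: `make_stream` stands in for the real `openai.ChatCompletion.create(..., stream=True)` call, and the chunks are assumed to already be plain text pieces.

```python
import time

def stream_with_deadline(make_stream, first_chunk_timeout=10.0, max_retries=3):
    """Retry a streaming call, abandoning an attempt whose first chunk is too
    slow. Note: the check only fires once a chunk actually arrives, so a
    stream that blocks forever still needs a client-side network timeout."""
    for attempt in range(1, max_retries + 1):
        start = time.monotonic()
        stream = make_stream()  # e.g. openai.ChatCompletion.create(..., stream=True)
        pieces = []
        try:
            for piece in stream:
                # With the real API each chunk is a dict; you would extract
                # chunk["choices"][0]["delta"].get("content", "") here.
                if not pieces and time.monotonic() - start > first_chunk_timeout:
                    raise TimeoutError("first chunk arrived too late")
                pieces.append(piece)
            return "".join(pieces), attempt  # keep the attempt count for support
        except TimeoutError:
            if hasattr(stream, "close"):
                stream.close()  # cancel the in-flight call before retrying
    raise RuntimeError(f"gave up after {max_retries} attempts")
```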

Good points. I am currently OK with the older model. I imagine it has something to do with the current load, as others have also mentioned this problem. Rate limits are no issue, as I am a tier-3 user.

I like your backup solution, I considered the same thing.

Thanks for all the effort assisting me, I appreciate it. This was my first time here and I am leaving excited.


gpt-3.5-turbo-1106 keeps hanging for us randomly. It’s rather annoying :sweat_smile: @Foxabilo do you know if there are efforts to increase capacity for the new models?

I should say that the new gpt-4 works without any issues. Probably because in general people still prefer cheaper models and leave gpt-4 unused?

I had the same experience just now, putting my first playground prompt of the day into the -1106 model: no response. I had to press the cancel button and press again when it turned back into submit.

The python library will retry once. I’ve still gotten other models that never talk back either, so it is not exclusive to the newly-released AI model.

The API reference advertises a new finish_reason to be wary of: “content_filter” (which originated in Azure). Responses might thus, now or in the future, depend on another moderation model call or two succeeding before you can get any output.
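If hangs or empty outputs do come from moderation, checking finish_reason defensively costs little. A sketch: the response shape follows the Chat Completions API, but the sample response below is made up for illustration.

```python
def check_finish(response: dict) -> str:
    """Inspect finish_reason before trusting the output."""
    choice = response["choices"][0]
    reason = choice.get("finish_reason")
    if reason == "content_filter":
        raise ValueError("output withheld by the content filter")
    if reason == "length":
        print("warning: output truncated at max_tokens")
    return choice["message"]["content"]

# Hypothetical response, for illustration only:
resp = {"choices": [{"finish_reason": "stop",
                     "message": {"role": "assistant", "content": "done"}}]}
print(check_finish(resp))  # prints: done
```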

Same issue here: the new models are not usable, it takes forever to process a batch. I have to revert to the legacy 16k models and pay the additional legacy costs just to use the API.


I am seeing the same thing with my gpt-3.5-turbo-1106 endpoints, where I observe intermittent timeouts. About 1 in 10 calls times out for me, but when I switch to gpt-3.5-turbo-0613 I haven’t observed the same issue. There is no difference in the prompts or inputs I am using with each.

Is OpenAI actively looking into this? We have had better results with the latest model, so we don’t want to switch back, but our team is deciding whether we should, at least in the short term, for more stability.


The timeouts are due to system load, which is why new ChatGPT Plus memberships have been paused. This should resolve itself with additional hardware and software streamlining. Swapping models will not solve the overload issue, so that is a choice you need to make without being 100% sure it will resolve the current problem. I would expect this to get significantly better over the next few weeks.


I have the exact same problem with this model. Please let us know if this gets solved; in the meantime I will use the old model. Thanks.

I believe most cases of timeouts with 3.5-1106 are caused by sub-optimal instructions and parameters. I had that problem initially too, but after fixing it I get stable results. Here’s an example from a test a few minutes ago:

LM params: {'temperature': 0, 'response_format': {'type': 'json_object'}, 'timeout': 5, 'model': 'gpt-3.5-turbo-1106'}
Generated successfully: 100.00% (110/110)
Valid responses: 90.00% (99/110)
Average generation time: 2.30

Valid responses – here I’m validating the JSON against a pydantic class.
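That validation step can be sketched like this. The JobAd fields below are hypothetical, since the actual schema is not shown; the point is that a response only counts as valid if it is parseable JSON that also satisfies the pydantic model.

```python
import json
from typing import Optional
from pydantic import BaseModel, ValidationError

class JobAd(BaseModel):
    # Hypothetical schema for illustration.
    title: str
    location: str
    salary: Optional[str] = None

def is_valid_response(raw: str) -> bool:
    """Return True only if the model's output is parseable JSON
    that matches the schema."""
    try:
        JobAd(**json.loads(raw))
        return True
    except (json.JSONDecodeError, ValidationError):
        return False

print(is_valid_response('{"title": "Engineer", "location": "Remote"}'))  # True
print(is_valid_response('not json at all'))                              # False
```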