Few-shot chat.completions takes too much time

I’m trying to do a few-shot classification task. The prompt is about 1,400 tokens, and I want to classify a total of 300 input rows.
However, it seems to take approximately 8 hours, and I don’t know why it takes so long.
For example: 4 rows finish within 10 seconds, then after a 10-minute stall another 5 rows finish within 10 seconds.
The delays are intermittent.

input data: labelled dataset (.csv)
output data: .txt (one file per row)

Python code below:

```python
import json
import openai

actual_labels = []
predicted_labels = []

# `input` is the labelled dataset loaded from the .csv (a pandas DataFrame)
for index, row in input.iterrows():
    text = row['text_column']
    label = row['label_column']
    actual_labels.append(label)

    # 4way-1shot #
    response = openai.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        temperature=0,
        messages=[
            {"role": "system", "content": " ~~ classification task ~~"},
            {"role": "user", "content": "~~~~"},
            {"role": "assistant", "content": "class 1"},
            {"role": "user", "content": " ~~~~ "},
            {"role": "assistant", "content": "class 2"},
            {"role": "user", "content": " ~~~~ "},
            {"role": "assistant", "content": "class 3"},
            {"role": "user", "content": " ~~~~ "},
            {"role": "assistant", "content": "class 4"},
            {"role": "user", "content": text},
        ],
    )
    predicted_label = response.choices[0].message.content.strip()
    predicted_labels.append(predicted_label)

    # save the raw API response for this row
    output_filename = f'{label}_{index+1}.txt'
    with open(output_filename, 'w', encoding='utf-8-sig') as outfile:
        response_text = json.dumps(response, indent=4, default=lambda x: x.__dict__)
        outfile.write(response_text)
```

gpt-3.5-turbo-1106 has a big problem with timeouts, where it hangs without giving a response. The non-versioned model (gpt-3.5-turbo) will be more reliable.

You’ll need to implement error handling and retries for timeouts (the Python library itself does one retry, hiding the error from you). Otherwise, besides the program crashing mid-job, you could end up with missing data.
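A minimal retry sketch with exponential backoff (the generic `with_retries` helper is my own illustration; the OpenAI exception names in the usage comment are from the v1 Python library):

```python
import time

def with_retries(fn, retry_on, max_attempts=4, base_delay=1.0):
    """Call fn(), retrying on the given exception types with
    exponential backoff: base_delay, 2x, 4x, ..."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error instead of silently losing data
            time.sleep(base_delay * 2 ** attempt)

# Usage with the OpenAI client (sketch):
# result = with_retries(
#     lambda: client.chat.completions.create(model="gpt-3.5-turbo",
#                                            temperature=0, messages=messages),
#     retry_on=(openai.APITimeoutError, openai.RateLimitError),
# )
```

Re-raising after the last attempt lets you log the failing row and continue, rather than discovering a gap in the output files hours later.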

Then cap that waiting time when you create your client object (set it to a bit more than the longest time a single call needs):
`client = OpenAI(timeout=30)`

Ideally you’d use asyncio with queued parallel calls to dispatch the jobs to the API; even at the tier-1 rate limit you can likely do at least 10 per minute.
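A sketch of bounded parallelism with asyncio (the `classify_all` helper and `max_concurrent` value are my own illustration; `AsyncOpenAI` is the async client from the v1 library):

```python
import asyncio

async def classify_all(items, classify_one, max_concurrent=5):
    """Run classify_one(item) for every item, with at most
    max_concurrent requests in flight at once."""
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(item):
        async with sem:
            return await classify_one(item)

    # gather preserves input order in its results
    return await asyncio.gather(*(bounded(i) for i in items))

# Usage with the async client (sketch):
# aclient = AsyncOpenAI(timeout=30)
# async def classify_one(text):
#     resp = await aclient.chat.completions.create(
#         model="gpt-3.5-turbo", temperature=0,
#         messages=few_shot_messages + [{"role": "user", "content": text}])
#     return resp.choices[0].message.content.strip()
```

The semaphore keeps you under your rate limit while still overlapping the network waits, which is where most of the 8 hours is going.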

Thanks for your reply.
I have another question about my problem.
Although the gpt-3.5-turbo-1106 model has a big problem with timeouts,
most users don’t seem to have any timeout problems.
So what I want to ask is:
under what conditions does the timeout problem occur?
Other people using the same API have no issues.