Every time I send a streaming request to gpt-4, it starts fine, streaming chunks as expected, but it always gets interrupted by this ChunkedEncodingError:
requests.exceptions.ChunkedEncodingError: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
I see on GitHub that this has been an open issue for months. I'm curious whether anyone else has seen it and can recommend a workaround. I've tried wrapping the iteration over the response object in a try/except block, but once the stream reaches this state I don't know how to resume streaming completions.
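One pattern I've been experimenting with (sketch only, not a confirmed fix): catch the ChunkedEncodingError, then reopen the stream while passing along the text collected so far, so the model can continue from where the connection dropped. Here `stream_with_resume` and its `create_stream` argument are hypothetical names I made up for illustration; `create_stream(prefix)` is assumed to be a callable you supply that opens a fresh chunk iterator given the already-received text (empty string on the first call):

```python
import requests


def stream_with_resume(create_stream, max_retries=3):
    """Yield content pieces from a streaming completion, retrying on drops.

    `create_stream(prefix)` is an assumed callable: given the text already
    received (empty on the first call), it returns a fresh iterator of chunks
    in the ChatCompletion streaming format. On ChunkedEncodingError, the
    stream is reopened with the collected text so generation can continue.
    """
    collected = ""
    for attempt in range(max_retries + 1):
        try:
            for chunk in create_stream(collected):
                piece = chunk["choices"][0]["delta"].get("content", "")
                collected += piece
                yield piece
            return  # stream finished cleanly
        except requests.exceptions.ChunkedEncodingError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error
```

In practice `create_stream` would wrap `openai.ChatCompletion.create(..., stream=True)`, e.g. by appending `collected` as an assistant message plus a short "please continue" user message before reopening, though I haven't verified the model always continues seamlessly from the prefix.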
Edited to include code, adapted from the OpenAI Cookbook:
# vars declared above...
response = openai.ChatCompletion.create(
    model=model,
    messages=messages,
    temperature=temperature,
    top_p=top_p,
    frequency_penalty=freq_penalty,
    presence_penalty=pres_penalty,
    stream=True,
)
# Response stream
for chunk in response:
    # Mark the time (for reporting only)
    chunk_time = time.time() - start_time
    last_segment_time += time.time() - last_tick
    last_tick = time.time()
    # Collect streamed chunk
    collected_chunks.append(chunk)
    chunk_message = chunk["choices"][0]["delta"]
    collected_messages.append(chunk_message)
    segments.append(chunk_message)
    with open(out_path, "a") as file:
        file.write(chunk_message.get("content", ""))
    # Every 1 second, join the segments and print
    if last_segment_time > 1:
        segment_text = "".join(s.get("content", "") for s in segments)
        tqdm.write(f"Received: {segment_text}")
        segments = []
        last_segment_time = 0
print(f"Full response received {chunk_time:.2f} seconds after request")