Pre-requisites
Environment Information
│ List of packages in environment:
│
│  Name     Version  Build         Channel
│  openai   1.58.1   pyhd8ed1ab_0  conda-forge
│  prefect  3.0.0    pypi_0        pypi
Model: gpt-4o-2024-11-20
Image shape: (1280, 720, 3)
Issue Description
I am running a Prefect pipeline in which one of the tasks asks a question via the asynchronous OpenAI client (AsyncOpenAI). The _create_client method is invoked at the start of the program, and the resulting async client is reused to handle all requests. Although my setup is more complex, the core issue can be summarized as follows:
Problem:
When the request contains a large base64-encoded image, the program hangs. However, if I wait about 10 minutes, it eventually returns the correct response. Notably, the issue does not occur in any of the following situations:
Observations & Workarounds
1. When running the same request in a Jupyter notebook.
2. When using the standard synchronous OpenAI client instead of the asynchronous one.
3. When the image is removed from the request.
4. When shortening the base64 string manually.
5. When using the gpt-4o-mini-2024-07-18 model instead of gpt-4o-2024-11-20.
6. When uncommenting the _refresh_client() call (shown below), which recreates the client instance immediately before making the request.
Investigations & Dismissed Theories
1. Event loop mismatch: Initially suspected, but dismissed, since removing the image from the request fixes the issue (a quick loop-identity check is sketched after this list).
2. Incomplete child event loop handling: Also dismissed, for the same reason as above.
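For completeness, one way to double-check the event-loop theories is to log the identity of the running loop wherever the client is created and used. This is a hypothetical diagnostic helper, not part of my pipeline:

import asyncio
import logging

def log_current_loop(logger: logging.Logger, where: str) -> None:
    # Log the id of the currently running event loop, or note the absence of one
    # (e.g. when called from synchronous setup code such as _create_client).
    # Identical ids inside the task and at the call site of the request would
    # rule out a loop mismatch.
    try:
        loop = asyncio.get_running_loop()
        logger.info(f"{where}: running event loop id={id(loop)}")
    except RuntimeError:
        logger.info(f"{where}: no running event loop")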
Code Snippets
Task Code:
class AIStep:
    @task(task_run_name="{self.step_name}")
    async def task(self, question, **kwargs):
        content = []
        # base64_image is produced elsewhere in the pipeline (a PNG encoded to base64)
        content.append({'type': 'image_url', 'image_url': {'url': f"data:image/png;base64,{base64_image}"}})
        content.append({"type": "text", "text": question})
        reply = await self.llm_provider._prompt([{'role': 'user', 'content': content}])
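For scale, the image is a 1280x720x3 array, so the base64 payload can plausibly run from a few hundred kilobytes up to a few megabytes, depending on how well the PNG compresses. A rough reproduction of how such a payload is built (sketch only; it assumes Pillow for the PNG encoding, and the helper name is illustrative):

import base64
import io

import numpy as np
from PIL import Image

def encode_image_to_base64(image_array: np.ndarray) -> str:
    # Encode an HxWx3 uint8 array as PNG and return the base64 string.
    buffer = io.BytesIO()
    Image.fromarray(image_array).save(buffer, format="PNG")
    return base64.b64encode(buffer.getvalue()).decode("ascii")

# Random noise barely compresses, so this approximates a worst-case payload:
# roughly 1280 * 720 * 3 bytes of PNG data, inflated by ~4/3 by base64.
image = np.random.randint(0, 256, size=(1280, 720, 3), dtype=np.uint8)
base64_image = encode_image_to_base64(image)
print(f"base64 payload: {len(base64_image) / 1e6:.1f} MB")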
Prompter Code:
class Prompter:
    async def _prompt(self, input_text, *, tools=None, **kwargs):
        # self._refresh_client()  # uncommenting this avoids the hang (observation 6)
        response = await self.client.chat.completions.create(
            model=self.model, messages=input_text, temperature=0.0, tools=tools
        )
        if len(response.choices) > 0:
            if len(response.choices) > 1:
                self.logger.warning(f"Too many responses {response.choices}")
            return response
        else:
            raise Exception(f"Call to OpenAI didn't yield any choices: {response} for {input_text[0:200]}")
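As a stopgap, I could at least bound the call instead of letting it hang; as far as I understand, the client supports a per-request timeout override via with_options. A hypothetical variant of _prompt (name and timeout value are illustrative, untested in my pipeline):

    async def _prompt_bounded(self, input_text, *, tools=None, timeout_s=120.0, **kwargs):
        # Same request as _prompt, but with_options() applies a per-request
        # timeout so a stalled call raises instead of blocking for ~10 minutes.
        return await self.client.with_options(timeout=timeout_s).chat.completions.create(
            model=self.model, messages=input_text, temperature=0.0, tools=tools
        )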
Client Creation & Refresh Code:
def _create_client(self):
    from openai import AsyncOpenAI
    return AsyncOpenAI(api_key=self.cfg["token"])

def _refresh_client(self):
    self.client = self._create_client()
    self.logger.info(f"Client refreshed for: {self}")
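The client-level equivalent would be to pass an explicit timeout (and retry limit) when the client is constructed; the AsyncOpenAI constructor accepts these directly, and httpx.Timeout allows per-phase limits. Sketch only, with illustrative values:

def _create_client(self):
    import httpx
    from openai import AsyncOpenAI
    # Explicit limits so a stalled request fails fast instead of hanging;
    # the values below are illustrative, not tuned.
    return AsyncOpenAI(
        api_key=self.cfg["token"],
        timeout=httpx.Timeout(120.0, connect=10.0),
        max_retries=2,
    )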
Questions
1. Why does the response take 10 minutes to arrive?
Is there any known issue related to large base64-encoded image payloads or timeouts in the OpenAI API?
2. Am I being charged if I cancel the request before the 10-minute response?
Does the API meter usage based on received requests or only on completed responses?
Any insight on this behavior or guidance on how to better handle such cases would be appreciated!