Assistant API - Code Interpreter Image File Creation (files.retrieve_content) Bug

davenorris · November 8, 2023, 12:45am

When retrieving a file that an Assistant created with the Code Interpreter tool (via Matplotlib), the retrieve_content request seems to return a string (cast_to=str).

The same file can be downloaded via the Playground, but when using files.retrieve_content through the API to write it to a local file, the file is not created correctly.

To reproduce this error, try asking an assistant to create a graph for you. It will use the Code Interpreter tool and create the file, which is returned in the thread messages. This file_id can then be used to view the file via the Playground. However, if you want to render this file in the application you are building, you cannot.

I hope this is enough detail to correct what I believe is a bug, as I have tried various encodings to no avail.

ric.porteous1989 · November 8, 2023, 1:00am

Having the same error here. Here is some code that I used to recreate this:

assistant = client.beta.assistants.create(
  name="Data visualizer",
  description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"},{"type": "retrieval"}]
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Can you please give me a simple example of a scatter plot. Please save as a png"
)

run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)

messages = client.beta.threads.messages.list(
  thread_id=thread.id
)

messages

SyncCursorPage[ThreadMessage](data=[ThreadMessage(id='...', assistant_id='...', content=[MessageContentImageFile(image_file=ImageFile(file_id='my_file_id')....

Then when retrieving the contents of the file and encoding

content = client.files.retrieve_content('my_file_id')
Image(bytes(content,'utf-8'))

The result turns out to be a corrupted image. Like @davenorris, I tried a whole bunch of methods and encodings.

davenorris · November 8, 2023, 1:56am

Just adding here that asking the assistant to save the plot as a PNG is not needed for this to happen.

davenorris · November 8, 2023, 1:22pm

Okay, so @andreas over on Discord was able to chime in and help us figure this out: Discord

Here is the solution:

# handle image file
api_response = client.files.with_raw_response.retrieve_content(r.image_file.file_id)

if api_response.status_code == 200:
  content = api_response.content
  with open('image.png', 'wb') as f:
    f.write(content)
  print('File downloaded successfully.')

github.com/openai/openai-python

client.files.retrieve_content only returns strings, and not bytes/binary data

opened 02:14AM - 07 Nov 23 UTC

eware-godaddy

When I try to download/retrieve a binary file (e.g. a PNG image) created by an assistant, it gets automatically [cast to a string](https://github.com/openai/openai-python/blob/e0aafc6c1a45334ac889fe3e54957d309c3af93f/src/openai/resources/files.py#L229), so it can't be correctly parsed/displayed. E.g.:

```py
ret_file = client.files.retrieve_content('file-XXX')
ret_file[:10]
# '�PNG\r\n\x1a\n\x00\x00'
```

There doesn't seem to be a clean way in the API to retrieve a file as raw bytes from what I can see. This is important for code interpreter scenarios where the agent returns binary files that need to be rendered, like images.

For others having this issue, you can just request the files directly using `requests` like:

```py
import os
import requests
from io import BytesIO
from PIL import Image  # Pillow, needed for Image.open

file_id = 'file-XXXX'
headers = {
    'Authorization': f"Bearer {os.environ['OPENAI_API_KEY']}"
}
response = requests.get(f'https://api.openai.com/v1/files/{file_id}/content', headers=headers)
Image.open(BytesIO(response.content))
```
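A minimal sketch of why re-encoding the returned string cannot recover the image. It assumes the response body was decoded as UTF-8 text with replacement characters, which is consistent with the `'�PNG...'` output quoted in the issue; the helper name `simulate_text_cast` is made up for illustration and is not part of any library.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def simulate_text_cast(data: bytes) -> bytes:
    """Decode bytes as UTF-8 text (lossy), then encode that text back to bytes."""
    # 0x89 is not a valid UTF-8 start byte, so it becomes U+FFFD (the '�' character).
    as_text = data.decode("utf-8", errors="replace")
    return as_text.encode("utf-8")


round_tripped = simulate_text_cast(PNG_SIGNATURE)
print(round_tripped == PNG_SIGNATURE)  # False
print(round_tripped[:3])               # b'\xef\xbf\xbd'
```

The original byte is gone once it has been replaced, so no choice of encoding on the string side brings it back. The download has to stay in bytes from end to end.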

elroy · November 9, 2023, 2:12am

Can confirm that this works as well.

BTW r in r.image_file.file_id is the model’s response object for those wondering. Once saved, you’ll be able to use Image from Pillow to display the file.

from PIL import Image

im = Image.open('image.png')
im.show()

isaak2018 · November 14, 2023, 8:28pm

Does someone have a GitHub link or the entire code for the image file creation?
I'm getting this error: BadRequestError: Error code: 400 - {'error': {'message': 'Not allowed to download files of purpose: assistants', 'type': 'invalid_request_error', 'param': None, 'code': None}}

isaak2018 · November 14, 2023, 8:50pm

@elroy How exactly is r in r.image_file.file_id supposed to be defined in the code?

elroy · November 15, 2023, 4:55am

Here’s how we get to r

prompt = input()

message = openai_client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content=prompt,
    file_ids=[file.id]
)

run = openai_client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id,
  instructions="Please be professional. The user has a premium account.",
)

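# List the thread's messages once the run has finished (newest first by default)
messages = openai_client.beta.threads.messages.list(thread_id=thread.id)

r = messages.data[0]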

isaak2018 · November 15, 2023, 10:24am

Thank you, I now have a different error. Any clue on this?
AttributeError: 'ThreadMessage' object has no attribute 'image_file'

Here is my entire code:

from openai import OpenAI
client = OpenAI(api_key='sk-...')
file = client.files.create(
    file=open("balancesheet.pdf", "rb"),
    purpose='assistants'
)

assistant = client.beta.assistants.create(
  name="Financial Visualizer",
  description="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
  model="gpt-4-1106-preview",
  tools=[{"type": "code_interpreter"},{"type": "retrieval"}],
  file_ids=[file.id]
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Can you please give me a simple example of a scatter plot"
)

run = client.beta.threads.runs.create(
  thread_id=thread.id,
  assistant_id=assistant.id
)

messages = client.beta.threads.messages.list(
  thread_id=thread.id
)

#messages


r = messages.data[0]

What is the next argument I should pass after r?

# handle image file
api_response = client.files.with_raw_response.retrieve_content(r.?.?)

if api_response.status_code == 200:
  content = api_response.content
  with open('image.png', 'wb') as f:
    f.write(content)
  print('File downloaded successfully.')

Here is my r output

ThreadMessage(id='msg_09C8CSuPEXQvP39kjj2wz7pp', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='Can you please give me a simple example of a scatter plot'), type='text')], created_at=1700040507, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_Q1iPMjIT0iGC9uoSDoBVHBTj')

messages output

SyncCursorPage[ThreadMessage](data=[ThreadMessage(id='msg_09C8CSuPEXQvP39kjj2wz7pp', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='Can you please give me a simple example of a scatter plot'), type='text')], created_at=1700040507, file_ids=[], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_Q1iPMjIT0iGC9uoSDoBVHBTj')], object='list', first_id='msg_09C8CSuPEXQvP39kjj2wz7pp', last_id='msg_09C8CSuPEXQvP39kjj2wz7pp', has_more=False)
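Note that the list above contains only the user's own message. One likely reason is that the messages were listed immediately after the run was created, before the assistant had replied. A minimal polling sketch, assuming the `client.beta.threads.runs.retrieve` call from the 1.x Python library; the helper name `wait_for_run` and its `poll_seconds` and `timeout` parameters are hypothetical:

```python
import time


def wait_for_run(client, thread_id, run_id, poll_seconds=0.5, timeout=120.0):
    """Poll a run until it leaves its in-flight states, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        # Any other status (completed, failed, requires_action, ...) is handed back to the caller.
        if run.status not in ("queued", "in_progress", "cancelling"):
            return run
        if time.monotonic() > deadline:
            raise TimeoutError(f"Run {run_id} still {run.status} after {timeout} seconds")
        time.sleep(poll_seconds)


# Usage with the objects from the code above (requires a real client):
# run = wait_for_run(client, thread.id, run.id)
# messages = client.beta.threads.messages.list(thread_id=thread.id)
```

Only after the returned run reports `completed` should `messages.data[0]` be expected to hold the assistant's reply.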

davenorris · November 15, 2023, 1:36pm

You can essentially use an if statement to check whether the content item's type is image_file or text. You have a few options.

If you know you are going to be getting an image_file response you could use:

r.content[0].image_file.file_id

However, I will note that this has been fixed in the Python library as of v1.2.1.

So you would effectively do:

client.files.content(r.content[0].image_file.file_id)
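Putting the two pieces together, here is a minimal sketch that walks a message's content items, branches on their type, and saves each image with `client.files.content`. The helper name `save_message_images` and the `.png` extension are assumptions for illustration (Code Interpreter plots are usually PNGs, but the file type is not checked here):

```python
from pathlib import Path


def save_message_images(client, message, out_dir="."):
    """Save every image_file item of a thread message and return the written paths."""
    saved = []
    for item in message.content:
        if item.type != "image_file":
            continue  # e.g. type == "text"; the text lives in item.text.value
        file_id = item.image_file.file_id
        data = client.files.content(file_id).read()  # raw bytes, not a str
        path = Path(out_dir) / f"{file_id}.png"      # extension is an assumption
        path.write_bytes(data)
        saved.append(path)
    return saved


# Usage (requires a real client and a finished run):
# paths = save_message_images(client, messages.data[0])
```

Checking `item.type` first avoids the AttributeError that appears when the first content item is text rather than an image.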

davenorris · November 15, 2023, 5:05pm

I strongly encourage you to take down that Colab. You have exposed your API key. Do the following immediately:

Remove access to the Colab.

Revoke access to the OpenAI API key here: OpenAI Platform

Hit me up on Discord and we can talk through your code if you would like: davidnorris

mitashi.hm2506 · November 17, 2023, 8:33am

Is there a way I can display the image without saving it?

davenorris · November 19, 2023, 5:51am

Negative. You will need to save the file to your local machine or server and then display that as the source.

Hi-Im-Aaron · November 21, 2023, 4:46am

I was able to display images generated by the assistant without saving using the below function:

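import requests
from io import BytesIO
from PIL import Image  # Pillow, needed for Image.open

def displayFile(file_id):
    headers = {
        'Authorization': 'Bearer YOUR_API_KEY'  # replace YOUR_API_KEY with your key
    }
    response = requests.get(f'https://api.openai.com/v1/files/{file_id}/content', headers=headers)
    # Build the image in memory from the raw response bytes
    img = Image.open(BytesIO(response.content))
    display(img)  # display() is provided by Jupyter/Colab notebooks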

davenorris · November 29, 2023, 1:12pm

I am curious as to why you would want to mimic an object when you have the object there to begin with. This particular way of doing it is memory intensive. Why not store the file and serve it? This seems like a lot of extra work if you were to want to retrieve the binary data again too.

Or was this just a response to show that displaying an image is possible without saving it? Many things are doable; whether you should do them or not is a different story lol.

ehartus.ramp105 · December 2, 2023, 6:47pm

Here is a current solution for JS (in Deno) that works, unlike openai.files.retrieveContent, which at the moment returns a broken file.

async function getFileContent(id: string) {
  const response = await fetch(`https://api.openai.com/v1/files/${id}/content`, {
    headers: {
      'Authorization': `Bearer ${OPENAI_API_KEY}`
    }
  });
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  } else {
    const buffer = await response.arrayBuffer();
    return new Uint8Array(buffer);
  }
}

atenkhu01 · February 1, 2024, 6:49am

Hi Boss, can you please help me out?

I keep getting the error "AttributeError: 'OpenAI' object has no attribute 'file'".
For context: I am using VS Code, my API is up to date, the file path is correct, Python version = 3.10.10, OpenAI version = 1.8.0

Below is my code:

import openai
from dotenv import load_dotenv
from openai import OpenAI
import os
import requests
import traceback
import time

# Add this line to print the OpenAI library version
print(openai.__version__)

# Load environment variables and set OpenAI API key
load_dotenv()
api_key = os.getenv('OPENAI_API_KEY')

# Ensure the API key is available
if not api_key:
    raise ValueError("The OPENAI_API_KEY must be set in the environment variables.")

# Initialize the OpenAI client with the API key
client = OpenAI(api_key=api_key)

def download_pdf(url, filename):
    try:
        response = requests.get(url)
        response.raise_for_status()
        with open(filename, "wb") as file:
            file.write(response.content)
        return True
    except Exception as e:
        print("Failed to download the PDF:", e)
        traceback.print_exc()
        return False

def upload_file(client, file_path):
    #def upload_file(file_path):
    try:
        with open(file_path, 'rb') as file:
            file_response = client.file.create(file=file, purpose='answers')#use clients file
        return file_response.id
    except Exception as e:
        print("Failed to upload file:", e)
        traceback.print_exc()
        return None

#print(upload_file)

# def upload_file(cleint,file_path):
#     try:
#         with open(file_path, 'rb') as file:
#             response = openai.File.create(file=file) # Use openai.File.create directly
#         return response.id
#     except Exception as e:
#         print("Failed to upload file:", e)
#         traceback.print_exc()
#         return None

def create_thread_and_message(client, assistant_id, prompt, file_id):
    thread = client.thread.create()
    thread_id = thread.id
    message = client.message.create(thread_id=thread_id, role="user", content=prompt, file_id=file_id)
    return thread_id, message.id

def create_run(client, assistant_id, thread_id):
    run = client.run.create(assistant_id=assistant_id, thread_id=thread_id)
    return run

def wait_on_run(client, run):
    while run.status in ["queued", "in_progress"]:
        time.sleep(0.5)
        run = client.run.retrieve(id=run.id)
    return run

def get_run_messages(client, thread_id, last_message_id):
    messages = client.message.list(thread_id=thread_id, order="asc", after=last_message_id)
    return messages

# Define the URL for the PDF and local directory to save the file
url = "https://etechyou.org/wp-content/uploads/2021/12/FURIOUS-ENTREPRENEURS-RULES-AND-REGULATIONS-Online-1.pdf"
local_directory = r"C:\Users\myname\OneDrive\Desktop\python steps" #"C:\Users\myname\OneDrive\Desktop\python steps\"
input_pdf = os.path.join(local_directory, "downloaded_pdf.pdf")

# Download the PDF
if not download_pdf(url, input_pdf):
    raise Exception("Failed to download PDF.")

# Upload the PDF to OpenAI and retrieve the file ID
file_id = upload_file(client,input_pdf)

if not file_id:
    raise Exception("Failed to upload PDF to OpenAI.")

# Create an Assistant
assistant = client.assistant.create(
    name="SummaryPandaV3",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}]
)

# Create a thread and a message asking to summarize the PDF
thread_id, message_id = create_thread_and_message(client, assistant.id, "Please summarize this document focusing on the most relevant points.", file_id)

# Create a run to process the thread
run = create_run(client, assistant.id, thread_id)

# Wait for the run to complete
run = wait_on_run(client, run)

# Retrieve the assistant's messages after the user's last message
messages = get_run_messages(client, thread_id, message_id)

# Extract the response from the messages
summary = ""
for message in messages.data:
    if message.role == "assistant":
        summary += message.content.text

# Save the summary to a file
output_summary_path = os.path.join(local_directory, "summary.txt")
with open(output_summary_path, "w") as summary_file:
    summary_file.write(summary)
print(f"Summary saved to {output_summary_path}")
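The earlier posts in this thread upload files with `client.files.create(...)` (plural `files`) and reach the Assistants resources through `client.beta` (`client.beta.assistants`, `client.beta.threads`, `client.beta.threads.messages`, `client.beta.threads.runs`). The singular `client.file`, `client.thread`, `client.message`, `client.run` and `client.assistant` in the script above are therefore the likely source of the AttributeError. A minimal sketch of the upload helper rewritten along those lines, assuming the same `purpose='assistants'` value used earlier in the thread:

```python
def upload_file(client, file_path):
    """Upload a local file for use with Assistants and return its file ID."""
    with open(file_path, "rb") as f:
        # `files` is plural; purpose="assistants" matches the upload shown earlier in this thread
        uploaded = client.files.create(file=f, purpose="assistants")
    return uploaded.id


# Usage (requires a real client):
# file_id = upload_file(client, input_pdf)
```

The other helpers would need the same kind of change, for example `client.beta.threads.create()` in place of `client.thread.create()`.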
