I'm having the same problem. I can retrieve the "file" from the API but I haven't been able to figure out how to decode it. I am getting a PNG image created but can't seem to get it into an openable format.
It's crazy how hard this was to figure out - had to go digging through the SDK to put the pieces together.
This works for me:
import OpenAI from 'openai';
import fs from 'fs';
(async function run() {
  const fileid = 'file-kqzPeg6MhD0HoCaDnaK3XSJN';
  console.log('Loading ', fileid);
  const openai = new OpenAI();
  // Fetch the raw file content for the given file ID
  const file = await openai.files.content(fileid);
  console.log(file.headers);
  // Convert the response body to bytes and write it to disk
  const bufferView = new Uint8Array(await file.arrayBuffer());
  fs.writeFileSync('file-kqzPeg6MhD0HoCaDnaK3XSJN.png', bufferView);
})();
Replace fileid with the image_file.file_id or other ids you get and it should work. There's no easy way to figure out the extension programmatically other than digging through the response headers and parsing them.
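For anyone doing this from Python, here is a minimal sketch of one way to guess the extension once you have the bytes. It checks the leading "magic" bytes first and falls back to a Content-Type value if you managed to read one from the response headers. The function name guess_extension and the small prefix table are my own for illustration, not part of the OpenAI SDK, and the table only covers a few common formats.

```python
import mimetypes
from typing import Optional

# Leading "magic" bytes for a few common formats (illustrative, incomplete table).
MAGIC_PREFIXES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"%PDF-", ".pdf"),
]


def guess_extension(data: bytes, content_type: Optional[str] = None) -> str:
    """Guess an extension from the leading bytes, then from a Content-Type value."""
    for prefix, extension in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return extension
    if content_type:
        # Drop parameters such as "; charset=utf-8" before the lookup.
        mime_type = content_type.split(";")[0].strip().lower()
        guessed = mimetypes.guess_extension(mime_type)
        if guessed:
            return guessed
    # Nothing matched: fall back to a generic binary extension.
    return ".bin"
```

You would call it on the downloaded bytes before choosing a filename, for example guess_extension(file_bytes) where file_bytes is whatever you read from the file content response.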
That is the way to download it via python programmatically, ok. It works. But what if I want to download the generated file while I'm in the playground using Safari? The link to download the file gives an error saying that the JavaScript is not valid, while the button to download the file in the files section doesn't do anything. Telling the Assistant that the download link doesn't work, or asking for a link to the file, even specifying the file ID, will make the assistant answer with a link to a Google server!!
I am getting the same behaviour in ChatGPT... can't download the files being presented, just links to sandbox:/mnt/data/... or then "://bellard.org/textsynth/assets/chatgpt/sandbox?path=/mnt/data/image_recolored.png" if I press for alternatives.
It is possible via the client by using the file_id.
The steps are:
1. Get the file_id from the thread
2. Load the bytes from the file using the client
3. Save the bytes to file
If working in Python:
# open_ai_client = ...
# thread = ...

def get_response(thread):
    # List all messages in the thread
    return open_ai_client.beta.threads.messages.list(thread_id=thread.id)

def get_file_ids_from_thread(thread):
    # Collect every file ID attached to any message in the thread
    file_ids = [
        file_id
        for m in get_response(thread)
        for file_id in m.file_ids
    ]
    return file_ids

def write_file_to_temp_dir(file_id, output_path):
    # Download the file content and write the raw bytes to disk
    file_data = open_ai_client.files.content(file_id)
    file_data_bytes = file_data.read()
    with open(output_path, "wb") as file:
        file.write(file_data_bytes)

# So to get a file and write it
file_ids = get_file_ids_from_thread(thread)
some_file_id = file_ids[0]
write_file_to_temp_dir(some_file_id, '/tmp/some_data.txt')
@nikunj,
I'm still unclear on how to make files generated by the code interpreter downloadable.
The docs say:
"When annotations are present in the Message object, you'll see illegible model-generated substrings in the text that you should replace with the annotations."
In the Message Annotations docs for file_path annotations, the code shows how to reword the annotation and that we need to separately download the file.
But it doesn't cover how to make the file download on the client when the message is clicked.
# Iterate over the annotations and add footnotes
for index, annotation in enumerate(annotations):
    # Replace the text with a footnote
    message_content.value = message_content.value.replace(annotation.text, f' [{index}]')
    # Gather citations based on annotation attributes
    if (file_citation := getattr(annotation, 'file_citation', None)):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f'[{index}] {file_citation.quote} from {cited_file.filename}')
    elif (file_path := getattr(annotation, 'file_path', None)):
        cited_file = client.files.retrieve(file_path.file_id)
        citations.append(f'[{index}] Click <here> to download {cited_file.filename}')
        # Note: File download functionality not implemented above for brevity
Can you please elaborate more on
# Note: File download functionality not implemented above for brevity
How the cited_file is meant to be attached to the citation?
These instructions imply that I can download the file from the client directly from OpenAI's servers.
It makes sense that the files wouldn't be downloadable from a URL from a security perspective.
When receiving a file from the code interpreter in the Assistants playground, I am directed to the files page and then have to click the download button to get my file.
For now, I'm planning to return a URL that hits a custom endpoint that will download the file.
from django.core.files.base import ContentFile

def download_and_save_file(self, file_id, db_row_instance):
    """
    Download a file and store it in the DB/S3 bucket

    Args:
        file_id: The ID of the file to download.
        db_row_instance: The db_row_instance instance to attach the file to.

    Returns:
        File: File instance
    """
    file_data = self.open_ai.files.content(
        file_id=file_id,
    )
    file_data_bytes = file_data.read()
    # Create a ContentFile with the file data
    content_file = ContentFile(file_data_bytes)
    file_name_with_extension = f"{file_id}.png"
    # Save the ContentFile to the db_row_instance generated_file field
    db_row_instance.generated_file.save(file_name_with_extension, content_file)
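To connect this back to the annotation question, here is a rough sketch of how the message text could be rewritten so each sandbox link points at your own download endpoint. Everything here is an assumption for illustration: FilePathAnnotation is a flattened stand-in for the real annotation object (which nests the ID under file_path.file_id), and the /files/{file_id}/download URL pattern is a hypothetical route on your own server, not anything OpenAI provides.

```python
from dataclasses import dataclass


@dataclass
class FilePathAnnotation:
    """Minimal stand-in for a file_path annotation (hypothetical, flattened shape)."""
    text: str     # the substring the model wrote, e.g. "sandbox:/mnt/data/plot.png"
    file_id: str  # the file ID to fetch through the API


def link_annotations(message_text, annotations, url_template="/files/{file_id}/download"):
    """Replace each annotated substring with a URL for your own download endpoint."""
    for annotation in annotations:
        url = url_template.format(file_id=annotation.file_id)
        message_text = message_text.replace(annotation.text, url)
    return message_text


text = "Your chart is ready: [Download](sandbox:/mnt/data/plot.png)"
annotations = [FilePathAnnotation(text="sandbox:/mnt/data/plot.png", file_id="file-abc123")]
print(link_annotations(text, annotations))
# prints: Your chart is ready: [Download](/files/file-abc123/download)
```

The endpoint behind that URL would then do the actual files.content call server-side, so the API key never reaches the browser.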
My man THANK YOU. I've been trying to figure this out for about 8 hours. Wish I was exaggerating. I could NOT access that sandbox link to save my life...
QUICK NOTE: Had to remove ".id" from "thread_id=thread.id" in get_response. Other than that, code works right out of the box.
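If you sometimes have a thread object and sometimes just the ID string, a tiny helper like the one below avoids having to edit get_response either way. This is only a sketch, and as_thread_id is a name I made up, not something from the SDK.

```python
def as_thread_id(thread):
    """Return the thread ID whether given an ID string or an object with an .id attribute."""
    if isinstance(thread, str):
        return thread
    return thread.id
```

get_response could then pass thread_id=as_thread_id(thread) and accept both forms.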
Cheers
My adjustments in case anyone is a nerd like me:
import os
from openai import OpenAI
from dotenv import load_dotenv
from colorama import Fore

"""
Variation of Current API Calling Format as of 12/26/23
"""
load_dotenv()
try:
    client = OpenAI(
        api_key=os.environ['OPENAI_API_KEY']
    )
    if not client.api_key:
        raise ValueError("API key is missing. Check .env file.")
except KeyError as e:
    raise ValueError(f"Error Occurred: {e}\n\nCheck .env file.")
# Confirm the key loaded without echoing the secret itself to the console
print(Fore.GREEN + 'API key loaded.')
thread_id = 'thread_YourThread123'
output_path = '/path/to/your/output/file'
"""
Obtain the File IDs within the Specified Thread
"""
def get_response(thread_id):
    return client.beta.threads.messages.list(thread_id=thread_id)

def get_file_ids_from_thread(thread):
    file_ids = [
        file_id
        for m in get_response(thread)
        for file_id in m.file_ids
    ]
    return file_ids

"""
Write Each File ID's Contents with Separator Implementation for Readability
"""
def write_file(file_id, count, output_path=output_path):
    file_data = client.files.content(file_id)  # Extract the content from the file ID
    file_content = file_data.read()  # Assign the content to a variable
    separator_start = f'\n\n\n\nFILE # {count + 1}\n\n\n\n'
    separator_end = '\n\n\n\n' + '#' * 100 + '\n\n\n\n'
    with open(output_path, "ab") as file:
        file.write(separator_start.encode())  # Encode the string to bytes
        file.write(file_content)  # Write the content
        file.write(separator_end.encode())  # Encode the string to bytes

"""
Iterate through the File IDs while Calling write_file for File Output
"""
file_ids = get_file_ids_from_thread(thread_id)  # Retrieve file IDs
print('\nFILE IDS: ', file_ids)
print('\nNUMBER OF FILE IDS: ', len(file_ids))
for count, file_id in enumerate(file_ids):
    print(Fore.GREEN + f'\nWriting file #{count + 1}...\n')
    write_file(file_id, count)  # Write file ID contents
    print(Fore.GREEN + f'File {count + 1} written.\n')
print('Done.')
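One caveat with appending everything to a single output file: that reads fine for text, but binary files such as PNGs will not open once separators and other files are mixed in. A possible variation, sketched below, writes each file ID to its own path instead. The name save_files_separately and the fetch_bytes parameter are my own; fetch_bytes is any callable that returns the bytes for a file ID, so the sketch does not depend on a particular client.

```python
from pathlib import Path


def save_files_separately(file_ids, fetch_bytes, output_dir, extension=".bin"):
    """Write each file's bytes to its own path and return the list of paths written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for file_id in file_ids:
        # One output file per file ID, named after the ID itself.
        path = output_dir / f"{file_id}{extension}"
        path.write_bytes(fetch_bytes(file_id))
        written.append(path)
    return written
```

With the client from the script above, the call would look something like save_files_separately(file_ids, lambda fid: client.files.content(fid).read(), "downloads"), where "downloads" is just an example directory name.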