How do I download files generated in AI Assistants?

shrikar84 · November 10, 2023, 6:48pm

How do I go about downloading files generated by an OpenAI Assistant?
I have file annotations like this:

 TextAnnotationFilePath(end_index=466, file_path=TextAnnotationFilePathFilePath(file_id='file-7FiD35cCwF6hv2eB7QOMT2rs'), start_index=427, text='sandbox:/mnt/data/stat_sig_pie_plot.png', type='file_path')]

I am not able to read the file object using .read()

jeffre4828 · November 10, 2023, 9:40pm

I’m having the same problem. I can retrieve the ‘file’ from the API but I haven’t been able to figure out how to decode it. I am getting a PNG image created but can’t seem to get it to an openable format.

hrishi · November 11, 2023, 4:57am

It’s crazy how hard this was to figure out - had to go digging through the SDK to put the pieces together.

This works for me:

import OpenAI from 'openai';
import fs from 'fs';

(async function run() {
  const fileid = 'file-kqzPeg6MhD0HoCaDnaK3XSJN';
  console.log('Loading ', fileid);
  const openai = new OpenAI();

  const file = await openai.files.content(fileid);

  console.log(file.headers);

  const bufferView = new Uint8Array(await file.arrayBuffer());

  fs.writeFileSync('file-kqzPeg6MhD0HoCaDnaK3XSJN.png', bufferView);
})();

Replace fileid with the image_file.file_id or other ids you get and it should work. There’s no easy way to figure out the extension programmatically other than to go digging through the headers and to parse them.

If anyone has better ways please do share them!
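
One possible alternative to parsing the headers, sketched in Python: look at the first few bytes of the downloaded content and match them against well-known file signatures. The helper name `guess_extension` and the signature table are illustrative assumptions, not part of the OpenAI SDK, and the table covers only a handful of formats.

```python
# Hypothetical helper: guess a file extension from the leading bytes of a download.
# The signature table is a small illustrative subset, not an exhaustive list.
_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", ".png"),
    (b"\xff\xd8\xff", ".jpg"),
    (b"GIF87a", ".gif"),
    (b"GIF89a", ".gif"),
    (b"%PDF-", ".pdf"),
    (b"PK\x03\x04", ".zip"),  # also the container format behind .xlsx and .docx
]


def guess_extension(data: bytes, default: str = ".bin") -> str:
    """Return an extension for well-known leading bytes, else `default`."""
    for signature, extension in _SIGNATURES:
        if data.startswith(signature):
            return extension
    return default
```

Text formats such as CSV carry no signature, so they fall through to the default and would need another hint, such as the file name in the annotation text.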

logankilpatrick · November 11, 2023, 11:52pm

I am adding some more example code to the section of the docs that goes over this (OpenAI Platform); just waiting for an approval and deploy.

shrikar84 · November 13, 2023, 6:26am

Cool, thanks folks. Figured it out using the SDK code. Forgot to update here.

fmuaddib · November 13, 2023, 10:19am

The text you added to the documentation says that the link is of the form:

sandbox:/mnt/data/shuffled_file.csv

But that is not a valid URL at all. How are we supposed to get those files from the playground?

danielbm87 · November 13, 2023, 10:26pm

you have to use

await openai.files.content

to get the file

fmuaddib · November 15, 2023, 8:34pm

That is the way to download it via python programmatically, ok. It works. But what if I want to download the generated file while I’m in the playground using Safari? The link to download the file gives an error saying that the JavaScript is not valid, while the button to download the file in the files section doesn’t do anything. Telling the Assistant that the download link doesn’t work, or asking for a link to the file, even specifying the file ID, makes the Assistant answer with a link to a Google server!!

https://storage.googleapis.com/assistant-sandbox-attachments/session_WuFUqxIJG99AoTWnihXVz/file-85NrdwClpdBrNbZLywortwj9

If I ask again, the Assistant apologizes, and then answers with this other link:

https://assistant-sandbox.dynalistcdn.com/sandbox/file-85NrdwClpdBrNbZLywortwj9

Both are completely bogus links…

Conclusion: ATM there is no way to download a file generated by the AI using the browser when using the playground to test the assistants.

eoinpayne · November 20, 2023, 2:35pm

I am getting the same behaviour in ChatGPT… can’t download the files being presented, just links to sandbox:/mnt/data/… or then “://bellard.org/textsynth/assets/chatgpt/sandbox?path=/mnt/data/image_recolored.png” if I press for alternatives.

harry_salmon · November 21, 2023, 5:21pm

It is possible via the client by using the file_id.

The steps are:

1. Get the file_id from the thread
2. Load the bytes from the file using the client
3. Save the bytes to file

If working in python:

# open_ai_client = ...
# thread = ...

def get_response(thread):
    return open_ai_client.beta.threads.messages.list(thread_id=thread.id)

def get_file_ids_from_thread(thread):
    file_ids = [
        file_id
        for m in get_response(thread)
        for file_id in m.file_ids
    ]
    return file_ids

def write_file_to_temp_dir(file_id, output_path):
    file_data = open_ai_client.files.content(file_id)
    file_data_bytes = file_data.read()
    with open(output_path, "wb") as file:
        file.write(file_data_bytes)

# So to get a file and write it
file_ids = get_file_ids_from_thread(thread)
some_file_id = file_ids[0]
write_file_to_temp_dir(some_file_id, '/tmp/some_data.txt')

nikunj · November 21, 2023, 7:06pm

This section in the documentation explains how you can download files generated by tools. https://platform.openai.com/docs/assistants/tools/file-citations

tl;dr: you have to look for file_ids in the content array of the message and then download the file using https://platform.openai.com/docs/api-reference/files/retrieve-contents
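
As a rough sketch of the first half of that tl;dr, the function below walks a message's content array and collects every file ID it finds. It assumes the message is a plain dict shaped like the JSON the API returned at the time of this thread (`image_file` parts and `file_path` annotations on `text` parts); an SDK object would need converting to a dict first. The name `collect_file_ids` is hypothetical.

```python
def collect_file_ids(message: dict) -> list[str]:
    """Collect file IDs from image parts and file_path annotations of a message."""
    file_ids = []
    for part in message.get("content", []):
        if part.get("type") == "image_file":
            # Images produced by the code interpreter arrive as their own part.
            file_ids.append(part["image_file"]["file_id"])
        elif part.get("type") == "text":
            # Other generated files are referenced from annotations on the text.
            for annotation in part["text"].get("annotations", []):
                if annotation.get("type") == "file_path":
                    file_ids.append(annotation["file_path"]["file_id"])
    return file_ids


# Example message built from the IDs quoted earlier in this thread.
example_message = {
    "content": [
        {"type": "image_file",
         "image_file": {"file_id": "file-kqzPeg6MhD0HoCaDnaK3XSJN"}},
        {"type": "text",
         "text": {
             "value": "Saved to sandbox:/mnt/data/stat_sig_pie_plot.png",
             "annotations": [
                 {"type": "file_path",
                  "text": "sandbox:/mnt/data/stat_sig_pie_plot.png",
                  "file_path": {"file_id": "file-7FiD35cCwF6hv2eB7QOMT2rs"}},
             ],
         }},
    ]
}
print(collect_file_ids(example_message))
```

Each ID returned this way can then be passed to the file content endpoint linked above.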

d.albertazzi10 · December 6, 2023, 7:42pm

@nikunj,
I’m still unclear on how to make files generated by the code interpreter downloadable.

The docs say:
“When annotations are present in the Message object, you’ll see illegible model-generated substrings in the text that you should replace with the annotations.”

In the Message Annotations docs, the code for file_path annotations shows how to reword the annotation and notes that we need to download the file separately.

But it doesn’t cover how to make the file download on the client when the message is clicked.

# Iterate over the annotations and add footnotes
for index, annotation in enumerate(annotations):
    # Replace the text with a footnote
    message_content.value = message_content.value.replace(annotation.text, f' [{index}]')

    # Gather citations based on annotation attributes
    if (file_citation := getattr(annotation, 'file_citation', None)):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f'[{index}] {file_citation.quote} from {cited_file.filename}')
    elif (file_path := getattr(annotation, 'file_path', None)):
        cited_file = client.files.retrieve(file_path.file_id)
        citations.append(f'[{index}] Click <here> to download {cited_file.filename}')
        # Note: File download functionality not implemented above for brevity

Can you please elaborate more on

# Note: File download functionality not implemented above for brevity

How is the cited_file meant to be attached to the citation?

Thanks in advance!

d.albertazzi10 · December 6, 2023, 8:01pm

These instructions imply that I can download the file from the client directly from OpenAI’s servers.

It makes sense that the files wouldn’t be downloadable from a URL from a security perspective.

When receiving a file from the code interpreter in the Assistants playground I am directed to the files page and then have to click the download button to get my file.

For now, I’m planning to return a URL that hits a custom endpoint which will download the file.

(http://localhost:3000/api/assistant/file/{file-id})
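
A minimal sketch of that idea in Python, assuming annotations are available as plain dicts with `type`, `text`, and `file_path.file_id` keys: each `sandbox:` substring in the message text is swapped for a URL on your own server, which would then fetch the bytes with the file ID. The function name and the endpoint layout are assumptions taken from the example URL above, not anything OpenAI provides.

```python
def link_sandbox_paths(text: str, annotations: list[dict], base_url: str) -> str:
    """Replace each file_path annotation's text with a URL built from its file ID."""
    for annotation in annotations:
        if annotation.get("type") != "file_path":
            continue  # leave file_citation and any other annotation types untouched
        file_id = annotation["file_path"]["file_id"]
        text = text.replace(annotation["text"], f"{base_url}/{file_id}")
    return text


# Example using the annotation quoted at the top of this thread.
message_text = "[Download the plot](sandbox:/mnt/data/stat_sig_pie_plot.png)"
message_annotations = [
    {"type": "file_path",
     "text": "sandbox:/mnt/data/stat_sig_pie_plot.png",
     "file_path": {"file_id": "file-7FiD35cCwF6hv2eB7QOMT2rs"}},
]
print(link_sandbox_paths(
    message_text, message_annotations, "http://localhost:3000/api/assistant/file"
))
```

The custom endpoint itself still has to call the files content API server-side, so the API key never reaches the browser.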

nikunj · December 8, 2023, 7:13pm

You can use the file ID in the citation to download the file using this endpoint: https://platform.openai.com/docs/api-reference/files/retrieve-contents

nandangrover · December 12, 2023, 3:42pm

If someone wants to implement this in Django:

    from django.core.files.base import ContentFile

    def download_and_save_file(self, file_id, db_row_instance):
        """
        Download a file and store it in the DB/S3 bucket

        Args:
            file_id: The ID of the file to download.
            db_row_instance: The db_row_instance instance to attach the file to.

        Returns:
            File: File instance
        """
        file_data = self.open_ai.files.content(
            file_id=file_id,
        )
        file_data_bytes = file_data.read()
        # Create a ContentFile with the file data
        content_file = ContentFile(file_data_bytes)
        
        file_name_with_extension = f"{file_id}.png"
        # Save the ContentFile to the db_row_instance generated_file field
        db_row_instance.generated_file.save(file_name_with_extension, content_file)
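
The snippet above hard-codes a .png extension. One hedged alternative, when the `file_path` annotation is at hand, is to reuse the file name the model wrote into the annotation text (for example `sandbox:/mnt/data/stat_sig_pie_plot.png`). The helper below is a hypothetical sketch of that; it is not part of Django or the OpenAI SDK.

```python
from pathlib import PurePosixPath


def filename_from_annotation_text(annotation_text: str, fallback: str) -> str:
    """Return the last path component of a 'sandbox:/mnt/data/...' string.

    Falls back to `fallback` when the text has no scheme separator or no file name.
    """
    # Drop the 'sandbox' prefix; what remains is a POSIX-style path.
    _, _, path = annotation_text.partition(":")
    name = PurePosixPath(path).name
    return name or fallback
```

For instance, `filename_from_annotation_text(annotation.text, f"{file_id}.bin")` could supply the name passed to `generated_file.save`.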

johndoughsins · December 27, 2023, 2:09am

My man THANK YOU. I’ve been trying to figure this out for about 8 hours. Wish I was exaggerating. I could NOT access that sandbox link to save my life…

QUICK NOTE: Had to remove ‘.id’ from ‘thread_id=thread.id’ in get_response. Other than that, code works right out of the box.

Cheers

My adjustments in case anyone is a nerd like me:

import os
from openai import OpenAI
from dotenv import load_dotenv
from colorama import Fore

"""
Variation of Current API Calling Format as of 12/26/23
"""
load_dotenv()
try:
    client = OpenAI(
        api_key=os.environ['OPENAI_API_KEY']
        )
    if not client.api_key:
        raise ValueError("API key is missing. Check .env file.")
except KeyError as e:
    raise ValueError(f"Error Occurred: {e}\n\nCheck .env file.")
# Confirm the key was loaded without echoing the secret to the console
print(Fore.GREEN + 'API key loaded.')


thread_id = 'thread_YourThread123'
output_path = '/path/to/your/output/file'


"""
Obtain the File IDs within the Specified Thread
"""
def get_response(thread_id):
    return client.beta.threads.messages.list(thread_id=thread_id)

def get_file_ids_from_thread(thread):
    file_ids = [
        file_id
        for m in get_response(thread)
        for file_id in m.file_ids
    ]
    return file_ids


"""
Write Each File ID's Contents with Separator Implementation for Readability
"""
def write_file(file_id, count, output_path=output_path):
    file_data = client.files.content(file_id) # Extract the content from the file ID
    file_content = file_data.read() # Assign the content to a variable
    separator_start = f'\n\n\n\nFILE # {count + 1}\n\n\n\n'
    separator_end = '\n\n\n\n' + '#' * 100 + '\n\n\n\n'

    with open(output_path, "ab") as file:
        file.write(separator_start.encode())  # Encode the string to bytes
        file.write(file_content) # Write the content
        file.write(separator_end.encode())    # Encode the string to bytes


"""
Iterate through the File IDs while Calling write_file for File Output
"""
file_ids = get_file_ids_from_thread(thread_id) # Retrieve file IDs
print('\nFILE IDS: ', file_ids)
print('\nNUMBER OF FILE IDS: ', len(file_ids))
for count, file_id in enumerate(file_ids):
    print(Fore.GREEN + f'\nWriting file #{count + 1}...\n')
    write_file(file_id, count) # Write file ID contents
    print(Fore.GREEN + f'File {count + 1} written.\n')

print('Done.')

wilburgarlandn7042 · June 17, 2024, 4:19pm

sandbox:/mnt/data/qr_with_robin_background.png

rothgiles · July 9, 2024, 4:39am

… I just asked it to make them downloadable easily and it gave me links to the .png’s

Nisha21 · July 18, 2024, 10:22pm

I simply just asked it to provide the info in text form

marcolivierbouch · July 19, 2024, 12:24am

When a message is returned, you can find the file ID in its annotations. With this file ID you can fetch the file using the SDK.
