Filenames in Code Interpreter with the Assistants API

When I create a custom GPT and upload a file, let’s say “mycode.py”, for the code interpreter, the code interpreter can access it at:
/mnt/data/mycode.py

When I do the same with an assistant and upload the same “mycode.py” for use by the code interpreter, the file in /mnt/data is named:
file-p4zUlf6AkqMJBDVm8kA8xxZZF (or another random name, matching the fileId of the upload).
How can I get correct filenames in the code interpreter with the Assistants API? Am I missing something?
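For context, here is a minimal sketch of the upload-and-attach flow being described, assuming the v2 Assistants API and the official openai Python SDK (the model choice is arbitrary):

from openai import OpenAI

client = OpenAI()

# Upload the file for use by assistant tools
uploaded = client.files.create(
    file=open("mycode.py", "rb"),
    purpose="assistants",
)

# Attach it to an assistant's code interpreter; in the sandbox it
# appears as /mnt/data/<file id>, not /mnt/data/mycode.py
assistant = client.beta.assistants.create(
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
    tool_resources={"code_interpreter": {"file_ids": [uploaded.id]}},
)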

I’m also wondering about this… especially when I upload multiple files, since the assistant cannot figure out what each file is.

I have tried multiple API methods, and it seems Assistants provides no facility to associate files uploaded to the mount point with their original file names.

That is despite the file being uploaded correctly and the original file name being returned in the metadata:

FileObject(
    id='file-Uv9vHHjWAszEFOo9D3qcJwub',
    bytes=1166,
    created_at=1729762200,
    filename='mydata.json',
    object='file',
    purpose='assistants',
    status='processed',
    status_details=None
)
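So the original name is available through the Files API, just not inside the sandbox. A minimal sketch of retrieving it, assuming the official openai Python SDK:

from openai import OpenAI

client = OpenAI()

# The metadata, including the original filename, is retrievable by ID
file_obj = client.files.retrieve("file-Uv9vHHjWAszEFOo9D3qcJwub")
print(file_obj.filename)  # -> 'mydata.json'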

The assistant itself, however, cannot find any method to retrieve the original file names:

Here is the list of files you have uploaded:

- ID: file-Uv9vHHjWAszEFOo9D3qcJwub, Original Name: file-Uv9vHHjWAszEFOo9D3qcJwub

It seems that the original name of the file is not user-friendly and is the same as the system-assigned ID. If you have any specific operations you would like to perform on this file, please let me know!

The augmentation you would have to perform is to provide your own mapping of uploaded file names to mount-point file IDs, perhaps by adding a section to the assistant’s instructions, or by using additional_instructions if they are user files.
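A minimal sketch of that augmentation, assuming the openai Python SDK (the thread ID, assistant ID, and file IDs here are placeholders):

from openai import OpenAI

client = OpenAI()

file_ids = ["file-Uv9vHHjWAszEFOo9D3qcJwub"]  # files attached for this run

# Build a human-readable map from mount-point file ID to original name
lines = [
    f"- {fid} is the upload of '{client.files.retrieve(fid).filename}'"
    for fid in file_ids
]
mapping_note = "Files in /mnt/data:\n" + "\n".join(lines)

# Inject the map into this run without altering the assistant itself
run = client.beta.threads.runs.create(
    thread_id="thread_abc123",
    assistant_id="asst_abc123",
    additional_instructions=mapping_note,
)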

The ultimate augmentation would be to provide your own Jupyter notebook execution environment that has the statefulness and file versatility you desire, is deployed within the scope of a chat session, user, or group as may be desired, is free – and is offered as a function for Chat Completions.
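Such an environment could be exposed to the model as a function tool; a minimal sketch of the tool schema, where the name python_notebook and its single parameter are assumptions, not an existing API:

python_notebook_tool = {
    "type": "function",
    "function": {
        "name": "python_notebook",
        "description": "Execute Python code in a persistent notebook "
                       "session; uploaded files keep their original names.",
        "parameters": {
            "type": "object",
            "properties": {
                "code": {
                    "type": "string",
                    "description": "Python source to run in the session",
                }
            },
            "required": ["code"],
        },
    },
}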

I came to the same conclusions, but I am wondering how they make it work with custom GPTs. Do you think it’s 100% prompt engineering, with additional_instructions appended every time a new file is attached to the code execution environment?

We can see why it works for OpenAI’s product but not for you, the Assistants API developer: additional information noting the mount-point name is placed in context in a GPT:

Gizmo uploaded file with ID ‘file-ImnU16MgHOCaqBN5aG57izvZ’ to: /mnt/data/get-thread-messages-and-save.py.
Gizmo uploaded file with ID ‘file-UZNu3dnYX949baiuLitZl1D9’ to: /mnt/data/streaming_helper.py.
Gizmo uploaded file with ID ‘file-yIdBZ2n7nZ1hpVikjbNvfMiU’ to: /mnt/data/list-vector-stores.py.


All the files uploaded by the user have been fully loaded. Searching won’t provide additional information.

Code interpreter will return the original file names when doing an ls of the mount point.

The text about “fully loaded” means that the full text of these small files is placed into the context window, which the AI can reproduce with no tool call (the files also go to file search if that tool is supported).

For Assistants, however:

Besides a map of original names to mount-point file IDs placed into context just for reference, the creative person could tell the Assistants AI that it must rename the files as the first thing it sends to the Python notebook, before continuing with the user task. Send the actual script with the actual map, for example:

import os

# Map each mount-point file ID to its original file name
file_name_map = {
    'file-xxx': 'original.1',
    'file-yyy': 'original.2',
    'file-zzz': 'original.3',
}

# Create a symlink with the original name pointing at each uploaded file
for file_id, original_name in file_name_map.items():
    id_path = os.path.join('/mnt/data', file_id)
    symlink_path = os.path.join('/mnt/data', original_name)
    if not os.path.lexists(symlink_path):
        os.symlink(id_path, symlink_path)

# List the files again to confirm the symlinks were created
os.listdir('/mnt/data')

Ok, I see the idea, thanks.
Actually, we would probably have to maintain a file with the mappings in each thread’s context. Thinking of it, this would explain why we can attach fewer files to a running code interpreter in a custom GPT than in an assistant; they probably have some slots reserved for the gory “mappings”.
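One way to maintain that per-thread mapping inside the sandbox itself would be a small script the assistant runs whenever new files are attached, merging the new entries into a JSON file. A sketch, where the path and structure are assumptions:

import json
import os

MAP_PATH = '/mnt/data/.file_name_map.json'

def record_mappings(new_entries):
    """Merge newly attached file-ID -> original-name pairs into the map."""
    mapping = {}
    if os.path.exists(MAP_PATH):
        with open(MAP_PATH) as f:
            mapping = json.load(f)
    mapping.update(new_entries)
    with open(MAP_PATH, 'w') as f:
        json.dump(mapping, f, indent=2)
    return mapping

record_mappings({'file-xxx': 'original.1'})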