Assistants API | Adding a file during run + submit_tool_outputs

The problem is that

run.file_ids.append(uploaded_file.id)

does not seem to submit the file to the run on OpenAI's servers; it only alters a local object.

Yeah, there doesn't seem to be a reason why that would work, as no run object is ever submitted back to the API.

Sorry for the delay. Here is a hypothetical scenario:

  1. User asks a question to an Assistant that has both a custom function and Retrieval enabled.
  2. The custom function gathers information based on the user's question and, instead of submitting that information as a long string or JSON, it creates a file, uploads it to OpenAI using client.files.create(), and submits the file ID from the upload to the Run while it is in a "requires_action" state (see the sketch after this list).
  3. The Run continues and uses Retrieval to open/analyze the file and ultimately respond to the user (the key here is that it would all be done in one Run).
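
For clarity, here is a rough sketch of what step 2 looks like against today's API; the file name and variable names are illustrative. The upload half works, but the "attach to the in-flight run" half is exactly the piece under discussion:

from openai import OpenAI

client = OpenAI()

# hypothetical file produced by the custom function
uploaded_file = client.files.create(
    file=open("gathered_info.json", "rb"),
    purpose="assistants",
)

# there is no documented call that attaches uploaded_file.id to a run
# sitting in "requires_action"; that gap is the subject of this thread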

FYI, I heard back from an OpenAI staff member in the community, and they confirmed that this functionality does not currently exist, so please don't waste too much of your time. I've asked them to add it, and they said they would since they also see a use case for it, but it is not even on their current roadmap.

Appreciate the help!


Appreciate the approach here. I actually tried this as one of my first attempts to resolve the issue, but the results are unpredictable: sometimes it works great, and other times the Assistant has trouble connecting the two Runs. It also gets very costly, as I place my instructions at the Run level (due to some specifics of our use case), so the token counts add up when all we need the second Run for is to attach the file to the Message. Thanks, though.

Thanks for this, but I'm also having trouble following how this would work, as "run" is only a local object, and "run.file_ids.append()" would only alter that local object since nothing is submitted back to OpenAI. I know there is a modifyRun (https://platform.openai.com/docs/api-reference/runs/modifyRun) API call, but that only allows the "metadata" to be updated.

Am I missing something here? I’m experiencing similar results to @akabiri-collectivei and understand the point raised by @jorgeintegrait about nothing being submitted to the API.

So what we did was handle it during the function call (or tool call). For example:

# Note: a lot of validation and setup is omitted for brevity
import json

tool_call_list = run.required_action.submit_tool_outputs.tool_calls

tool_outputs = []
for tool in tool_call_list:
    # look up the handler registered for this function name
    tool_func = functionLookup.get(tool.function.name)

    # pass the run object from the current run in alongside the model's arguments
    kwargs = json.loads(tool.function.arguments)
    kwargs['run'] = run

    output = tool_func(**kwargs)
    # each tool output must be paired with its tool_call_id
    tool_outputs.append({"tool_call_id": tool.id, "output": output})

# finish up with the tool-output submission (thread_id assumed available)
client.beta.threads.runs.submit_tool_outputs(
    thread_id=thread_id,
    run_id=run.id,
    tool_outputs=tool_outputs,
)

# in another file, the definition of toolFunction
# a few assumptions: the assistant is accessible and able to add the file in
def toolFunction(run, arg1):
    # add the file to the (local) run object
    run.file_ids.append(file_id)

At times we do observe that annotations can be missing, but it has been working fine. Of course, more testing is required.

If anyone has found a solution: I have a similar problem. I'm trying to pass a file to my function:

{
    "name": "send_email_to_zapier",
    "description": "Send an email via Zapier webhook",
    "parameters": {
        "type": "object",
        "properties": {
            "email": {
                "type": "string",
                "description": "The recipient's email address"
            },
            "subject": {
                "type": "string",
                "description": "The subject of the email"
            },
            "message": {
                "type": "string",
                "description": "The body content of the email"
            },
            "file": {
                "type": "string",
                "description": "The attached file content encoded in base64"
            }
        },
        "required": ["email", "subject", "message"]
    }
}

But I don't know how to do it, because when I ask my assistant to generate a PDF and send it, I don't receive a file from the assistant, even though it tells me it did.

I don't know if I'm being clear, sorry :sweat_smile: I'm a newbie at this Assistants API stuff.

So with Assistants API v2 here now, and files connected to the Assistant, can we use modifyRun via a function call to add files to its own active run? Again, the OP wants this to happen seamlessly, without having to restart the assistant.

I looked right when v2 of the API came out, but only the "metadata" can be altered with modifyRun. It still doesn't look like there is a way to provide a file during an active run.
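
For reference, a sketch of the full extent of modifyRun; the IDs and metadata values below are placeholders:

# the runs update endpoint only exposes the metadata field
client.beta.threads.runs.update(
    thread_id="thread_abc123",  # placeholder
    run_id="run_abc123",        # placeholder
    metadata={"note": "file handled out of band"},
)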

I’m going to try and see if I can run client.beta.assistants.update(assistant.id, tool_resources) as a function call and see if that updates its file_ids while the run is still active. Otherwise, I may try to initiate a new run as a function call after doing that…

It seems to work.
I'm starting out with the assistant defined with the code_interpreter tool but no file_ids in its resources. In addition, it has a function "add_file_to_assistant" that uploads a CSV to OpenAI, then adds the file_id to the assistant with:
client.beta.assistants.update(assistant_id=assistant_id, tool_resources=tool_resources_updated).
With a single user prompt, it calls the function first, and then the code_interpreter uses the file to answer the user's question. All in a single run.
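
For anyone who wants to try the same thing, here is a minimal sketch of what that function could look like; the helper name matches the post, but the path handling is simplified and error handling is omitted:

from openai import OpenAI

client = OpenAI()

def add_file_to_assistant(assistant_id: str, csv_path: str) -> str:
    # upload the CSV so the Assistants tools can use it
    uploaded = client.files.create(file=open(csv_path, "rb"), purpose="assistants")

    # attach the new file to the assistant's code_interpreter resources
    # (note: this overwrites the existing file_ids list rather than appending)
    client.beta.assistants.update(
        assistant_id=assistant_id,
        tool_resources={"code_interpreter": {"file_ids": [uploaded.id]}},
    )
    return uploaded.id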


The only way is to submit a tool output with some success message and let the run complete, then submit a new message with the file IDs and create a new run, which then has access to the file. That's the only solution until they support passing file IDs as tool outputs.
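
A rough sketch of that sequence, assuming thread_id, run, tool_call_id, assistant_id, and file_ids are in scope; the success message and polling interval are illustrative:

import time

# 1. close out the tool call so the current run can complete
client.beta.threads.runs.submit_tool_outputs(
    thread_id=thread_id,
    run_id=run.id,
    tool_outputs=[{"tool_call_id": tool_call_id, "output": "File prepared successfully."}],
)
while client.beta.threads.runs.retrieve(
    thread_id=thread_id, run_id=run.id
).status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)

# 2. post a new message carrying the file IDs, then start a fresh run
client.beta.threads.messages.create(
    thread_id=thread_id,
    role="user",
    content="Here are the files you need.",
    attachments=[
        {"file_id": fid, "tools": [{"type": "code_interpreter"}]} for fid in file_ids
    ],
)
new_run = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)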


I'm running into this and surprised more folks aren't. Here's my use case and my hacky solve, which I'm finding a bit of consistency with.

  1. User asks a question, think SQL.
  2. Assistant writes and runs a query (which can return many rows, way too many to stick in the actual thread).
  3. Assistant can upload the dataset for code-interpreter to complete the analysis.

My solve is to cancel the run when the upload_file function is called, create a fake user message (hidden from the end user) with the uploaded files attached, then re-start a new run automatically.

And this wasn’t working with any consistency until I adjusted the fake user message to be something like this:

“I have uploaded the file for you to now use!”

Basically, role-play the user uploading the file themselves, and the assistant doesn't get confused as much.
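
In code, the trick might look something like this; the identifiers and the way the message is hidden from the end user are assumptions on my part:

# cancel the in-flight run that triggered the upload_file function
client.beta.threads.runs.cancel(thread_id=thread_id, run_id=run.id)

# fake user message that role-plays the upload, with the real file attached
client.beta.threads.messages.create(
    thread_id=thread_id,
    role="user",
    content="I have uploaded the file for you to now use!",
    attachments=[{"file_id": file_id, "tools": [{"type": "code_interpreter"}]}],
)

# re-start a new run automatically
client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)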


For anyone looking at this now, Rob360’s method is what worked for me.
Flow:
The Assistant calls a function with a SQL query as its argument. The function saves the results of the query as a CSV, and the file is uploaded to storage:

file_upload = client.files.create(
    file=open(csv_file_path, "rb"),
    purpose='assistants'
)

The function then updates the tool resources with the file ID, as Rob suggested.

file_id = file_upload.id

tool_resources_updated = {
    "code_interpreter": {
        "file_ids": [file_id]
    }
}

client.beta.assistants.update(
    assistant_id='assistant_id',
    tool_resources=tool_resources_updated
)

The function returns the file ID, which is submitted as the tool output.
I had to clearly instruct the assistant to run the code interpreter on the file ID from the function call, and it works!


Is this applicable to file_search too?

I will definitely try this, thanks.

Do you have any idea how to achieve the opposite direction consistently? If the code interpreter generates a file that I want to use in another tool, I currently have to hope that the assistant consistently posts a link; otherwise the file never leaves the sandbox environment. I pull the file ID from the annotations and download the file. Next, the assistant needs to be told the file ID, which it does not know, so that it can reference the correct file as input for the function tool call. I am relying on a fake user message to do that, but the method is very touch-and-go at the moment.
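
For reference, the extraction half can be sketched like this, assuming the generated file surfaces as a file_path annotation on the latest assistant message; the output filename is a placeholder:

# grab the most recent message and scan its annotations for generated files
messages = client.beta.threads.messages.list(thread_id=thread_id)
for block in messages.data[0].content:
    if block.type != "text":
        continue
    for annotation in block.text.annotations:
        if annotation.type == "file_path":
            generated_file_id = annotation.file_path.file_id
            # download the file out of the sandbox via the files endpoint
            client.files.content(generated_file_id).write_to_file("output.dat")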

This worked for me also. The one problem is that the file is scoped to the Assistant, which means users that are sharing an Assistant will all have access to the files. That is a problem in my use case.

This was an easy fix by modifying the thread instead.

tool_resources = {"code_interpreter": {"file_ids": file_ids}}
client.beta.threads.update(
    thread_id=thread.id,
    tool_resources=tool_resources
)

I note that ChatGPT seems to handle this issue seamlessly. I have a custom GPT that queries files from an API and can instantly use the files in Code Interpreter. My Assistants version does not work quite as well. If anyone has any other solutions, please post.

Yep.

I built an entire application around this, so that during a call:

  1. The LLM produces data.
  2. That data updates documents (whether on-disk/server or ephemeral, database-only).
  3. That data is then included in the next internal re-prompting of the LLM (if desired).
  4. That data (the output of run #1) can be purposefully parsed and filtered based on the parameters of run #2 (not all data saved in the file needs to be provided; it can be filtered before being included in run #2's context window).
  5. This continues indefinitely until the results pass inspection, then control returns to the user-facing side.

I'm not sure how useful the agents/assistants SDK has been for folks - but I tell you what - if you do it yourself you can get it all to work beautifully… and it's awesome. And there are no limitations on how you route/store the data…

But of course, you’ve got to switch approaches a bit to handle everything yourself in terms of data storage and retrieval - I don’t know if the SDKs or endpoints can “work partially in-house and partially in OAS’s backend” - I’m guessing no - but hey - take all those dev hours and just put them into your own backend and use the much-faster completions endpoint, and control all processing, retrieval, storage, tools, threads, flows, etc, on your own end…

So just my two cents and a nudge to say "use the LLM as a tool within your desired system, and build the system yourself" - as opposed to using a system that doesn't have the functionality, flexibility, or modifiability necessary to achieve your goals.