Thank you for sharing, will definitely try this out
Ran into the same issue, and this fix sounded too GPT to be true
So I tried it, and it works!
Have you verified that it actually gets parts of the file into the context, rather than hallucinating a response, when this bug occurs?
Good point @filip.byren. OpenAI needs to fix that, or give an explanation of why this error is showing up.
Have you tried a timer to keep checking the status of the retrieval?
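For what it's worth, a polling loop with a timeout can be sketched like this. `poll_run` is a hypothetical helper, and `retrieve` stands in for any callable returning an object with a `.status` attribute, e.g. a lambda wrapping `client.beta.threads.runs.retrieve`:

```python
import time

# Statuses after which a run will never change again
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def poll_run(retrieve, timeout=60.0, interval=1.0):
    """Poll until the run reaches a terminal status, or raise TimeoutError.

    `retrieve` is called with no arguments and must return an object
    exposing a `.status` string.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        run = retrieve()
        if run.status in TERMINAL_STATUSES:
            return run
        time.sleep(interval)
    raise TimeoutError(f"run did not reach a terminal status within {timeout:.0f}s")
```

The timeout matters because a run can expire or get cancelled, and a naive `while status != 'completed'` loop would spin forever in that case.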
Yes, it gave a response with data very specific to the documents (one was a PDF with recipes from a local butcher, and another was a JSON with scraped data from a website).
I’m trying to follow your steps but the Assistant still doesn’t retrieve some info from the knowledge base
this is the code of my Assistant:
```python
from time import sleep

from flask import Flask, request, jsonify, render_template, abort
import openai
import os

# Connect to the OpenAI API
openai_api_key = os.environ.get('OPENAI_API_KEY')
if not openai_api_key:
    raise EnvironmentError("OPENAI_API_KEY is not set in environment variables.")
client = openai.OpenAI(api_key=openai_api_key)

# Start the Flask app
app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

# Custom error handler
@app.errorhandler(500)
def internal_error(error):
    return jsonify({"error": "Internal server error"}), 500

# Function to upload a file and return its OpenAI file ID
def create_file_for_openai(filename):
    try:
        with open(filename, "rb") as f:
            file = client.files.create(file=f, purpose='assistants')
        return file.id
    except Exception as e:
        # abort() only works inside a request context; this runs at
        # startup, so fail fast instead
        raise SystemExit(f"Error creating file '{filename}': {e}")

# Read instructions from the file
try:
    with open('instructions.txt', 'r') as file:
        instructions_text = file.read().strip()
except IOError:
    raise SystemExit("Error: File 'instructions.txt' not found.")

# Create files for ingestion
torontocondo_file_id = create_file_for_openai("torontocondo.json")
restaurant_file_id = create_file_for_openai("restaurant.json")
places_file_id = create_file_for_openai("places.json")
freetime_file_id = create_file_for_openai("freetime.json")
activities_file_id = create_file_for_openai("activities.json")

# Create an Assistant with enhanced instructions and the retrieval tool
try:
    assistant = client.beta.assistants.create(
        instructions=instructions_text,
        tools=[{"type": "code_interpreter"}, {"type": "retrieval"}],
        model="gpt-3.5-turbo-1106",
        file_ids=[torontocondo_file_id, restaurant_file_id, places_file_id,
                  freetime_file_id, activities_file_id])
    print(f"Assistant started with ID: {assistant.id}")
except Exception as e:
    raise SystemExit(f"Error creating assistant: {e}")

# For setting up a new conversation
@app.route('/start', methods=['GET'])
def initiate_conversation():
    try:
        thread = client.beta.threads.create()
        return jsonify({"thread_id": thread.id})
    except Exception as e:
        print(f"Error initiating conversation: {e}")
        abort(500)

@app.route('/chat', methods=['POST'])
def chat():
    try:
        data = request.json
        thread_id = data.get('thread_id')
        user_input = data.get('message', '')
        # Send the user message to the thread
        client.beta.threads.messages.create(thread_id=thread_id, role="user", content=user_input)
        # Trigger a run with the assistant
        run = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant.id)
        # Poll until the run reaches a terminal state; include cancelled and
        # expired, or those runs would loop forever
        while run.status not in ['completed', 'failed', 'cancelled', 'expired']:
            sleep(1)
            run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)
        messages = client.beta.threads.messages.list(thread_id=thread_id, order="desc")
        # Filter messages to get the latest bot response
        bot_responses = [msg for msg in messages.data if msg.role != 'user']
        response = bot_responses[0].content[0].text.value if bot_responses else "No response found."
        return jsonify({"response": response})
    except Exception as e:
        print(f"Error during chat: {e}")
        abort(500)

# Start the server
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
```
Do I need to modify something to improve the performance of the Assistant?
The knowledge bases and the instructions are well structured, because I've been trying to fix this for weeks.
I guess I solved it; associating the file with an assistant is necessary.
In Python:
```python
with open(filename, 'rb') as fp:
    fo = client.files.create(
        # passing a (filename, fileobj) tuple makes the filename available
        file=(filename, fp),
        purpose='assistants',
    )

# associate the file with an assistant
client.beta.assistants.files.create(assistant_id=assistant_id, file_id=fo.id)
```
When you use this in a thread when a user uploads a file, it adds the file to the assistant object itself permanently (visible in the web interface).
Now if multiple people are using the assistant via the API, they can get info about other people's file uploads, and you will hit the maximum file count fast. I don't think that is something you want in most cases.
Yes, the files will be visible through the web interface. But if there's an AI application through which the users interact with the assistant, then the uploaded document is accessible only to that user (unless they somehow know what other documents have been uploaded and what their content is).
The files object is independent of assistants and threads, and that's where the underlying issue is. One can associate files with assistants but not with threads. I think it should have been the other way around, or both should have been possible: associate files with assistants and with threads. If a file is associated only with a thread, it's like a session-based upload, and some kind of garbage collector could clean it up once the thread is terminated. If it's associated with an assistant, it's persistent and available across all threads. What do you think?
Or how else would you suggest we handle this situation from the application side? Multiple people are interacting with the assistant, and each wants to analyze their own uploaded file. If the file doesn't get associated with the assistant, the assistant won't be able to find it, as everyone is reporting above.
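One application-side option, sketched under the assumption that files are attached to individual messages rather than to the assistant: keep a per-thread registry of uploaded file IDs and delete them when the thread ends. `ThreadFileRegistry` is a hypothetical helper; the `delete` callable would be something like `client.files.delete`:

```python
from collections import defaultdict

class ThreadFileRegistry:
    """Track which uploaded file IDs belong to which thread."""

    def __init__(self):
        self._files = defaultdict(list)

    def register(self, thread_id, file_id):
        # Record an uploaded file against its thread
        self._files[thread_id].append(file_id)

    def files_for(self, thread_id):
        # File IDs to pass as file_ids on a message in this thread
        return list(self._files[thread_id])

    def release(self, thread_id, delete):
        # Delete every file tied to the thread, e.g. delete=client.files.delete
        for file_id in self._files.pop(thread_id, []):
            delete(file_id)
```

The registry itself is plain Python; pass `registry.files_for(thread_id)` as the message's `file_ids` and call `release` when the conversation ends, which keeps one user's uploads out of everyone else's context.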
Thanks a lot! I was struggling with this as well
@Mosredna has a valid point though.
I am not sure what the right approach should be.
For me it works just fine to add the file IDs into the message in a thread.
Sometimes the AI says it can't access them, but you just tell the AI that it can.
```javascript
const userMessage = await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: message,
  file_ids,
});
```
I do have a job running that deletes the files after some time from my account.
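A minimal sketch of such a cleanup job, with hypothetical `list_files`/`delete_file` callables standing in for `client.files.list` and `client.files.delete`; a file is assumed to expose `.id` and `.created_at` as a Unix timestamp, as the OpenAI file object does:

```python
import time

def purge_old_files(list_files, delete_file, max_age_seconds=24 * 3600, now=None):
    """Delete files older than max_age_seconds; return the deleted IDs.

    `list_files` returns an iterable of file-like objects with `.id` and
    `.created_at`; `delete_file` takes a file ID.
    """
    now = time.time() if now is None else now
    deleted = []
    for f in list_files():
        if now - f.created_at > max_age_seconds:
            delete_file(f.id)
            deleted.append(f.id)
    return deleted
```

Run from cron or a scheduler, this caps storage growth without touching files still in active use within the age window.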
Hi @ShisuiX ,
Did you resolve some of the info not being retrieved from the kb?
Giving your AI Assistant instructions about delimiting info could help. I found that I sometimes needed to give the prompt explicit details to decode a doc, or I had to rebuild my doc in Excel to remove all ambiguity to get 100% results. For instance, the AI can't always interpret a table design correctly even when you explain how to, so simplifying the data layout was required.
Hi desijadoo, I like your approach very much; personally I think it is correct. I understand that if you add a file as an assistant user, it would not be included in the training data of the system.
But, from my own experience, I have problems getting the files to stay in my model after one day. I have created two assistants that have the same contents but differ in which version of the GPT engine they use.
In the one using version 4.0, every day I have to re-feed it with the files; in version 3.5 the files are retained. I have tried to load them inside the playground, from the assistants interface, and they are not retained… any suggestions?
@ruben.iberley That is strange behavior. I have created multiple assistants, and I am able to upload files and associate them with the assistant, and those files are retained within the assistant. I have only used the GPT-4 Turbo (gpt-4-1106-preview) model.
Are you sure you are clicking on “Save” button in the Assistant after uploading the file in the playground?
I made this silly mistake many times in the beginning: I uploaded the file, forgot to hit 'Save', and obviously the assistant wasn't able to analyze the newly uploaded file.
Yes I have; in fact I usually save a couple of times just in case. I think the problem might be that the model was cloned from a previous assistant which had a different engine. But even so, I understand that when cloning an assistant, it should be cloned with the files associated with the previous one. In my case it is not like that.
I tried cloning one of my assistants which had a file associated with it, and the same file was available in the cloned assistant as well.
I don't know what is going on, but this morning the situation we talked about happened again. One theory I have is that this is a security measure to prevent you from accessing someone else's uploaded files.
The specific case: someone I'm working with added the files I want to replicate to another assistant, and I want to mix that knowledge with my own.
Maybe this is the reason why I can't properly clone someone else's assistants. With the ones I made myself, the data is carried over correctly, so I guess it's a security issue(?).
Any updates on the myfiles_browser bug on assistants api?
I've been struggling so hard with this and trying every suggestion I could find. This one finally seems to have worked, lol, maybe.
But it's giving me incorrect output even though the same prompt produces the correct result via the playground, so I'm somewhat skeptical whether it is actually reading the file or only pretending to.