Thank you for sharing, will definitely try this out
Ran into the same issue, and this fix sounded too GPT to be true
So I tried it, and it works!
Have you verified that it actually gets parts of the file into the context, rather than hallucinating a response, when this bug occurs?
Good point @filip.byren. OpenAI needs to fix that, or give an explanation of why this error is showing up.
Have you tried a timer to keep checking the status of the retrieval?
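For what it's worth, a polling loop with a timeout can be sketched like this. `poll_run` is a hypothetical helper, and `retrieve` stands in for any callable returning an object with a `.status` attribute, e.g. a lambda wrapping `client.beta.threads.runs.retrieve`:

```python
import time

# Statuses after which a run will never change again
TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def poll_run(retrieve, timeout=60.0, interval=1.0):
    """Poll until the run reaches a terminal status, or raise TimeoutError.

    `retrieve` is called with no arguments and must return an object
    exposing a `.status` string.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        run = retrieve()
        if run.status in TERMINAL_STATUSES:
            return run
        time.sleep(interval)
    raise TimeoutError(f"run did not reach a terminal status within {timeout:.0f}s")
```

The timeout matters because a run can expire or get cancelled, and a naive `while status != 'completed'` loop would spin forever in that case.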
Yes, it gave a response with data very specific to the documents (one was a PDF with recipes from a local butcher, and another was a JSON with scraped data from a website).
I’m trying to follow your steps but the Assistant still doesn’t retrieve some info from the knowledge base
this is the code of my Assistant:
```python
from time import sleep

from flask import Flask, request, jsonify, render_template, abort
import openai
import os

# Connect to the OpenAI API
openai_api_key = os.environ.get('OPENAI_API_KEY')
if not openai_api_key:
    raise EnvironmentError("OPENAI_API_KEY is not set in environment variables.")
client = openai.OpenAI(api_key=openai_api_key)

# Start the Flask app
app = Flask(__name__)

@app.route('/')
def index():
    return render_template('index.html')

# Custom error handler
@app.errorhandler(500)
def internal_error(error):
    return jsonify({"error": "Internal server error"}), 500

# Function to upload a file and return its OpenAI file ID
def create_file_for_openai(filename):
    try:
        with open(filename, "rb") as f:
            file = client.files.create(file=f, purpose='assistants')
        return file.id
    except Exception as e:
        # abort() only works inside a request context; this runs at
        # startup, so fail fast instead
        raise SystemExit(f"Error creating file '{filename}': {e}")

# Read instructions from the file
try:
    with open('instructions.txt', 'r') as file:
        instructions_text = file.read().strip()
except IOError:
    raise SystemExit("Error: File 'instructions.txt' not found.")

# Create files for ingestion
torontocondo_file_id = create_file_for_openai("torontocondo.json")
restaurant_file_id = create_file_for_openai("restaurant.json")
places_file_id = create_file_for_openai("places.json")
freetime_file_id = create_file_for_openai("freetime.json")
activities_file_id = create_file_for_openai("activities.json")

# Create an Assistant with enhanced instructions and the retrieval tool
try:
    assistant = client.beta.assistants.create(
        instructions=instructions_text,
        tools=[{"type": "code_interpreter"}, {"type": "retrieval"}],
        model="gpt-3.5-turbo-1106",
        file_ids=[torontocondo_file_id, restaurant_file_id, places_file_id,
                  freetime_file_id, activities_file_id])
    print(f"Assistant started with ID: {assistant.id}")
except Exception as e:
    raise SystemExit(f"Error creating assistant: {e}")

# For setting up a new conversation
@app.route('/start', methods=['GET'])
def initiate_conversation():
    try:
        thread = client.beta.threads.create()
        return jsonify({"thread_id": thread.id})
    except Exception as e:
        print(f"Error initiating conversation: {e}")
        abort(500)

@app.route('/chat', methods=['POST'])
def chat():
    try:
        data = request.json
        thread_id = data.get('thread_id')
        user_input = data.get('message', '')
        # Send the user message to the thread
        client.beta.threads.messages.create(thread_id=thread_id, role="user", content=user_input)
        # Trigger a run with the assistant
        run = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant.id)
        # Poll until the run reaches a terminal state; include cancelled and
        # expired, or those runs would loop forever
        while run.status not in ['completed', 'failed', 'cancelled', 'expired']:
            sleep(1)
            run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)
        messages = client.beta.threads.messages.list(thread_id=thread_id, order="desc")
        # Filter messages to get the latest bot response
        bot_responses = [msg for msg in messages.data if msg.role != 'user']
        response = bot_responses[0].content[0].text.value if bot_responses else "No response found."
        return jsonify({"response": response})
    except Exception as e:
        print(f"Error during chat: {e}")
        abort(500)

# Start the server
if __name__ == '__main__':
    app.run(host='0.0.0.0', port=8080, debug=True)
```
Do I need to modify something to improve the performance of the Assistant?
The knowledge bases and the instructions are well structured, because I've been trying to fix this for weeks.
I guess I solved it; associating the file with an assistant is necessary.
In Python:
```python
with open(filename, 'rb') as fp:
    fo = client.files.create(
        # passing a (filename, fileobj) tuple makes the filename available
        file=(filename, fp),
        purpose='assistants',
    )

# associate the file with an assistant
client.beta.assistants.files.create(assistant_id=assistant_id, file_id=fo.id)
```
When you use this in a thread when a user uploads a file, it adds the file to the assistant object itself permanently (visible in the web interface).
Now if multiple people are using the assistant via the API, they can get info about other people's file uploads, and you will hit the maximum file count fast. I don't think that is something you want in most cases.
Yes, the files will be visible through the web interface. But if there's an AI application through which the users interact with the assistant, then the uploaded document is accessible only to that user (unless they somehow know what other documents have been uploaded and what their content is).
The files object is independent of assistants and threads, and that's where the underlying issue is. One can associate files with assistants but not with threads. I think it should have been the other way around, or both should have been possible: associate files with assistants and with threads. If a file is associated only with a thread, it's like a session-based upload, and some kind of garbage collector could clean it up once the thread is terminated. If it's associated with an assistant, it's persistent and available across all threads. What do you think?
Or how else would you suggest we handle this situation from the application side? Multiple people are interacting with the assistant, and each wants to analyze their own uploaded file. If the file doesn't get associated with the assistant, the assistant won't be able to find it, as everyone is reporting above.
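One application-side option, sketched under the assumption that files are attached to individual messages rather than to the assistant: keep a per-thread registry of uploaded file IDs and delete them when the thread ends. `ThreadFileRegistry` is a hypothetical helper; the `delete` callable would be something like `client.files.delete`:

```python
from collections import defaultdict

class ThreadFileRegistry:
    """Track which uploaded file IDs belong to which thread."""

    def __init__(self):
        self._files = defaultdict(list)

    def register(self, thread_id, file_id):
        # Record an uploaded file against its thread
        self._files[thread_id].append(file_id)

    def files_for(self, thread_id):
        # File IDs to pass as file_ids on a message in this thread
        return list(self._files[thread_id])

    def release(self, thread_id, delete):
        # Delete every file tied to the thread, e.g. delete=client.files.delete
        for file_id in self._files.pop(thread_id, []):
            delete(file_id)
```

The registry itself is plain Python; pass `registry.files_for(thread_id)` as the message's `file_ids` and call `release` when the conversation ends, which keeps one user's uploads out of everyone else's context.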
Thanks a lot! I was struggling with this as well
@Mosredna has a valid point though.
I am not sure what the right approach should be.
For me it works just fine to add the file IDs into the message in a thread.
Sometimes the AI says it can't access them, but you just tell the AI that it can.
```javascript
const userMessage = await openai.beta.threads.messages.create(thread.id, {
  role: "user",
  content: message,
  file_ids,
});
```
I do have a job running that deletes the files after some time from my account.
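A minimal sketch of such a cleanup job, with hypothetical `list_files`/`delete_file` callables standing in for `client.files.list` and `client.files.delete`; a file is assumed to expose `.id` and `.created_at` as a Unix timestamp, as the OpenAI file object does:

```python
import time

def purge_old_files(list_files, delete_file, max_age_seconds=24 * 3600, now=None):
    """Delete files older than max_age_seconds; return the deleted IDs.

    `list_files` returns an iterable of file-like objects with `.id` and
    `.created_at`; `delete_file` takes a file ID.
    """
    now = time.time() if now is None else now
    deleted = []
    for f in list_files():
        if now - f.created_at > max_age_seconds:
            delete_file(f.id)
            deleted.append(f.id)
    return deleted
```

Run from cron or a scheduler, this caps storage growth without touching files still in active use within the age window.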
Hi @ShisuiX ,
Did you resolve some of the info not being retrieved from the kb?
Giving your AI Assistant instructions about delimiting info could help. I found that I sometimes needed to give the prompt explicit details to decode a doc, or I had to rebuild my doc in Excel to remove all ambiguity to get 100% results. For instance, the AI can't always interpret a table design correctly even when you explain how to, so simplifying the data layout was required.
Hi desijadoo, I like your approach very much; personally I think it is correct. I understand that if you add a file as an assistant user, it would not be included in the training data of the system.
But, from my own experience, I have problems getting the files to stay in my model after one day. I have created two assistants that have the same contents but differ in which version of the GPT engine they use.
In the one using version 4.0, every day I have to re-feed it with the files; in version 3.5 the files are retained. I have tried to load them inside the playground, from the assistants interface, and they are not retained… any suggestions?
@ruben.iberley That is strange behavior. I have created multiple assistants, and I am able to upload files and associate them with the assistant, and those files are retained within the assistant. I have only used the GPT-4 Turbo (gpt-4-1106-preview) model.
Are you sure you are clicking on “Save” button in the Assistant after uploading the file in the playground?
I made this silly mistake many times in the beginning: I uploaded the file, forgot to hit 'Save', and obviously the assistant wasn't able to analyze the newly uploaded file.
Yes I have; in fact I usually save a couple of times just in case. I think the problem might be that the model was cloned from a previous assistant which had a different engine. But even so, I understand that when cloning an assistant, it should be cloned with the files associated with the previous one. In my case it is not like that.
I tried cloning one of my assistants which had a file associated with it, and the same file was available in the cloned assistant as well.
I don't know what is going on, but this morning the situation we talked about happened again. One theory I have is that this is a security measure to prevent you from accessing someone else's uploaded files.
The specific case: someone I'm working with added the files I want to replicate to another assistant, and I want to mix that knowledge with my own.
Maybe this is the reason why I can't properly clone someone else's assistants. With the ones I made myself, the data is carried over correctly, so I guess it's a security issue(?).
Any updates on the myfiles_browser bug on assistants api?
I've been struggling so hard with this and trying every suggestion I could find. This one finally seems to have worked, lol, maybe.
But it's giving me incorrect output even though the same prompt produces the correct result via the playground, so I'm somewhat skeptical whether it is actually reading the file or only pretending to.