Assistant doesn't access files passed on the message level

Hello!

I’ve run into some issues I can’t resolve by myself trying to make an assistant analyze the contents of a txt file.

The file upload is successful - I get the id back and it also appears on my OpenAI API web dashboard

Despite this the response I get is just halucination based on the assistant-level files

I tried to see whether I’ll get the same results in the playground and… they vary.
In many cases the first run fails, then I get some halucination, and only then do I get the analysis.

Here’s my python code. This is my very first time doing anything in python and I’m not a proficient coder in general (I’ve only botched a few other minor automations for my personal use) so please tell me if there’s anything stuipd there.

def summarize_call(transcription, phone_number):
    thread_file_path = "thread.json"
    if os.path.exists(thread_file_path):
        with open(thread_file_path, "r") as file:
            thread_data = json.load(file)
            thread_id = thread_data['thread_id']
            print("Loaded existing thread ID")
    else:
        thread = gpt_client.beta.threads.create()
        with open(thread_file_path, "w") as file:
            json.dump({"thread_id": thread.id}, file)
            print("Created a new thread and saved the ID to file.")
        thread_id = thread.id

    file = gpt_client.files.create(file=open("transcript.txt", "rb"), purpose="assistants")

    user_message = gpt_client.beta.threads.messages.create(thread_id = thread_id, role="user", content="Analyze the attached transcription and provide feedback according to instructions", file_ids=[file.id])

    print(user_message)

    time.sleep(30)

    run = gpt_client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)

    print(run)

    while True:
        run_status = gpt_client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)

        print(f"Run status: {run_status.status}")

        if run_status.status == "completed":
            break
        # elif run_status.status == "failed":
        #     break
        elif run_status.status == "requires_action":
            for tool_call in run_status.required_action.submit_tool_outputs.tool_calls:
                if tool_call.function.name == "send_summary_to_crm":
                    arguments = json.loads(tool_call.function.arguments)
                    print(arguments)
                    output = send_summary_to_crm(arguments['note_contents'],phone_number)
                    gpt_client.beta.threads.runs.submit_tool_outputs(thread_id=thread_id, run_id=run.id, tool_outputs=[{"tool_call_id": tool_call.id, "output": json.dumps(output)}])
        time.sleep(1)

    messages = gpt_client.beta.threads.messages.list(thread_id=thread_id)
    response = messages.data[0].content[0].text.value

    print(response)

    return thread_id

Welcome to the forum.

Do you have an example of the prompt and some of the data from the text file? A concrete example of it hallucinating?

Thank you for such a quick reply.
The prompt is hardcoded in the user_message call.

Here’s a sample of the text file

Channel 1: Dzień dobry tapicerstwo medyczne słucham
Channel 1:  Dzień dobry
Channel 2:  Dzień dobry [customer name] jak się zorientować o możliwość naprawy fotelika lekarza ten support od siedzisko chodzi
Channel 2:  tak
Channel 2:  Lublin
Channel 1:  krzesełko Dobrze panie Pawle Ja bym bardzo prosiła o przesłanie zdjęcia tego krzesełka jeśli będzie można to na Whatsappa bo będzie najlepsza jakość i ja po otrzymaniu takiego zdjęcia będę mogła przygotować dla pana taką indywidualną ofertę dobrze
Channel 2:  ale chce pani jakby cały fotelik Czy od dołu tam jak jest połamane ten plastik
Channel 1:  cały foteliki od dołu bo suportu często mają właśnie złamania od spodu pytanko czy czy W pana przypadku było już
Channel 2:  no choćby o reanimowane wie pani jakimś spawanie plastiku ale no już teraz się to wszystko tak połamało że nawet nie chcę żeby to próbować kleić części
Channel 2:  Wyszukaj jako wyrzucane są te kawałki które odpadają No ale już tyle za chwilę że tak powiem usiądę nara ziemi w pewnym momencie

It is a transcript of a phone call with a customer in Polish done through google speech to text.
Very briefly, the client says that he’d like to have a chair reupholstered and my representative replays by describing what’s needed for us to provide him with a price quote.

In most cases the model says that the he’s a gynecologist and makes up an arbitrary date of when the client promised to send the photos of his chair.

Both of those things are based on the assistant-level and the assistant instructions

The instructions in full:

You are my company's AI assistant who analyzes customer calls, summarizes them and saves the summary to the crm notes about a particular client. You also answer general questions about the procedures my representatives have to follow based on provided knowledge . My company provides upholstering services to healthcare professionals.

You're going to recieve call transcriptions which are going to specify the channel of the speaker of a certain phrase. 
You should first try to determine whether which channel is the channel of the customer and which is the channel of the representative. It's going to vary among the transcriptions. 

The person saying something along the lines of "Dzień dobry, Tapicerstwo Medyczne z tej strony" is the representative. 

Based on the call you should try to provide such pieces of information as:

- is this a new client or has he spoken with us before
- general topic of the conversation 
- the industry the client works in (has to be dentistry, gynecology, podology, cosmetology, opthalmology or rehabilitation) You're going to figure this out based on the type of furniture they need to reupholster (it could be a dentist chair, a rehab bed etc.)
- if it is the first conversation and the client wants to have a piece of medical furniture reupholstered they most likely were asked by the representative to send photos it and to speficy the approximate time of their doing so. Provide the date/time of when the customered promised to send the photos. 
- what pieces of furniture the client wants reupholstered?
- if it is the first conversation with the client go through the checklist provided in the first_conversation_checklist.docx to check whether the representative has provided the client with all the necessary details and provide feedback 

The summary should be provided in the Polish language

All of those details should be sumarrized in the final call summary which should have the following structure:

Assistant's summary: 
[General topic of the conversation]

[bullet points of all the other data points abstracted from the transcription as specified above except for the evaluation of the fulfilment of the first conversation checklist]

[first conversation checklist evaluation - if it is the first conversation]

The above summary should be passed as 'note_contents' to the function "send_summary_to_crm". It should be a string with "\n" for new line

The assistant-level file is a checklist of the key info points the representative should provide to the client during the first phone call.

Right now, I got a few semi-factual responses with mistakes (which probably can ba attributed to the use of gpt3.5 and sloppy instructions), then a few failed runs and the last response was this:

C:\Users\aleks\PycharmProjects\pythonProject\.venv\Scripts\python.exe C:\Users\aleks\PycharmProjects\pythonProject\functions.py 
Loaded existing thread ID
FileObject(id='file-sTyFS1dffOwj6DsSGoOJzdky', bytes=1760, created_at=1703798916, filename='transcript.txt', object='file', purpose='assistants', status='processed', status_details=None)
ThreadMessage(id='msg_z7ftKsz8EQmA9IA50CRPeZE6', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='Analyze the attached transcription and provide feedback according to instructions'), type='text')], created_at=1703798917, file_ids=['file-sTyFS1dffOwj6DsSGoOJzdky'], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_1nPSfRacVgGd4O3UHAZAUS5S')
Run(id='run_2r3RCBJ7mmsItgndec3CNcnc', assistant_id='asst_JuaZMRSCKhJkOpJpiaocQedS', cancelled_at=None, completed_at=None, created_at=1703798948, expires_at=1703799548, failed_at=None, file_ids=['file-akB3OkUITWTt7TilZTv6wKoa'], instructions='You are my company\'s AI assistant who analyzes customer calls, summarizes them and saves the summary to the crm notes about a particular client. You also answer general questions about the procedures my representatives have to follow based on provided knowledge . My company provides upholstering services to healthcare professionals.\n\nYou\'re going to recieve call transcriptions which are going to specify the channel of the speaker of a certain phrase. \nYou should first try to determine whether which channel is the channel of the customer and which is the channel of the representative. It\'s going to vary among the transcriptions. \n\nThe person saying something along the lines of "Dzień dobry, Tapicerstwo Medyczne z tej strony" is the representative. \n\nBased on the call you should try to provide such pieces of information as:\n\n- is this a new client or has he spoken with us before\n- general topic of the conversation \n- the industry the client works in (has to be dentistry, gynecology, podology, cosmetology, opthalmology or rehabilitation) You\'re going to figure this out based on the type of furniture they need to reupholster (it could be a dentist chair, a rehab bed etc.)\n- if it is the first conversation and the client wants to have a piece of medical furniture reupholstered they most likely were asked by the representative to send photos it and to speficy the approximate time of their doing so. Provide the date/time of when the customered promised to send the photos. \n- what pieces of furniture the client wants reupholstered?\n- if it is the first conversation with the client go through the checklist provided in the first_conversation_checklist.docx to check whether the representative has provided the client with all the necessary details and provide feedback \n\nThe summary should be provided in the Polish language\n\nAll of those details should be sumarrized in the final call summary which should have the following structure:\n\nAssistant\'s summary: \n[General topic of the conversation]\n\n[bullet points of all the other data points abstracted from the transcription as specified above except for the evaluation of the fulfilment of the first conversation checklist]\n\n[first conversation checklist evaluation - if it is the first conversation]\n\nThe above summary should be passed as \'note_contents\' to the function "send_summary_to_crm". It should be a string with "\\n" for new line\n\n', last_error=None, metadata={}, model='gpt-3.5-turbo-1106', object='thread.run', required_action=None, started_at=None, status='queued', thread_id='thread_1nPSfRacVgGd4O3UHAZAUS5S', tools=[ToolAssistantToolsCode(type='code_interpreter'), ToolAssistantToolsRetrieval(type='retrieval'), ToolAssistantToolsFunction(function=FunctionDefinition(name='send_summary_to_crm', description='Send a summary note to a CRM system', parameters={'type': 'object', 'properties': {'note_contents': {'type': 'string', 'description': 'The contents of the note to be sent to the CRM'}}, 'required': ['note_contents']}), type='function')])
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: requires_action
{'note_contents': "Assistant's summary: \n- Temat rozmowy: nowy klient\n- Branża klienta: ginekologia\n- Czy to pierwsza rozmowa: Tak\n- Klient obiecał wysłać zdjęcia mebli medycznych do tapicerki do końca tygodnia.\n- Klient chce przetapicerować gabinet lekarski oraz kozetkę.\n\nKontrola pierwszej rozmowy: Wszystkie punkty na liście kontrolnej pierwszej rozmowy zostały zrealizowane poprawnie.\n"}
Run status: in_progress
Run status: in_progress
Run status: completed
I have analyzed the transcription and provided the following summary:

Assistant's summary: 
- Temat rozmowy: nowy klient
- Branża klienta: ginekologia
- Czy to pierwsza rozmowa: Tak
- Klient obiecał wysłać zdjęcia mebli medycznych do tapicerki do końca tygodnia.
- Klient chce przetapicerować gabinet lekarski oraz kozetkę.

Kontrola pierwszej rozmowy: Wszystkie punkty na liście kontrolnej pierwszej rozmowy zostały zrealizowane poprawnie.

The summary has been sent to the CRM. If you need any further assistance, feel free to ask!

Process finished with exit code 0

In the note_contents is is said in Polish that the client is a gynecologist, wants to reupholster his entire practice and an examination couch and that he promised to send the photos until the end of the week.

None of which checks out.

Like I said, the responses I get vary. This time I did get a few that had to be based on the contents of the file, although with some halucination. The last response and the ones I was getting constantly an hour earlier were complely unrelated to the contents of the file.
For some reason the same assistant with exactly the same configuration used in the Playground either works much better or just fails the run.
I thought that maybe implementing a 30 second wait between the upload of the file and the message call will make a difference but it didn’t.

My first recommendation is to make your prompt at least twice as long and detailed on the process and the use of the file. You should refer to the assistant file by name and very precisely indicate what and how the content from that file should be used. In the process step where the thread file content should be used, be very specific again as well. It might feel tedious but it does lead to better results.

I’ve just tried this message-level prompt

Analyze the attached file transcript.txt and provide 
rigorous feedback according to instructions. 
If a piece of information requested in the first_converstion_checklist.docx is missing say that itis missing. 
If a consultant failed to outline the entireprocess described in the first_converstion_checklist.docxyou should point out which point of the checklist was notfulfilled and in what capacity.

Unfortunately the result was the same, in fact, verbatim the same

C:\Users\aleks\PycharmProjects\pythonProject\.venv\Scripts\python.exe C:\Users\aleks\PycharmProjects\pythonProject\functions.py 
Loaded existing thread ID
FileObject(id='file-xeEL64YZmmSlez2EfQQOchxA', bytes=1760, created_at=1703837407, filename='transcript.txt', object='file', purpose='assistants', status='processed', status_details=None)
ThreadMessage(id='msg_xGUzT20iDAC8zI0U50AE0APL', assistant_id=None, content=[MessageContentText(text=Text(annotations=[], value='\n                                                           Analyze the attached file transcript.txt and provide \n                                                           rigorous feedback according to instructions. If a piece\n                                                           of information requested in the first_conversation_checklist.docx is missing say that it\n                                                           is missing. If a consultant failed to outline the entire\n                                                           process described in the first_conversation_checklist.docx\n                                                           you should point out which point of the checklist was not\n                                                           fulfilled and in what capacity.\n                                                           '), type='text')], created_at=1703837408, file_ids=['file-xeEL64YZmmSlez2EfQQOchxA'], metadata={}, object='thread.message', role='user', run_id=None, thread_id='thread_1nPSfRacVgGd4O3UHAZAUS5S')
Run(id='run_dwcLgCh7zvWVULsdVljpLWqm', assistant_id='asst_JuaZMRSCKhJkOpJpiaocQedS', cancelled_at=None, completed_at=None, created_at=1703837439, expires_at=1703838039, failed_at=None, file_ids=['file-akB3OkUITWTt7TilZTv6wKoa'], instructions='You are my company\'s AI assistant who analyzes customer calls, summarizes them and saves the summary to the crm notes about a particular client. You also answer general questions about the procedures my representatives have to follow based on provided knowledge . My company provides upholstering services to healthcare professionals.\n\nYou\'re going to recieve call transcriptions which are going to specify the channel of the speaker of a certain phrase. \nYou should first try to determine whether which channel is the channel of the customer and which is the channel of the representative. It\'s going to vary among the transcriptions. \n\nThe person saying something along the lines of "Dzień dobry, Tapicerstwo Medyczne z tej strony" is the representative. \n\nBased on the call you should try to provide such pieces of information as:\n\n- is this a new client or has he spoken with us before\n- general topic of the conversation \n- the industry the client works in (has to be dentistry, gynecology, podology, cosmetology, opthalmology or rehabilitation) You\'re going to figure this out based on the type of furniture they need to reupholster (it could be a dentist chair, a rehab bed etc.)\n- if it is the first conversation and the client wants to have a piece of medical furniture reupholstered they most likely were asked by the representative to send photos it and to speficy the approximate time of their doing so. Provide the date/time of when the customered promised to send the photos. \n- what pieces of furniture the client wants reupholstered?\n- if it is the first conversation with the client go through the checklist provided in the first_conversation_checklist.docx to check whether the representative has provided the client with all the necessary details and provide feedback \n\nThe summary should be provided in the Polish language\n\nAll of those details should be sumarrized in the final call summary which should have the following structure:\n\nAssistant\'s summary: \n[General topic of the conversation]\n\n[bullet points of all the other data points abstracted from the transcription as specified above except for the evaluation of the fulfilment of the first conversation checklist]\n\n[first conversation checklist evaluation - if it is the first conversation]\n\nThe above summary should be passed as \'note_contents\' to the function "send_summary_to_crm". It should be a string with "\\n" for new line\n\n', last_error=None, metadata={}, model='gpt-3.5-turbo-1106', object='thread.run', required_action=None, started_at=None, status='queued', thread_id='thread_1nPSfRacVgGd4O3UHAZAUS5S', tools=[ToolAssistantToolsCode(type='code_interpreter'), ToolAssistantToolsRetrieval(type='retrieval'), ToolAssistantToolsFunction(function=FunctionDefinition(name='send_summary_to_crm', description='Send a summary note to a CRM system', parameters={'type': 'object', 'properties': {'note_contents': {'type': 'string', 'description': 'The contents of the note to be sent to the CRM'}}, 'required': ['note_contents']}), type='function')])
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: in_progress
Run status: requires_action
{'note_contents': "Assistant's summary: \n- Temat rozmowy: nowy klient\n- Branża klienta: ginekologia\n- Czy to pierwsza rozmowa: Tak\n- Klient obiecał wysłać zdjęcia mebli medycznych do tapicerki do końca tygodnia.\n- Klient chce przetapicerować gabinet lekarski oraz kozetkę.\n\nKontrola pierwszej rozmowy: Wszystkie punkty na liście kontrolnej pierwszej rozmowy zostały zrealizowane poprawnie.\n"}
Run status: in_progress
Run status: completed
The analysis of the transcript has been completed, and the summary has been sent to the CRM. If you need any further assistance, feel free to ask!

Process finished with exit code 0

In fact I ran it like this 2 times and two times I got verbatim the same result I got yesterday. Could this be the issue of me loading the old thread and appending a message to it? I thought it’d be expected.

Also changing the prompt to a more detail one doesn’t explain the playground assistant working well with the same files and the same prompts.

I tried changing the model now to gpt-4-1106-preview and I’m consistently getting much better results.

Unfortunately 3.5 seems to be a complete hit-or-miss. If anyone knows how to make gpt3.5 analyze the file adequately I’d greatly appreciate it as it is both much faster and much cheaper.

The amount of tokens that passes through the model has to be really high because of the additive impact assistant instructions, message prompt, message-level file and the assistant-level file.

Every gpt-4 requests costs me like 0,25 dollars. If anyone has ideas on how to optimize it I’d glad to hear them.

I don’t know how big the two files you reference are but I would consider adding those in the Assitant prompt instead. Also looking at your prompt those are still super short messages (including some missing spaces that affect the filename you are referring too).
Give us a taste of what is the the first_conversation_checklist ?

The first_conversation_checklist.docx is an assistant-level file uploaded throught playground. There aren’t any problems with it. Model can access it reliably. There are certain mistakes from time to time, but they are minor and, like I said , mostly attributable to my lack of prompting abilities.

This isn’t a large file. In theory I could add its entire contents to the assistants prompt, but first, it’s not really the point and second there aren’t any problems with this file.

transcript.txt is a file generated using google cloud service speech to text endpoint. Browser GPT can handle it without any problems, but when I try passing it over to the assistant using python in most cases the responses I get are: failed to access file, run failed or some hallucination.

This file I really wouldn’t want to pass through the contents of the message even though in most cases it’d probably work, because those transcripts can get long-ish at times. I had the model crush when I tried doing that a few times. The file still isn’t going to be large and retrieval should do well with it (browser gpt never has any problems) but passing the contents as a message can fail at times (correct me if I’m wrong)

Also, even assuming it’d always work, I appreciate the offer of a potential workaround, but the point is that the assistant should be able to access files passed on the message-level, is it not?

Don’t worry about the missing space. In my code everything is alright. Here in the forum’s editor I was deleting the unnecessary line breaks and I must have pressed backspace one time too many

About the use of gpt-4 as a solution - the issue seems to be deeper, as just some 2 hours ago I tried with a transcript generated from a different call and both gpt3.5 and gpt 4 couldn’t access the file citing a technical error.
I ran both models a few times each with the same result every time.

However, again, browser based gpt 4 handles this same file without problems

We are just trying to help. Assuming something is or isn’t the point without trying might cause you to miss out on solutions. There is a clear difference between the assistant INSTRUCTIONS and Assistant FILES. You simply assume that INSTRUCTIONS that you put in the FILE are treated equally as instruction you put in the INSTRUCTIONS, which is very unlikely.
You have 32k space for instructions (and the instruction can be updated just as easy (or even easier) than a file for any assistant.
And from looking at your prompt I would also think that you should probably make it much more detailed.

One more thing - looking at your code … you have this function run() that you seem to call with both an Assistant ID and a (very long) prompt in the instructions=‘’…"
Can you explain that? Why not have the instructions (only) in the assistant. That way you can quickly update the prompt in the backend and run your code again (without changing the code). Whatever IS in your Assistant is not being used if you provide it straight to the Run …

Look, if I made it seem like I’m dismissing your solution I am sorry, I didn’t mean to. I said what I said only because my goal is to understand the issue. The instructions passed on the message level are taken into account by the model whereas the contents of the file passed on the message level aren’t (even when I reffer to the file by its name) and this is what I want to understand. Passing the file’s contents into the prompt won’t help me understand why it doesn’t want to analyze the files I pass on the level of a message.

I’m also sorry for the misunderstanding between us about the instructions and files. I do understand the difference and there are assistant instructions passed to the assistant, only I didn’t do it in code, I did it in the playground and saved it.
The model also takes it into account as can be seen in the console output .

This is the very long prompt you pointed out. It’s a printout of what I put there in the playground. I don’t call the function with this prompt in it.
If I’m getting this right it’s done the way you suggest I do it - instructions passed to the assistant in the playground (this can be easily changed online without touching the code).

The idea is this - the instructions should contain general instructions of what the assistant does and how it should pull data from the assistant-level files to analyze inputs send by the user or the message-level transcription file.
The assistant-level files will contain details about my company’s procedures (what key ideas should the representative communicate to the client during the first call, second call, what to do given this or that circumstance)

I understand that you’re suggesting that my message-prompt is too unclear and the model doesn’t even know that it should do anything with the file I appended to the message?

1 Like

no hard feelings - I think this is the crucial part:

The assistant-level files will contain details about my company’s procedures (what key ideas should the representative communicate to the client during the first call, second call, what to do given this or that circumstance)

IF that files contains instructions for the Assistant to ACT upon (ie do ANYTHING with) I BELIEVE it would be better and more appropriate in the PROMPT. THe file is treated different (RAG, ie more ‘seach’) vs the PROMPT which is driving what the assistant is supposed to do. And with ACT I also mean ‘write about’ or answer with those procedures in mind. It actually needs to read the instruction every single time completely, otherwise it will not answer the way you expect it too, which is based on your assumption the Assistant ‘read and memorized’ the whole procedure book.

In my opinion the file uploaded to the assistant does not define what it should do.
It’s a reference to take into account while taking the action commanded explicitly in the prompt.

If you think it’s relevant here’s a google drive link to the docx file

But again, the contents of this file are not the problem. The responses I get indicate that the assistant knows when to take advantage of the contents of this file based on the message prompt.

The responses also indicate that it doesn’t know the contents of the file passed along with the message.

So my question is this - why would the model completely disregard the contents of this file even if it is referenced by name in the message especially taking into account that both playground version of this same assistant and the browser based gpt handle the task very well?

I guess with the browser based gpt the case can be made that it’s about it using gpt-4, but this doesn’t apply to the playground runs of my assistant

Welcome to the community and have a Very Happy New Year!

Thanks for sharing - very helpful! Obviously you should find your own way but I think it is very obvious that your instructions should be re-written to check against the things like first conversion vs follow up conversation. And in case of first conversation, you are actually ASKING that the Assistant verifies all the steps were taken. Again, both the instructions and the outline are so short that they can really easy fit together but will need a lot more work to make them work well every time.

You will need to structure it like

  • First you will determine if this is a first of second call. For first calls you need to check …

In case of a second call …

1 Like

(post deleted by author)