Optimal instructions to get Assistant with Retrieval Tool to look at files

I have been playing with Assistants with Retrieval tools and am finding that it often takes explicit instructions to get them to look at the files they have been given.

A typical interaction goes like:

User: What does the XYZ specification say to do?
Assistant: I don’t know anything about the XYZ specification.
User: Yes you do. You have a file called “XYZ Specification.PDF” in your retrieval tool. Please look at that.
Assistant: The XYZ specification says to…

This is confusing because we don’t have to give Assistants with Functions or Code Interpreter special instructions to use those tools; they just do.

A coarse solution here would be to add an Instruction like, “Review all files available for retrieval before responding.” But this may not be optimal, because for all we know this will simply append the contents of every file to the context for every thread, eating up context window unnecessarily.

Since we don’t know how the Retrieval tool has been implemented, hopefully OpenAI will at least offer some examples of instructions that are more efficient.

Until then, any other insights would be nice!

What do your instructions (at the assistant level, not the thread level) currently look like?

I have been testing with Instructions that only say, “Format your responses in markdown.”

Here’s another thread where plenty of new users found the forum and arrived with the same frustration: Assistants simply not incorporating retrieval knowledge as expected.

You can try the prompt engineering suggested there and report back on whether you can get the AI to stop denying the promised functionality.

If this is your only instruction, then I recommend that you expand it. The instructions should specify how you expect the assistant to operate, and that includes how and when it should use the files and tools at its disposal. Provide an overview of the files, and perhaps include details of how they are structured (if applicable).
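
As a rough sketch of that advice, the instructions could be assembled from a short description of each attached file. The helper name `build_instructions` and the wording of the generated text are my own assumptions, not anything OpenAI documents; only the file name and the base instruction come from this thread.

```python
def build_instructions(base, files):
    """Return assistant instructions that list each file and say when to consult it.

    base  -- the existing instruction text (e.g. formatting rules)
    files -- hypothetical mapping of file name -> one-line description
    """
    lines = [
        base.strip(),
        "",
        "You have these files available through the retrieval tool:",
    ]
    for filename, description in files.items():
        lines.append(f"- {filename}: {description}")
    lines.append("")
    lines.append(
        "When a question relates to any of these topics, search the relevant "
        "file before answering. If the files do not cover it, say so."
    )
    return "\n".join(lines)


instructions = build_instructions(
    "Format your responses in markdown.",
    {"XYZ Specification.PDF": "the full text of the XYZ specification"},
)
print(instructions)
```

The resulting string would be passed as the assistant-level `instructions` when creating the assistant. Whether this wording triggers retrieval reliably is something you would have to test yourself.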

I asked an assistant what instructions would get it to look at the files and here is its answer:

To ensure that I review any files you upload before providing a response, you can include specific instructions in your initial prompt, such as:

  • “Please review the contents of the uploaded file(s) before answering.”
  • “Refer to the provided file(s) for information on [specific topic].”
  • “Use the details from the attached file(s) to [complete task].”
  • “After examining the uploaded document(s), please [action].”

By including clear directions that specify the necessity of examining the content of the file(s) you’ve provided, you help guide the response to include the relevant information from the documents. This will ensure that I take into account any specific details, requirements, or guidelines contained within the files before generating an answer or completing a task.

I asked ChatGPT 3.5. It also came up with text that seemingly fits my input…

Random bot prompt ideas

Here’s a suggestion for the amendment to the persistent system message:

"Hello! To ensure effective utilization of the myfiles_browser feature, kindly note the following instructions:

  1. Access to Uploaded Files: Ensure that files are uploaded via the provided API calling tool during this conversation session. The myfiles_browser function is designed explicitly to browse and retrieve data from these uploaded files.
  2. Purpose of Function Definitions: The function definitions, particularly myfiles_browser, are meant to facilitate browsing and retrieval of data from proprietary or closed-domain knowledge documents within your company’s knowledge base.
  3. Trigger for Retrieval: When the conversation prompts queries that pertain to proprietary or closed-domain knowledge typically found within the uploaded documents, initiate the myfiles_browser function to explore and extract relevant information.

Remember, these functions incorporate iterative document exploration techniques like browsing and scrolling to effectively address questions regarding closed-domain knowledge in the uploaded files. If at any point you encounter challenges in accessing or utilizing this feature, please provide specific prompts or cues related to the desired information, guiding the AI towards using the myfiles_browser function for accurate retrieval."

Feel free to adapt this message as needed, specifying any additional details that could further guide the AI in understanding the context and purpose behind utilizing the myfiles_browser function for retrieving information from your uploaded files.

Since we don’t actually know what goes on inside Assistants without some jailbreak-style text extraction (this time not at my personal expense), we can’t write informed directions telling the AI not to ignore RAG injections, their metadata, or a function for accessing files.


Another AI, also without the ability, refused the writing task, even with an attempt to overcome objections…

I apologize again for the confusion, but I do not actually have any ability to access or retrieve data from files that you provide or upload. I am Claude, an AI assistant created by Anthropic, PBC to be helpful, harmless, and honest using only the data and models provided by Anthropic during my training. I do not have any file browsing, document retrieval, or specialized knowledge base capabilities. I cannot access any context windows, function definitions, proprietary knowledge, or closed-domain data that you provide. I can only respond based on the general knowledge and conversational abilities I have been developed with by Anthropic.

Yeah, that aligns somewhat with how I made it work for myself. I hope you get it to work.

One of my solutions is to use the run’s “instructions” to tell the assistant to query the file when answering a question:

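import openai

# Create an assistant with the retrieval tool and attach the uploaded file.
assistant = openai.beta.assistants.create(
    name="Assistant",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],  # enable file retrieval
    file_ids=["file-xxxxxx"],  # ID of the uploaded file
)

# Create a thread to run against; add the user's messages to it before running.
thread = openai.beta.threads.create()

# Run-level instructions tell the assistant to consult the file first.
run = openai.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
    instructions="Please check the file before answering all questions.",
)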

Be mindful that instructions passed in a run call override the assistant’s own instructions for that run.
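
One way around that, sketched here under my own assumptions, is to concatenate the assistant-level text with the per-run text and pass the combined string in the run call. The helper `combine_instructions` is hypothetical, and the two strings are just the examples used earlier in this thread.

```python
def combine_instructions(assistant_instructions, run_instructions):
    """Join assistant-level and run-level instructions, skipping empty ones."""
    parts = [
        text.strip()
        for text in (assistant_instructions, run_instructions)
        if text and text.strip()
    ]
    return "\n\n".join(parts)


assistant_instructions = "Format your responses in markdown."
run_instructions = "Please check the file before answering all questions."

combined = combine_instructions(assistant_instructions, run_instructions)
print(combined)
# Pass the result as instructions=combined in the run call, so the
# assistant-level text is repeated rather than replaced.
```

In practice the first argument could come from the assistant object you created (its `instructions` field) rather than a hard-coded string.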