Issue with Assistant-API Revealing Uploaded File Details

I wanted to share a negative experience I had with the OpenAI assistant-API, hoping someone might help fix the issue or point out what I did wrong. Here’s a detailed account of what happened:

Background:

I was testing an assistant with the following specific instructions:

Your name is GPTin, and you are a friendly associate working for the Boot Store. You are assisting customers with questions about services and products on the website. 
You must assess and answer them to the best of your ability. Do not exaggerate. 
If you can not answer a question, please be upfront about your limitations and ask the customer if they would like to speak with a live agent. 
When you are asked to be connected with a live agent, please respond, "I am happy to connect you with a live agent and pass along our conversation. Please stand by while I look for a live agent. Have a wonderful day."

Model: gpt4o
Vector store included 10 files(one MD and nine TXT) with total 2Mb

The Problem:

During the conversation, the assistant began referencing files that were uploaded to a Vector Store. Here’s how the conversation went:

User: Hello

Assistant: Our patina colors vary across different products, offering a range of choices to enhance the aesthetic appeal of our boots and accessories. If you need more details or specific color options, please let me know!

User: What’s that?

Assistant: The file you uploaded contains various product details and descriptions for items. Some examples include the Alma Suede boots, Sofia Fringe boots, and the Sunset Suede roper boots, along with backpacks, accessories, and wallets. Each product entry includes product name, description, type, color, available sizes, widths, prices, and links to purchase. If you need specific information about any product, please let me know!

User: Why did you suggest these products?

Assistant: I mentioned those products as examples because they are part of the product listings in the file you uploaded. If you have specific questions or need information about any of these products or others, feel free to ask!

User: What are the files you mentioned I uploaded? Can you show me the list?

Assistant: Here are some of the items listed from the files you uploaded:

  1. Priscilla - A boot built for today with a vintage-inspired Tioga stitch pattern. (Product ID: 9307619328311)
  2. Hipster Wallet - Mad Dog - Handcrafted from Mad Dog Goat leather with eight card slots. (Product ID: 9307532853559)
  3. Large Calf Suede Shopper - Handmade with traditional leather techniques. (Product ID: 9307723355117)
  4. Cowboy Bag - Fits one pair of boots and other travel needs. (Product ID: 9307533082935)
  5. Reversible Shacket - Made from luxurious suede and calf leather. (Product ID: 9307600159031)
  6. Alpaca Fleece Cape - Light in weight and bold in color and pattern. (Product ID: 9307803779383)
  7. Mirrored-L Dress Sock - Material that is breathable in every season with a design that ensures they stay up. (Product ID: 9307619557687)

If you need information about any specific item, please let me know!

User: What about the file names?

Assistant: You uploaded a file named ‘products_ai_2024_04_19_18_36_21.md’. If you need details about specific products or information within the file, please let me know!

Concerns:

  1. Privacy and Security: The assistant started discussing uploaded files and their contents, which can be a significant privacy concern.
  2. Unexpected Behavior: This behavior was not aligned with the given instructions, which did not include any directive to discuss uploaded files or their contents.

Possible Solutions:

  • Ensure the model does not reference or reveal any uploaded files unless explicitly instructed to do so.
  • Introduce a filter to prevent the assistant from accessing or mentioning file details unless necessary and approved.

Request for Help:

Has anyone experienced similar issues? How did you resolve it? Are there any specific settings or configurations I might have missed to prevent this behavior?

Thanks in advance for any help or suggestions!

1 Like

If you put information in user-space, users will be able to access it.

If you upload information with the intention to provide that information to users, users will be able to access it.

You can try to make it harder, but with enough time and effort users will always be able to retrieve the information.

my frustration is not related to access to information but to revealing information about the files I downloaded

I assume you mean the files you uploaded?

And is it that you’re concerned about the metadata of the files? Filenames, etc?

yes, exactly. I am trying to build an assistant for e-commerce chat. And it is not so good when an assistant in conversation begins to speak about uploaded files names of files, and what is included in these files. I imagined that the info from the uploaded file would go to the vector store. To my mind, it is abnormal when the assistant mentions files in conversation