Questions about File Search on assistants

I implemented assistants through the Assistants API, but so far we are having trouble getting them to read uploaded txt files. The use case is car dealerships.

  1. Sometimes it responds that it could not find the vehicle in question.
  2. Sometimes it responds with wrong information, and if you tell it to check again, it finally finds the right information.
  3. Sometimes it pulls information that is no longer in the files (likely remembered from previously deleted files).

Apart from that, sometimes it does not call the functions we added and instead responds on its own, for example when asked to make an appointment.

Basically, the on-create process is as follows:

  1. Files are uploaded.
  2. A vector store is created and the files are associated with it.
  3. The assistant is created and the vector store and functions are associated with it.

The on-edit process is as follows:

  1. New files are uploaded (if there are any).
  2. The existing vector store is updated, using a batch, with the new files plus the existing ones that are not going to be deleted. Previously I was deleting the existing vector store and creating a new one, but that was worse. If there are no new files and nothing to delete, the vector store is left untouched.
  3. The assistant is updated, along with its vector store.
  4. Files that are no longer needed are deleted.
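The on-edit steps above boil down to comparing two sets of file IDs. The following is a minimal sketch of that comparison, assuming made-up file IDs and a hypothetical helper named `plan_vector_store_update` (it is not part of any SDK). It only computes the plan and makes no API calls.

```python
def plan_vector_store_update(current_ids, desired_ids):
    """Compare the files currently attached with the files that should be attached.

    Both arguments are iterables of file ID strings (hypothetical values here).
    Returns which IDs to attach, which to detach, and which to leave alone.
    """
    current = set(current_ids)
    desired = set(desired_ids)
    return {
        "attach": sorted(desired - current),  # new files to add to the vector store
        "detach": sorted(current - desired),  # files to remove, then delete
        "keep": sorted(current & desired),    # files that stay as they are
    }


# Hypothetical example: file-b stays, file-a is dropped, file-c is new.
plan = plan_vector_store_update(["file-a", "file-b"], ["file-b", "file-c"])
```

Detaching the IDs listed under `detach` from the vector store before deleting the files themselves might help rule out stale files as the cause of issue 3, though that is an assumption and not a confirmed fix.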

When asked about its instructions it responds pretty well, but when asked about any information in the file it is only about 50% accurate.

The txt file is a simple single-column text file listing our inventory, with one empty line between records.

Stock:
Year:
Make:
Model:
etc

Stock:
Year:
Make:
Model:
etc
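For reference, a file laid out like this can be read back into one record per vehicle. This is a minimal sketch, assuming every line has the form `Key: value` and that blank lines separate records. The function name `parse_inventory` and the sample values are illustrative only.

```python
def parse_inventory(text):
    """Parse blank-line-separated 'Key: value' blocks into a list of dicts."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the record being built, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:  # ignore lines that contain no colon
            current[key.strip()] = value.strip()
    if current:  # the last record may not be followed by a blank line
        records.append(current)
    return records


# Illustrative sample in the same layout as the inventory file.
sample = """Stock: 12345
Year: 2021
Make: Toyota
Model: Corolla

Stock: 67890
Year: 2020
Make: Honda
Model: Civic
"""
records = parse_inventory(sample)
```

All values come back as strings; any conversion (for example, the year to an integer) would be a separate step.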

General info for the setup:
Model: gpt-4o
Temperature: 0.9
Chunking strategy: auto

Context:
Whenever we are about to start a brand new conversation, I create a thread and an initial message on behalf of the customer to let the AI know what the conversation is about, for example: “I am **** and looking for a 2021 Toyota Corolla”. Then I run the assistant, which has to gather information about the car from the file and create the next message for the customer.

  1. You probably need to define that vehicle and the other vehicles as a dictionary, each with its own unique attributes. For instance, in Python:
vehicles = {
    "12345": {"Year": 2021, "Make": "Toyota", "Model": "Corolla"},
    "67890": {"Year": 2020, "Make": "Honda", "Model": "Civic"}
}
  2. Make sure it checks the entire txt file before it writes anything.
  3. Try restricting it from answering other questions: tell it to answer only questions for which it can find data in the text file.
  4. You could implement all of the above steps in Python. If you need some code, ask me in a follow-up reply.
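As a rough illustration of the dictionary idea above, here is a minimal sketch of looking a vehicle up by its attributes instead of relying on the model to find it in the file. The helper name `find_vehicles` is hypothetical, and the comparison is a simple case-insensitive string match.

```python
vehicles = {
    "12345": {"Year": 2021, "Make": "Toyota", "Model": "Corolla"},
    "67890": {"Year": 2020, "Make": "Honda", "Model": "Civic"},
}


def find_vehicles(inventory, **criteria):
    """Return the entries whose attributes match every given criterion."""
    matches = {}
    for stock, attrs in inventory.items():
        if all(
            str(attrs.get(key, "")).lower() == str(wanted).lower()
            for key, wanted in criteria.items()
        ):
            matches[stock] = attrs
    return matches


# Look up the car from the example customer message.
result = find_vehicles(vehicles, Year=2021, Make="toyota", Model="Corolla")
```

An empty result means the vehicle is not in the inventory, which gives the application a definite "not found" to pass along.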

As for the process:
Everything seems fine, but you mentioned the txt file being structured that way. I would suggest using a CSV format instead; ChatGPT is likely to read it better (I think). Since CSV files are comma-separated, each row holds one specific car and its attributes, which is much better than separating records with empty lines, which could confuse it.

So here are my suggestions for the process:

  1. Try using CSV instead of txt: I’ve already suggested Python, but you could also use CSV, with each row representing one car. This might be easier for the AI to read, since it has some pandas capability that makes reading CSVs much easier.
  2. Use Python if you want. If you need additional code, just ask.
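To make the CSV suggestion concrete, here is a minimal sketch that writes inventory records as CSV text with the standard `csv` module. The helper name `records_to_csv`, the column order, and the sample records are assumptions for illustration. Whether File Search accepts a `.csv` upload should be checked against the list of supported file types.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialize a list of dicts to CSV text, one row per vehicle."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative records; real ones would come from the inventory source.
records = [
    {"Stock": "12345", "Year": 2021, "Make": "Toyota", "Model": "Corolla"},
    {"Stock": "67890", "Year": 2020, "Make": "Honda", "Model": "Civic"},
]
csv_text = records_to_csv(records, ["Stock", "Year", "Make", "Model"])
```

The header row carries the field names once, so each following row is a complete, self-describing record for one car.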

Hopefully this helps.

Thanks for your reply. I tried using markdown tables in the txt file. During testing it responded fine, but we found that gpt-3.5* did not understand this format. Basically, we have several assistants with different models accessing the txt file, and we cannot move them all to gpt-4*.

Would JSON files work with the file-search tool? Are JSON files understandable to gpt-3.5*?
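For anyone who wants to test the JSON idea, here is a minimal sketch that writes one JSON object per line, so each vehicle stays together as a single unit of text. It does not settle whether gpt-3.5* handles the format well. The helper name `records_to_json_lines` and the sample records are assumptions. Python is used to match the earlier example in the thread, and the same layout is easy to produce from Node.js.

```python
import json


def records_to_json_lines(records):
    """Serialize each record as a single-line JSON object, one per line."""
    return "\n".join(json.dumps(record) for record in records)


# Illustrative records in the same shape as the inventory entries.
records = [
    {"Stock": "12345", "Year": 2021, "Make": "Toyota", "Model": "Corolla"},
    {"Stock": "67890", "Year": 2020, "Make": "Honda", "Model": "Civic"},
]
json_lines = records_to_json_lines(records)
```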

Alternatively, I could go back to markdown tables and use GPT-4o mini for the current gpt-3.5* assistants, since it is super cheap.

We are using Node.js.

Thanks for the reply too. You are correct, there is a chance that GPT-3.5 is not verifying the information against the actual data in the .txt file. I’m not sure whether JSON works well, because I haven’t tried it. You could also use GPT-4o mini, which is more reliable than 3.5, or try CSV as I suggested.
