Hi there,
We are trying to use the Assistants API to query data which is in a table form. We have tried multiple options such as .txt files containing comma separated results, .md files containing Markdown tables. Some questions asked of the assistant come back fine but a decent portion are completely incorrect and are pulling data from the wrong places.
Has anyone had any success using Assistants to query table data?
Thanks.
1 Like
Yes, I am using it through the API. Are you using it through UX? Have you tried 3.5 vs 4 as the engine for the Assistant? I am using CSV tabular data, still experimenting with it, but I agree so far it has been hit or miss whether it actually grabs the file or not.
Weâre using it both via the UI and also the API (depending on test cases). Weâre using 4 for the engine of the Assistant, have not tried 3.5 but presumed itâd perform worse.
1 Like
Can you confirm that your assistant (when retrieved through python or CLI) shows the file_id attached? something similar to this image. It should be the case if you have already added it in the UI.
Yeah I can confirm, our assistant does show the file ID when fetched via the CLI
1 Like
Yeah I am not sure, I also have issues with the assistant recognizing the file. I know thereâs the option to add the file to the message itself, but I donât think we should need to do that. I have tried to be explicit with the file name, but that shouldnât need to be the case as the front-end user canât be expected to do that. Iâm sure this will be improved soon.
1 Like
Hi Brandon,
I went through the same issues a few weeks ago while feeding a json file containing a list of customers. I made a number of tests and came to the conclusion that the model canât just be used by feeding it a mass of data in the hope Iâll be able to query it in natural language. It feels like the model has human like limitations (lazyness and untrustable memory ). However, the model is great at programming so what I ended up doing is asking it to create a SQL query I can then pass to my DB to retreive the exact information Iâm looking for.
So instead of loading my full customer list to the model and query it for the number of records I have (this generates random results), I asked the model to create a SQL query to retrieve that info after giving it the column and table names Iâm using. After asking the number of clients, I now get something like [RequiredActionFunctionToolCall(id=âcall_rUqU2RzOrCâ, function=Function(arguments=â{âquery_stringâ:âSELECT COUNT(*) FROM Clientsâ}â, name=âexecuteQueryâ), type=âfunctionâ)]. Thatâs just an example since I can now query basically any available information on specific records or the whole table.
Hope this helps
Pascal
2 Likes
Clever approach but I do not think this will work âin-memoryâ use cases?