How to Optimize data for Knowledge Retrieval with Assistant API

tildahh · November 25, 2023, 5:26am

Hello Everyone,

I’ve been building an application with the Assistants API knowledge retrieval tool. Currently I’m “forcing” the API to look at my data by specifying the file ID in the prompt, which works fine.

I have multiple small files, each 1.5 MB or less, currently stored in JSON format. The data is a mix of text and numbers. Does anyone know whether JSON is the optimal format, or should I store it as CSV or another file type instead?

If I stored it as CSV, each row would have 8 columns of related data.

Thank you!

kelvin.cai · November 25, 2023, 5:39am

Could you give more details on how you “force” it?

tildahh · November 25, 2023, 5:43am

I specify which file ID I want it to look at to retrieve the answer, for example:

Use name_of_file, file id: file-xjhdgfdshgfdjshgf, to tell me what …

I have also listed the files it should use in the assistant instructions (by name, not ID), but it repeatedly told me it couldn’t access the files. Specifying the file ID in the prompt each time solved this.

luona.dev · December 1, 2023, 4:47pm

It really depends on how you are planning to use this data and what “questions” you want answered. If it’s structured data and you ask structured questions, you should use Code Interpreter for the retrieval. In that case both .json and .csv should work, but I would recommend .csv as it has less boilerplate.
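To illustrate the “less boilerplate” point, here is a minimal sketch of converting a JSON array of flat records into CSV using only the Python standard library. The function name `json_records_to_csv` and the sample fields (`name`, `price`, `stock`) are made up for illustration; the real files in this thread have 8 columns whose names are not given.

```python
import csv
import io
import json


def json_records_to_csv(json_text: str) -> str:
    """Convert a JSON array of flat objects into CSV text with one header row."""
    records = json.loads(json_text)
    if not records:
        return ""
    # Column order follows the first record; keys seen only later are appended.
    fieldnames = list(records[0].keys())
    for record in records[1:]:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    # restval="" leaves a cell empty when a record lacks a column.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Hypothetical sample data, not taken from the original poster's files.
sample_json = '[{"name": "Widget", "price": 9.5, "stock": 12}, {"name": "Gadget", "price": 20, "stock": 3}]'
csv_text = json_records_to_csv(sample_json)
print(csv_text)
```

In the CSV form the key names appear once in the header instead of once per record, which is where the size and noise savings come from.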

The knowledge retrieval tool is great if your questions require some level of “abstraction”, e.g. the answer is within a text, but not the text itself. In that case a .csv might work, but in my experience a .txt file with markdown formatting works best. Formatting matters, and that’s why I wouldn’t recommend .json: too many distractions.
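As a rough sketch of what “a .txt file with markdown formatting” could look like for record-style data, the snippet below turns each record into a heading followed by a short bullet list. The helper `records_to_markdown`, the choice of one heading per record, and the sample fields are assumptions for illustration, not the layout described in the linked article.

```python
def records_to_markdown(title, records, heading_key):
    """Render flat records as markdown text: one heading per record, one bullet per field."""
    lines = [f"# {title}", ""]
    for record in records:
        lines.append(f"## {record[heading_key]}")
        for key, value in record.items():
            if key == heading_key:
                continue  # already used as the section heading
            lines.append(f"- {key}: {value}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


# Hypothetical sample data, not taken from the original poster's files.
sample_records = [
    {"name": "Widget", "price": 9.5, "stock": 12},
    {"name": "Gadget", "price": 20, "stock": 3},
]
markdown_text = records_to_markdown("Product catalog", sample_records, "name")
print(markdown_text)
```

The resulting text can be saved with a .txt extension and uploaded in place of the JSON file; whether it retrieves better for a given dataset is something to test rather than assume.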

I wrote a detailed article on the knowledge retrieval tool and how to utilize it if you want to find out more. A short summary can also be found in the forum over here.
