What is the best way to send hundreds of rows of data to the OpenAI API?

I have a .csv file with hundreds of rows of data. I know that the usual approach is to split it into chunks and send those to ChatGPT, but I need the model to have the full context so I can ask about information that matches across the data.

How can I do it? Do I need to make an API call for each chunk?


Hi @BrokenSoul

If I understand correctly, you want the AI to use your data to augment the model’s response to questions.

In that case I’d recommend the Assistants API, as it handles the whole RAG pipeline for you and requires minimal setup in case you aren’t familiar with using embeddings on your own for RAG.

I thought of that, but I need to retrieve information from a page and write it into a .csv file, and that data can change very often. If I upload a file and replace it from my app each time, synchronization problems will occur.

If I compute embeddings at run time, it will be costly and slow.

Interesting. As of now, Assistants doesn’t have web browsing capabilities, but that’s supposed to be supported later on.

In the meantime, you can implement a code-based solution that handles every aspect of question answering along with retrieval, and the data can be fetched from the CSV simply using code that’s generated by the model.
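As a rough sketch of the "fetch the data with code" idea: code selects the relevant rows first, and only those rows would be placed in the prompt. The column names, the sample data, and the `select_rows` / `rows_to_prompt_context` helpers are all hypothetical, made up for illustration; the actual API call is left out.

```python
import csv
import io

# Hypothetical sample data; in practice this would be the contents of your .csv file.
SAMPLE_CSV = """name,category,price
Widget,tools,9.99
Gadget,toys,4.50
Sprocket,tools,2.25
"""


def select_rows(csv_text, column, value):
    """Return the rows (as dicts) whose `column` equals `value`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] == value]


def rows_to_prompt_context(rows):
    """Format the selected rows as compact text that could be included in a prompt."""
    return "\n".join(
        ", ".join(f"{key}={val}" for key, val in row.items()) for row in rows
    )


matching = select_rows(SAMPLE_CSV, "category", "tools")
context = rows_to_prompt_context(matching)
print(context)
```

Because the filtering happens in code, the prompt only grows with the number of matching rows, not with the size of the whole file.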

If you own the data source or if there’s an event handler dealing with data changes, you can directly update the csv file in real-time.

The application is still quite unspecified in scope.

Here are a handful of varied example questions and possible solutions.

“What is the top category that has the most unhappy customers?”

  • Each data element should be independently sentiment-scored by AI, so the scores can then be analyzed by a function

“What is the predominant theme of the day?”

  • Chunks can go through entity extraction, producing per-category totals that the AI can then use to handle all the summaries.

“Which add records also have both change records and delete records?”

  • Pretty much the AI needs to see all of the data, or you need to put subcategories of records into groups

“What paragraphs of the book are most similar to ‘Tom Sawyer Huck Finn whitewash the fence’?”

  • embeddings
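For the embeddings case, the comparison step could look roughly like this. It is only a sketch: the tiny 3-dimensional vectors below are toy stand-ins for real embeddings (which would come from an embedding model and have many more dimensions), and `rank_by_similarity` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, paragraph_vecs):
    """Return paragraph indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, vec) for vec in paragraph_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for the embeddings of three paragraphs and one query.
paragraph_vecs = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 0.0, 1.0],
]
query_vec = [0.5, 0.9, 0.0]

ranking = rank_by_similarity(query_vec, paragraph_vecs)
print(ranking)  # [1, 0, 2]
```

The paragraph vectors only need to be computed once and stored; at question time only the query has to be embedded.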

Most of all, there are likely many cases where the better answer will come from code and not from language processing.
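The add/change/delete question above is one such case. A minimal sketch, assuming a hypothetical CSV layout with `record_id` and `action` columns (these names and the sample rows are invented for illustration), answers it with plain set logic and no model call:

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample log; real data would be read from your .csv file.
SAMPLE_LOG = """record_id,action
1,add
1,change
1,delete
2,add
2,change
3,add
3,delete
3,change
4,change
4,delete
"""


def ids_with_all_actions(csv_text, required=("add", "change", "delete")):
    """Return the sorted record ids that have every action in `required`."""
    actions_by_id = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        actions_by_id[row["record_id"]].add(row["action"])
    needed = set(required)
    # A record qualifies when the needed actions are a subset of its actions.
    return sorted(rid for rid, actions in actions_by_id.items() if needed <= actions)


result = ids_with_all_actions(SAMPLE_LOG)
print(result)  # ['1', '3']
```

This scans the file once regardless of its size, so there is no chunking or context limit to worry about.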