Assistant API: Analyzing with code interpreter a dataframe with long-text features

Chris60 · March 22, 2024, 2:24pm

Hello! I’m currently analyzing a dataframe that I’ve uploaded to the Assistant, engaging in a conversational analysis. However, I encounter a problem when I request summaries of features that contain long text.

For instance, the dataset pertains to books, with each entry representing a book and including details such as the title, ISBN code, publication date, theme, ranking, summary, and more. After engaging in a brief conversation, the Assistant has successfully filtered the dataset to include only books themed around vampires, with rankings over 4 stars, and published within the last two years, narrowing it down to two books.
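For anyone trying to reproduce this, the filtering step described above might look roughly like the following in pandas. This is only a sketch: the column names (`title`, `theme`, `ranking`, `publication_date`, `summary`), the sample rows, and the helper `filter_books` are all hypothetical, since the actual dataset schema is not shown in the post.

```python
import pandas as pd

# Made-up sample rows standing in for the uploaded books dataframe.
books = pd.DataFrame({
    "title": ["Night Thirst", "Garden Tales", "Old Fangs",
              "Crimson Hour", "Blood Moon Rising"],
    "theme": ["vampires", "gardening", "vampires", "vampires", "vampires"],
    "ranking": [4.6, 4.8, 4.7, 3.9, 4.3],
    "publication_date": pd.to_datetime([
        "2023-05-01", "2023-08-15", "2015-03-10", "2023-11-20", "2022-09-30",
    ]),
    "summary": ["placeholder summary"] * 5,
})


def filter_books(df, theme, min_ranking, as_of, years=2):
    """Keep books with the given theme, ranked above min_ranking,
    and published within `years` years before `as_of`."""
    cutoff = pd.Timestamp(as_of) - pd.DateOffset(years=years)
    mask = (
        (df["theme"] == theme)
        & (df["ranking"] > min_ranking)
        & (df["publication_date"] >= cutoff)
    )
    return df.loc[mask]


selected = filter_books(books, "vampires", 4, as_of="2024-03-22")
print(selected["title"].tolist())
```

With these sample rows, two books survive the filter, mirroring the situation in the post.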

When the user requests summaries of these two books, the Assistant fails to provide the summaries from the dataset for both books. I am uncertain if this issue arises because the summaries are too lengthy, or if there is a processing error within the Code Interpreter instance. The summaries exist within the dataset, and I would like the Assistant to present these two summaries to the user. I have encountered several issues when attempting this:

The Assistant provides a one-line summary for each book that lacks detail (e.g., “The book is about a kid”), even when the summary from the dataset contains between 500-1000 characters.
The Assistant generates inaccurate information even though the summary is present in the dataset, a problem known as “hallucination.”

I would appreciate any suggestions on how to address this issue. Ideally, the Assistant should provide the exact summary from the dataset or perhaps a concise version of the summary feature.
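One possible cause worth ruling out (an assumption, not something confirmed in this thread): when a dataframe is printed, pandas abbreviates long strings, cutting each cell to 50 characters by default (`display.max_colwidth`). If the code run by Code Interpreter prints the filtered dataframe, the model may only ever see truncated summaries. The sketch below shows the truncation and one way to return the full text instead; the column names and the helper `full_summaries` are hypothetical.

```python
import pandas as pd

# Made-up long summary (several hundred characters), as in the post.
long_summary = ("A young vampire leaves her coven to find the truth. " * 15).strip()
df = pd.DataFrame({"title": ["Night Thirst"], "summary": [long_summary]})

# The default printed form shortens long cells and ends them with "...".
preview = repr(df)
print(preview)


def full_summaries(frame, title_col="title", text_col="summary"):
    """Return one 'title + full summary' string per row, read directly
    from the cell values so no display truncation is applied."""
    return [f"{t}\n{s}" for t, s in zip(frame[title_col], frame[text_col])]


for entry in full_summaries(df):
    print(entry)
```

Instructing the Assistant to read the cell values directly (for example via `.tolist()` or a loop like the one above) rather than printing the dataframe may help it quote the summaries verbatim.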

Diet · March 22, 2024, 9:43pm

It’s probably not the issue, but this thread comes to mind:

What model are you using?

Chris60 · March 22, 2024, 10:29pm

gpt-3.5-turbo-1106 (I’m not currently using gpt-4 because of the cost)
