Seeking Assistance with GPT-4 Turbo for Data Cleaning and DataFrame Conversion

Hi, I am trying to use GPT-4 Turbo to clean a dataset of about 1 million rows and 5 columns. To stay on the safe side of the token limits, I plan to process the data in batches of 300 rows at a time.

My goal is to clean the data, fixing issues such as spelling mistakes and invalid characters. I am sending each batch of 300 rows as a list of dictionaries, since I figured that would make it easier to build a DataFrame from the results later on.
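For reference, my batching step looks roughly like this (a minimal sketch; the `batch_records` helper name and the toy data are just for illustration):

```python
import pandas as pd

def batch_records(df: pd.DataFrame, batch_size: int = 300):
    """Yield the DataFrame in chunks of `batch_size` rows,
    each serialized as a list of row dictionaries."""
    for start in range(0, len(df), batch_size):
        chunk = df.iloc[start:start + batch_size]
        yield chunk.to_dict(orient="records")

# Toy example: 1,000 rows split into batches of 300, 300, 300, 100
df = pd.DataFrame({"name": [f"row{i}" for i in range(1000)],
                   "value": range(1000)})
batches = list(batch_records(df, 300))
```

Each yielded batch is a plain list of dicts, which I then embed in the prompt I send to the model.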

However, I’m facing a challenge: each batch comes back in a different format. Some replies are JSON, some are comma-separated text, and so on. I have tried adjusting the prompt, but with no luck. Because the formats differ from batch to batch, I am struggling to append them all together.

My question is: how can I get each 300-row batch back in a consistent format that converts easily to a DataFrame, so that afterwards I can append all the batches into one DataFrame?
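What I am hoping for is something like the sketch below, where every reply parses the same way. I believe the Chat Completions API supports a JSON mode (`response_format={"type": "json_object"}`) with GPT-4 Turbo that should force valid JSON output, though I haven't verified it solves my case; the `batch_to_frame` helper and the `"rows"` key are just my assumed reply schema, and the sample replies stand in for real model output:

```python
import json
import pandas as pd

def batch_to_frame(raw_json: str) -> pd.DataFrame:
    """Parse one batch reply, expected as {"rows": [...]}, into a DataFrame."""
    payload = json.loads(raw_json)
    return pd.DataFrame(payload["rows"])

# Simulated replies from two batches (in practice, model output)
replies = [
    '{"rows": [{"name": "Alice", "city": "Paris"},'
    ' {"name": "Bob", "city": "Lyon"}]}',
    '{"rows": [{"name": "Carol", "city": "Nice"}]}',
]

# Convert each batch, then append them all into one DataFrame
frames = [batch_to_frame(r) for r in replies]
combined = pd.concat(frames, ignore_index=True)
```

If every batch reliably came back in that one shape, the final `pd.concat` would be trivial. Is this the right approach, or is there a better way to pin down the output format?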