I’ve been trying to use ADV-Data-Analytics to feed GPT-4 three datasets as simple CSV files, each with roughly 14k rows and fewer than 10 columns. The goal was for it to manipulate and join the data from the three datasets to produce a predictive outcome. The first time it ran, the outcome was fine, but I wanted the outcome dataset expanded, so I asked GPT-4 to do this. That’s when it started hitting memory errors every time it tried to process the same files as before, saying it could not perform the operation because of memory limitations. It tried several times to use a simpler approach to combining the data to reduce the memory footprint, but without success. Is there any way to maximise the memory it uses, or a way to tell it to do things differently that might enable it to do the job it was asked?
Hi and welcome to the forum!
In general, ADA should not have issues working with simple files of this size. That leads me to think you are either hitting a temporary issue with the environment, or the script the model produced is extremely inefficient. In the second case, there is no guarantee the model understands what it is doing wrong; for example, it might be loading all the files in a loop until memory is full.
You could start a new conversation, repeat the process, and provide the code from the failed attempt so the model doesn’t repeat the same mistake.
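When you re-prompt, you could also nudge the model toward a streaming join instead of loading and merging everything at once. A minimal sketch of that idea (the column names and sample data here are hypothetical, just standing in for your real CSVs): build an in-memory index of the smaller file, then stream the larger one row by row.

```python
import csv
import io

# Hypothetical sample data standing in for the real CSV files.
users_csv = "id,name\n1,Ann\n2,Bob\n"
scores_csv = "id,score\n1,90\n2,85\n1,70\n"

# Index only the smaller dataset in memory, keyed on the join column.
index = {row["id"]: row for row in csv.DictReader(io.StringIO(users_csv))}

# Stream the larger dataset one row at a time instead of loading it whole,
# joining each row against the index as it arrives.
joined = []
for row in csv.DictReader(io.StringIO(scores_csv)):
    match = index.get(row["id"])
    if match:
        joined.append({**match, **row})

print(joined[0])  # {'id': '1', 'name': 'Ann', 'score': '90'}
```

Pasting something like this (adapted to your actual columns) into the prompt gives the model a concrete pattern to follow rather than hoping it rediscovers one.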
Since this is the developer forum, it’s not exactly the best place to debug your specific script, but I hope this helps you resolve the issue.
Aren’t the data files subject to the same token limits as the conversation? I’ve found that when I give it long script files (thousands of lines of code), it produces memory errors as well. The only workaround I’ve found is to break things into chunks of ~2048 tokens. You can go up to 4096 in ChatGPT (more, I think, with the GPT-4 API), but the closer you get to that limit, the less it remembers of your prompt.
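For what it’s worth, the chunking step above can be automated. This is a rough sketch that uses the common ~4 characters per token heuristic rather than a real tokenizer (something like tiktoken would be more accurate), splitting on line boundaries so no line is cut in half:

```python
def chunk_text(text, max_tokens=2048, chars_per_token=4):
    """Split text into chunks of roughly max_tokens, breaking only
    between lines. Token count is estimated as len(chunk) / 4."""
    max_chars = max_tokens * chars_per_token
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        # Flush the current chunk before this line would push it past the cap.
        if size + len(line) > max_chars and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks

# Stand-in for a long script file.
script = "print('hello')\n" * 3000
chunks = chunk_text(script)
```

You can then paste the chunks into the conversation one at a time.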