Code Interpreter started hallucinating when I uploaded a large file. Subsequent smaller files had the same problem. I’m not sure if it’s hallucinating or reading the wrong file. Trying to make sure there’s not a security issue here.
I uploaded the JSON from here: github /json-iterator/test-data/blob/master/large-file.json
The response described a completely different JSON file, one I did not upload.
I have also noticed this with long JSON files. When I asked Code Interpreter to analyse a particularly long JSON file, it first told me that it contained a list of people, ages and professions (which it did not). When I asked again about the same file, it returned weather information (also not what was in the JSON). Finally it gave me some nonsense about scores for some game. All of these were made-up results that had nothing to do with my actual data.
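One way to tell "hallucinating" apart from "reading the wrong file" might be to fingerprint the file locally, then ask Code Interpreter to run the same code in its sandbox and compare the printed values. This is only a rough sketch: `fingerprint_json` is a made-up helper name, and it assumes the file fits in memory. If the hash and size match but the description is still wrong, the model is most likely not reading the execution output at all.

```python
import hashlib
import json
from pathlib import Path


def fingerprint_json(path):
    """Return byte size, SHA-256, and top-level shape of a JSON file.

    Hypothetical helper for illustration; not part of any Code Interpreter API.
    """
    raw = Path(path).read_bytes()
    data = json.loads(raw)  # json.loads accepts UTF-8 encoded bytes

    # Summarise only the top level so the output stays short and easy to compare.
    if isinstance(data, dict):
        shape = {"type": "object", "keys": sorted(data)[:10]}
    elif isinstance(data, list):
        shape = {"type": "array", "length": len(data)}
    else:
        shape = {"type": type(data).__name__}

    return {
        "bytes": len(raw),
        "sha256": hashlib.sha256(raw).hexdigest(),
        "shape": shape,
    }
```

Running `print(fingerprint_json("large-file.json"))` on your own machine and inside the session should give identical output if the sandbox really holds the file you uploaded.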
It happened to me the other day with a .csv dataset.
It was a small, fresh Twitter dataset covering 28 days. The analysis itself (the Python code) wasn’t incorrect, but the explanation was distorted.
For example, it didn’t correctly relay a string from the dataset (the “most popular tweet”): it kept around 60% of the original text but changed the rest, and even “mentioned” an account that I had never interacted with.
It took several conversations before it corrected itself and relayed the right string.
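For this kind of misquoted string, a quick check is to test whether the text the model relayed actually appears verbatim in the CSV. A minimal sketch, assuming a UTF-8 file with a header row; `quote_in_column` and the column name `"text"` in the usage line are placeholders, not names from the real dataset.

```python
import csv


def quote_in_column(csv_path, column, quoted_text):
    """Return True if quoted_text exactly matches a value in the given column.

    Hypothetical helper; leading/trailing whitespace is ignored on both sides.
    """
    target = quoted_text.strip()
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # row.get returns None when the column is missing from this row
            if (row.get(column) or "").strip() == target:
                return True
    return False
```

Something like `quote_in_column("tweets.csv", "text", relayed_tweet)` returning `False` would confirm that the explanation altered the string even though the analysis code ran on the right data.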