Just after January 1, I noticed that ChatGPT 4o suddenly became sloppy. Not only that, it routinely stops in the middle of a job and reports that it has completed the task. I’m not going to give specific prompts (more on that below) but here is an example:
- Upload a Word document or PDF with a Table of Contents and 90 sections. Instruct ChatGPT to list the 90 sections, the titles of the sections, and the first paragraph. Put the output into a CSV file.
ChatGPT will do the job very well for 3, 11, 23, and even up to 40 of the sections. But, it never will complete the job for all 90 although the will claim it does. Complain. Ask it to check its work. It’ll start completely over and provide what seems like a random sample of the titles.
- Ask ChatGPT to evaluate its CSV file when complete and verify that it has addressed all 90. It will give a play by play look into what it’s doing. It’ll do a few… Try again, starting completely over from the very beginning rather than picking up where it left off, and iterate over that maybe 8 to 10 minutes!
Finally, it will produce a CSV file with 50 or less real answers. But, for the other 40, it will write in “PLACEHOLDER”. How sneaky.
I see the Python code being written to perform this task. It looks correct. The Python dictionary with the table of contents values just is never fully populated. Sometimes there will be just a few values. Other times there will be many more. Never the 90, though.
Here’s the kicker… I converted the Word/PDF document to a text file and used it in the OpenAI API with the gpt-4-turbo-2024-04-09 model. Provided the same prompt instructions. The API created a perfect, correct, CSV file for all 90 items immediately.
I have also been having problems with the ChatGPT 4o alphabetizing a simple list of 15 to 20 words. It does sloppy things like just looking at the first letter of the words. So, it will do things like put the word “been” before “bean” in an alphabetized list.
Again, put the same list and prompt into the API and there are no issues.
Something has broken ChatGPT over the past 2 weeks. And I find it strange that I’m the first post about this.
Any thoughts?