Feedback on May 12, updated - coding specific

Using the GPT-4 model via the web interface to validate code snippets and for general bug hunting and fixes.

It appears that since the last update the quality of the interactions has declined.

  1. Persistence of prompt instructions has declined. Example: "Please do not reference library xyz; please use library abc instead." After one prompt interaction, subsequent prompts forget these instructions, even when a class or portion of the library has been given explicitly as part of the prompt.

  2. Irrelevant additions to code have increased. Example: "I'd like to accomplish xyz given the following code." The model returns additions to the code that have only vague relevance to what was asked. It takes the prompt and attempts to return what was asked for, but also appends additional irrelevant functionality.

I’ve ended up blowing through the 25-prompt limit by attempting to steer the model in the right direction, as the responses have been irrelevant or really off the mark. Previously I could manage to get things accomplished within the limit, although the 25-prompt cap is rather limiting to say the least.

Now, I have to consider that my prompting may have declined in quality over the last 3-4 days, but I have my doubts, since I’ve been using the same format I used with the previous version.

I’m curious if anyone has had a similar experience with this update.

Yes, I can report similar experiences. It strongly feels like the versions since the plugins introduction have been steps in a direction that results in worse reasoning and coding performance. Before, the coding results were good enough to save me time, even for quite complex tasks. Now the quality of the results is such that the return in time saved for complex tasks is negative, and for simple tasks it is not really worth the effort. I am seriously considering cancelling my Plus subscription and looking for alternatives with performance similar to the previous versions.

Many more small mistakes in coding: introducing weird new variables, invalid operations (e.g. multiplying non-matching matrices). Where before GPT-4 felt like chatting with an extremely good PhD student, now it feels like chatting with an 18-year-old.
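To be concrete about the matrix case, here is a made-up NumPy sketch of the kind of shape mismatch I mean (the arrays and shapes are hypothetical, just for illustration):

```python
import numpy as np

# A is 3x4 and B is 3x4: A @ B is invalid because the
# inner dimensions (4 and 3) do not match.
A = np.ones((3, 4))
B = np.ones((3, 4))

try:
    A @ B  # the kind of invalid operation that now slips into generated code
except ValueError as e:
    print("shape mismatch:", e)

# A valid product needs matching inner dimensions,
# e.g. by transposing one operand: (3x4) @ (4x3) -> 3x3.
C = A @ B.T
print(C.shape)
```

Earlier versions would reliably catch dimension mismatches like this; now they appear in otherwise plausible-looking answers.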