For PHP development you still get some good results but it is getting worse because of missing new data. I am already thinking about embedding documentations of newer version e.g. Symfony and building a plugin so ChatGPT can ask it before it answers…
On the other hand when people complain about “hallucinations” I think that’s something that at least in programming is needed. Sometimes there is no library for something and you can barely find informations about it on the web either, but still ChatGPt hallucinates something for you that in most cases doesn’t need much work afterwards.
I even like that it hallucinates comments to your code haha…
Therefore, I unsubscribed. I subscribed to Plus very early on, because ChatGPT was very good at writing code. However, now it is clear, whether it is 3.5 or 4, they basically cannot complete the task of writing code.
Agreed. My entire Roguelike project was written with GPT-4 help… You have to know how/what to ask and give it enough info… and know enough not to be lead down bad paths… It increases my productivity as much if not more as using it for content writing (fiction)…
It got me to thinking about the subject of this thread.
One thing we can all do is use the thumbs up/down icons to let OpenAI know when the models responses are good, and when they are terrible. I’ll admit I’ve not done this enough, but I’m going to start being a lot more diligent at it. Hopefully, over time, if enough of us start doing this we will see some improvements.
I subscribed to chatGPT back in March, because GPT4 was so good at writing code, now I have exactly the same complains as the rest of you, it doesn’t feel like GPT4 from March, it feels like GPT3.6
Even worse, the model can’t do simple things now, if I ask it to provide me a list of 10 things, it provides me a list of 3 things, and starts rambling about character limit! I don’t mind clicking continue button, I do mind a bot that refuses to do what I ask it to do.
Yesterday I created a simple prompt to extract data from a PDF and generate the result in json format, gpt-3.5-turbo returned the correct data. Today I tried the gpt-3.5-turbo-16k and it returned saying that was unable to extract the data. Now a few hours latter the gpt-3.5-turbo is giving the same result as of 16k, it is not able to extract the data.
Everything is the same, I just changed the model name.
I honestly think they just messed with the model parameters on chatgpt. Something like including a note to be concise in the system prompt and reducing the de facto token limit in the chat and max completion rate. It would seem to fall in line with what most people are complaining about.
But like, I really would like to see a side by side comparison at least one where the results seem worse. I’ve been doing my best to compare old prompts to the current responses on chatgpt and I just have not noticed anything.
Check out that thread I did a little testing of my own and I’m honestly either not sure if I fully understand your code but I did my best. Overall, I got 1 crappy response out of three tries. Though I am not able to tell if the end result was satisfactory in the end. I need to get back to work, however I’m interested to see how youd grade the responses I was able to get
I was on an external production the whole of last week and didn’t use it for some 9 days until last night. I was appalled by the drop in quality, it felt like I was using GPT 2.5 or earlier. I’m a plus user and rely on it daily for work. The experience has been smooth and consistent since March, but I struggled with it for almost 24 hours yesterday. Eventually, I dropped it and did everything the “old” way (myself). There are many questions here, but mostly I feel ripped off by OpenAI as a paid subscriber.
It is clearly now waste of time using GPT-4, my son, using for his 12th grade class help, for the past 2 weeks, screams (addicted before how wonderful gpt-4 was…) F… word…
GPT-4 Team wake up.
Today he tested Bard, it just solved the math issue in one go…
100%, I have felt it. Like they are reducing compute. I used to max out GPT4 and then have a break and continue when my 25 resets… it was worth it… the output for me was incredible. I find now I am really questioning whether I use what the model responds with. ( :sad, we used to be partners!) Now it’s like a bad intern who confidently tells me the wrong thing and I have to catch it all the time. Here’s hoping that they increase compute power or go back to GPT4.
I’ve seen a lot on the topic, I’ve also participated in the topic for the past month or so. GPT-4 was truly amazing at writing whole complex snippets of code logic before, using various prompting techniques + describing the task from a high-level overview was enough for me to get an almost perfect match of what I’ve needed.
Once they dropped the iOS App + the Plugins (which happened in a short timespan last month), it started forgetting context from messages which were literally placed 1-2 messages before the current one, the context length itself I believe was significantly reduced, because previously I was able to paste long (300-400 line) snippets into it, and it’d do very well remembering that with almost no reminders from myself for a fairly lengthy conversation, currently it doesn’t do that, you paste a snippet which has the name of some function, three messages below if you ask it about that function, the chances that it’ll start hallucinating something completely different have gone through the roof.
Don’t get me wrong, I am still able to get it to do what I want it to do, the issue here is that it’s become singificantly harder to do it. I’m actually a hard proponent of the idea to add “tiered subscriptions”, where the bigger tier subscription you have, the more computational power is allocated to your personal experience, in this way I know that at the very least, what I am paying for is worth the time.
For the moment, using the Playground for me works - being able to actually tweak the configs actually makes a singificant difference
I agree with the subject in this post and add my concerns about.
I have tried the GPT 0613 model capabilities with a series of tests and we are concerned about the loss of logical capabilities of this model from the previous version 0314.
One example: asking if a value of 4500 was inside the range between 4000 and 6000, the old model was capable of understanding this, but the new one isn’t. This creates a big trouble in understanding data inside HTML tables.
I strongly recommend OpenAi to extend the deprecation date of the 0314 model from the current 09/13/23 to a much longer time and to clearly declare if the current model 0613 has to be considered as the new state-of-the-art for the future (or if you are planning to fix the model).