Does anyone else think this? The new models are much less creative, and the variety of outputs that I can produce is so much smaller. Yes, they follow instructions, but the outputs are short and repetitive. What is the point of using GPT-3 to tell me something I could find on Wikipedia? What is the point of using it for summarization, something that I could easily do with a smaller model? OpenAI is moving in the wrong direction with this. I understand you are trying to respond to the way users are leveraging the playground, but you are just giving them a “faster horse” at the price of limiting the true capabilities of the model.
I find that it depends on the prompt. If one gives it 2000 magical tokens, twice the magic back ye shall receive
Tokens aren’t words: a token is roughly four characters (about three quarters of a word), so divide your character count by four. I understand what you mean about it being boring, though.
It’s definitely prompt-specific, but if I’m going to write long-form prompts, I’ll still say the original davinci models give me better, more creative results. I can’t give them one-sentence zero-shot commands, but I can also coax a far wider variety of responses out of them.
Thanks @amandamariemoore714 for the advice…good to remember that it’s generating token-wise, not word-wise.
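If you want to check rather than guess, here’s a quick sketch using the tiktoken package (the prompt is just an illustration, not anything special) that shows the ~4 characters per token rule of thumb in action:

```python
# Rough token-count check; a sketch assuming the tiktoken package is installed.
# p50k_base is the encoding used by the text-davinci models; the "divide
# characters by 4" rule is only an approximation.
import tiktoken

encoding = tiktoken.get_encoding("p50k_base")

prompt = "Write a short, vivid poem about a lighthouse keeper at the end of the world."
tokens = encoding.encode(prompt)

print(f"characters : {len(prompt)}")
print(f"tokens     : {len(tokens)}")
print(f"chars/token: {len(prompt) / len(tokens):.2f}")  # typically lands near 4
```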
I still use davinci-instruct-beta for a lot of things. I think it’s the most creative of the models, better than base davinci, though it doesn’t give you as much fine control as base davinci does.
text-davinci has its niche, but I think it works best for “lazier” prompts. The older versions need a lot more work to produce good output; the new ones are basically cheaper and more consistent. I have a poetry generator that runs on text-davinci-002 and it’s surprisingly accurate.
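For what it’s worth, the generator is nothing fancy; it’s roughly something like this (legacy Completions endpoint, and the prompt and parameters here are placeholders, not my exact setup):

```python
# Minimal sketch of a poetry generator on text-davinci-002, assuming the
# legacy openai Python SDK (pre-1.0) and OPENAI_API_KEY set in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

def generate_poem(topic: str) -> str:
    """Ask text-davinci-002 for a short poem on the given topic."""
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"Write a short, vivid poem about {topic}.\n\nPoem:",
        max_tokens=150,
        temperature=0.9,  # higher temperature gives more varied wording
    )
    return response["choices"][0]["text"].strip()

print(generate_poem("a city at night"))
```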
I do think it’s the wrong direction, but it’s fine as long as they keep the older versions too.
I haven’t tried it much, but I felt it too: much less impressive than the original GPT-3.
Agreed, for creative writing, one of the best features of the original davinci model is the ability to press the regenerate button and get a completely different completion.
The new models output the same thing almost every time, which makes them useful only in marginal situations for creative writing.
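You can see the difference just by sampling the same prompt a few times with identical settings. A rough sketch like this (legacy openai SDK, illustrative prompt only) makes the drop in variety obvious:

```python
# Sketch of the "regenerate" test: sample the same prompt several times from
# both models with identical settings and compare how much the completions
# differ. Assumes the legacy openai Python SDK (pre-1.0) and an API key in
# the environment; prompt and parameters are illustrative only.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

PROMPT = "The old lighthouse keeper opened the door and saw"

def sample(model: str, n: int = 3) -> list[str]:
    """Return n independent completions for the same prompt."""
    response = openai.Completion.create(
        model=model,
        prompt=PROMPT,
        max_tokens=60,
        temperature=1.0,  # same sampling settings for both models
        n=n,              # n completions in one call, like pressing regenerate n times
    )
    return [choice["text"].strip() for choice in response["choices"]]

for model in ("davinci", "text-davinci-002"):
    print(f"--- {model} ---")
    for text in sample(model):
        print(text)
        print()
```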
My experience is that it became more neutral, more objective, and less controversial. For example, when I ask it for a phone recommendation, it tells me “there’s no one answer to which the phone is the best” instead of actually recommending something. I can change the prompt and get what I want, but I’d prefer it if the model were more subjective; the neutral, objective answers are often useless.