Getting Frustrated - starting to feel OpenAI just isn't usable

hazel1 · June 23, 2023, 9:40pm

LangChain was on the to-do list, but then I got distracted by functions and the promise that it would obey the system prompt more strongly.

My project is in PHP, though. My Python is really beginner level (or at least beginner with a lot of experience in other languages), so the examples are tough reading.

vb · June 23, 2023, 9:51pm

I think Bruce’s idea has some merit, but I interpret it differently. How about a “classic refactoring” of the module that requests suspect, means and motive for the “arrest warrant” function? Split it up into small pieces and only pass in what’s needed for the next step. As soon as you get the situation under control, you can factor these requests into a single call again. Until then you may have to do a little writing to hide from the player the fact that this part takes longer than expected.

bruce.dambrosio · June 23, 2023, 9:54pm

Nice idea. It will take extra turns, though, and given GPT response times that may or may not be a problem.

hazel1 · June 23, 2023, 10:17pm

Extra turns are fine; that’s how it’s supposed to work. It’s how their own examples work. You give it a function and tell it the parameters. In their example it’s weather, so it needs a location and a time frame; if the user doesn’t provide them, it’s supposed to ask follow-up questions to get the info.
I’ve used the same phrasing as they have in their examples to ask it to ask the user rather than making things up.
Yesterday, it asked the follow-up questions and then triggered the function. Today, it makes up the answers and triggers the function right away.
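
One way to catch the "made it up" case on the application side is a guard that runs before the function is actually triggered. This is only a minimal sketch: `GET_WEATHER`, its field names and `ungrounded_arguments` are made up for illustration, and the check (a plain substring match against what the user typed) is a crude heuristic that will misfire whenever the model normalizes a value, e.g. "SF" to "San Francisco".

```python
import json

# Hypothetical function definition in the JSON-schema style used by the
# function-calling examples; the name and fields are illustrative only.
GET_WEATHER = {
    "name": "get_weather",
    "description": "Get the weather forecast. Ask the user for any missing "
                   "value instead of guessing.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {"type": "string"},
            "time_frame": {"type": "string"},
        },
        "required": ["location", "time_frame"],
    },
}


def ungrounded_arguments(function_def, arguments_json, user_messages):
    """Return required parameters whose value is missing or never appeared
    in anything the user typed (a rough sign the model invented it)."""
    args = json.loads(arguments_json)
    said = " ".join(user_messages).lower()
    missing = []
    for name in function_def["parameters"]["required"]:
        value = str(args.get(name, "")).strip().lower()
        # Empty values and values the user never mentioned are both suspect.
        if not value or value not in said:
            missing.append(name)
    return missing
```

If the returned list is not empty, the app can skip the function and ask the user for those fields itself instead of trusting the model's arguments.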

vb · June 23, 2023, 10:55pm

Just as an idea: can you create a message that is displayed to the user as if it was from the model and pass the response to the function? Put differently: a simple old school form that will always ask the questions you need before handing over to our beloved RNG machine?
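
A rough sketch of that "old school form" idea, assuming the three fields from the arrest warrant example. The field names, the question texts and the `answer_source` callback are all placeholders; in the game the questions would be shown in the chat window as if the model had asked them.

```python
# Hypothetical field names taken from the thread's "arrest warrant" example.
REQUIRED_FIELDS = ["suspect", "means", "motive"]

# Scripted questions, shown to the player as if the model had asked them.
QUESTIONS = {
    "suspect": "Who do you want to arrest?",
    "means": "How did they do it?",
    "motive": "Why did they do it?",
}


def next_question(collected):
    """Return (field, question) for the first unanswered field, or None
    when every required field has a non-empty answer."""
    for field in REQUIRED_FIELDS:
        if not collected.get(field, "").strip():
            return field, QUESTIONS[field]
    return None


def run_form(answer_source):
    """Ask the scripted questions in order until all fields are filled.

    answer_source(question) is a placeholder for whatever returns the
    player's reply; an empty reply causes the same question to be re-asked.
    """
    collected = {}
    while (pending := next_question(collected)) is not None:
        field, question = pending
        collected[field] = answer_source(question)
    return collected
```

Only once `run_form` returns would the collected values be handed to the model or to the function, so the model never gets a chance to fill in a blank by itself.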

PriNova · June 24, 2023, 12:39am

I think this function-calling approach is better suited to one-shot prompts.
In my opinion it is beneficial to switch back and forth between models for the tasks at hand and to go back to good old GUI design.
I mean, why ask “What is the weather in San Francisco, CA for tomorrow?” if you could have a simple search field where you type the first 3-4 letters, choose from a list, and press a button to show the results in a modern graphical view, such as rain radar, cloud and wind maps?

Are we really back to the days when text mattered so much? No. Even if NLP tasks have their advantages (multi-language support, accessibility for color-blind users, voice-to-text, etc.), we should return to a balance between GUI and NLP, using prompt input fields where they are appropriate and buttons, sliders, tree views, labels, etc. in the traditional way that worked well before.

I believe we are in a time where ChatGPT is only a halfway step toward a more interactive way of working. Once systems are ready to fully leverage speech synthesis, a universal translator, etc., plugins will be most interesting, especially when Microsoft launches its Windows Copilot and the ability to have plugins as services: a modular Windows system tailored to the user, private or business.

hyper5ai · June 24, 2023, 6:12am

I am also building a game. My multiplayer video game (Chasm Conquerors’ Challenge) is very large, so I am going to get the players of my game to build almost all of it. (Possibly you may end up doing the same thing. Possibly we could help each other to build our respective games. I am currently using GPT-4 to create four extremely unorthodox main game design documents.)

Numbers are ALIVE! Play is learning to learn. Language is a game that you win, when you convey your meaning clearly and succinctly. Ergo working with AI is a very fluid, dynamic and organic process. Moreover GPT-4 is a baby AI, so it stands to reason it will have teething issues. I suspect GPT-5 through GPT-7 will also be baby AIs.

Try GOING with the FLOW! Try thinking outside the box. Try utilizing extremely unorthodox methodologies. Try adapting your game to allow for GPT-4’s weaknesses and to take advantage of GPT-4’s strengths. Imagine yourself in a canoe in a very young, raging river: that is GPT-4. All you can do is try to avoid the rocks, survive any waterfalls and make your way to shore as swiftly as you can. Yes, the documentation is of exceedingly little use, as even its creators do not fully understand GPT-4. Ergo we are all heading into totally uncharted waters, GPT-4’s creators included. Hence far and away the best advice is:

NEVER give in! NEVER give up! NEVER count the odds!

EricGT · July 2, 2023, 12:27am

A post was split to a new topic: Interested in learning prompting skills

EricGT · July 16, 2023, 9:17am

A post was merged into an existing topic: Interested in learning prompting skills

sandrat · June 24, 2023, 9:17pm

Sorry, lurker shuffles out of the shadows… Question: was your game the one that did well in the lablab hackathon? Sorry, totally off topic.

vgopinath · June 25, 2023, 1:16am

100% true, I’ve raised a separate thread. GPT is 6/10 now; it was 9/10 a month ago.

vgopinath · June 25, 2023, 1:25am

Wake up, GPT-4 team. Yesterday I tested Bard, and it seems (my son tested it too) it is now far better.

codie · June 25, 2023, 5:54am

Hey @hazel1,

I used to be a game dev, and believe me, it is mostly thankless and lots of iterating is required. But in the end, it is something that you should do for yourself, not for other people. You should think of it as a hill that you intend to conquer. A hill is a hill no matter how you look at it; the only thing that matters is “do you want to get to the top of it or not”?

I’d say don’t give up hope on it. It seems really cool, and when you release it, it will set the game development community’s imagination on fire. There are some really good techniques out there that you can lean on to get some good results. You’ll just have to keep on iterating. I’d say add some rule checking to the system. Simple prompts like:

"Does the character reference themselves?"
"Does the character's line match their backstory?"
etc..

I’ve outlined how I have kept an experimental sales representative bot on track here:

I’ve learned better techniques (like the two-shot examples you mentioned), so mine are a little outdated. But the bot was good enough to talk to customers and stay on topic. Additionally, though, if you are worried about quality control, you could take a more guarded approach and either modify the user’s questions to conform to something that gets the results you are looking for, or generate several questions from a player query like “how did you get here?” where the generated questions are in the style and form you need for good results.

In gaming it’s all about giving the player the illusion of being able to do more. They don’t necessarily need to interact with the AI in a flawlessly human way; it just needs to feel that way. So if you need to make template responses that get modified and Frankensteined together, or guide users in the prompt crafting, then you can do that. You’re the first one making this type of game, so you get to set the standard and people just have to deal with it. Trust me, as long as it’s fun, nobody will notice or care. You think anyone really asks why we are limited to 52 cards, 4 suits and 2 colors in card games? Nope, that’s just how it is.

generalbadwolf · June 25, 2023, 6:04am

It’s because you’re frustrated. The AI gets smarter and more confident the more so you are.

vb · June 25, 2023, 8:15am

It seems there is a misconception about what the perceived issue is: we are talking about doing the same thing as before and getting different results. It’s not about never having been able to get good results in the first place.

Put differently: if I use ChatGPT4 and copy & paste my prompts, then I am doing this because my prompts deliver the expected results. If this changes and the prompts no longer produce the expected results, then that’s what makes people question the service and not themselves. A healthy attitude, by the way, but also not super helpful at times.
My experience is that every time there is a new model in the background, a rewrite of the prompts is needed, which is somewhat expected from a beta version. It also requires adapting the way you communicate during a chat conversation.
But then again, this is not what OP is describing. OP is additionally talking about getting different results from one day to the next, and this can maybe be explained by a small sample size.
So hang in there and if you made it this far you can conquer this hill as well!

EricGT · July 16, 2023, 9:18am

A post was merged into an existing topic: Interested in learning prompting skills

EricGT · July 16, 2023, 9:19am

A post was merged into an existing topic: Interested in learning prompting skills

dmcov · June 26, 2023, 6:35pm

Have you created a sanity check to help filter incorrect results using ChatGPT itself? For instance, have it set the basics of the game:

The Chef committed the Crime
The maid seems suspicious
The butler gave the chef a gun

Then use these as a litmus test to reparse any answers and reject them if they don’t follow the laid out paradigm by interrogating the answers themselves before they are shown? It would help control hallucination. The other way to do it is to “somewhat” pre-can a master set of results, choose from those, but have ChatGPT rewrite them with variation.
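
As a rough sketch of that litmus test, assuming the three facts above are the canon: every candidate line is checked against each fact before it is shown. `ask_model` is a placeholder for whatever yes/no model call you use, not a real library function, and the prompt wording is only an example.

```python
# Ground-truth facts from the post, stored as plain statements.
CANON = [
    "The chef committed the crime.",
    "The maid seems suspicious.",
    "The butler gave the chef a gun.",
]


def contradicts_canon(candidate, ask_model):
    """Return the canon facts that the candidate line contradicts.

    ask_model(prompt) is an assumed callback that sends the prompt to a
    model and returns its text reply, expected to start with yes or no.
    """
    broken = []
    for fact in CANON:
        prompt = (
            f"Fact: {fact}\nLine: {candidate}\n"
            "Does the line contradict the fact? Answer yes or no."
        )
        if ask_model(prompt).strip().lower().startswith("yes"):
            broken.append(fact)
    return broken
```

An empty list means the line can be shown; anything else means reject it and regenerate, or fall back to a pre-canned answer.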

seaofarrows · June 26, 2023, 7:29pm

I’ve also been working on text adventures using GPT, and gave up on 3.5 pretty fast. GPT-4 is much more rational and coherent. 3.5 might be cheaper, but I think this is a use case where you need GPT-4 if you want joy.

mechanicmuthu · June 28, 2023, 4:14am

Levers you can probably pull/push to get the desired behaviour:

Give more examples using user/assistant pairs: system message, user, assistant, user, assistant, user, assistant, user. (However much more work it is for you, this helps.) The more the better.
Use gpt-4 (if it’s worth the cost; gpt-3.5 is not production ready).
Limit the number of output tokens (and state the limit in the system message too).
Obtain more than one variation of the answer using the n parameter. Have a separate evaluator GPT independently check whether the response has the required format and tone; if not, check the next variation.
Setting a low temperature does not help. If GPT was making mistakes, it will repeat them at temperature 0. With a higher temperature (0.7) you have a better chance that one of the responses is perfect.
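
The "several variations plus an evaluator" lever could look roughly like this. It is a sketch only: `candidates` stands in for the texts of the n choices returned by one request, and `evaluate` for the separate evaluator call; both names are assumptions, and the length check in the usage lines is just a stand-in for a real format and tone check.

```python
def first_acceptable(candidates, evaluate):
    """Return the first candidate the evaluator accepts, or None if the
    evaluator rejects all of them."""
    for text in candidates:
        if evaluate(text):
            return text
    return None


# Usage with a stand-in evaluator that only enforces a length limit.
variations = ["too long " * 50, "Short and in character."]
chosen = first_acceptable(variations, lambda text: len(text) <= 80)
```

If `first_acceptable` returns None, the app can request another batch or fall back to a template answer.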
