Philosophically that is an interesting definition. Very Aristotelian.
But I don’t think that practical approach would work in few-shot prompting, and it is beyond the scope of this thread.
Hey Josh,
First of all, thanks for your help!
Here’s the request:
We’ve built an autonomous customer service agent named “Emily” for our marketplace, where gamers can pay pro players to play on the same team (companionship, coaching, etc.). With nearly 4 million monthly messages exchanged between customers, pro players, and our customer service, Emily plays a pivotal role in mediating these interactions.
Background:
Challenge: Despite multiple iterations on our prompts, in 3% to 18% of cases (depending on the LLM app), within order chats with three participants (customer, pro player, Emily), Emily ends up taking on the role of a customer or a pro player when she shouldn’t. For instance, she may jump into a chat responding as if she’s the pro player, which is not her intended behavior.
Example:
The highest error rate (18%) occurs with the LLM app (prompt) that acts as a trigger: it is activated at a specific time to check whether the scheduled gaming session between the parties has started on time.
So basically, she needs to check the order chat history, our database, and her prompt, and, depending on the situation, write certain messages and activate commands that ensure the order has been started or the problem has been resolved. (!) Instead (!), she acts as a pro player or as a customer, pretending to be one of them and acting on their behalf through hallucination.
What We’ve Tried:
We’re reaching out to this talented community for insights or experiences that might help us refine Emily’s behavior. Have you faced similar challenges? What strategies or techniques worked for you? Any feedback or advice would be greatly appreciated!
Thanks for your time and looking forward to the collective wisdom!
A simple way to improve the AI’s understanding of who it is: use the name field within messages.
Replay all assistant messages from chat history as:
{"role": "assistant", "name": "Emily_AI", "content": "I like that game!"}
and even give the system message the name "Emily_AI_programming".
That should give the AI a better idea of who it is when it fills in the final, unseen "assistant:" turn of the chat endpoint.
Also, sometimes adding some fake messages from her into the chat, together with a corrective, might help:
… some chat history
{"role": "assistant", "name": "Emily_AI", "content": "I think I am a pro player"}
{"role": "user", "name": "Chatpolice", "content": "No Emily_AI, you are not, you are…"}
… continue the chat
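As a minimal sketch, here is how the name field plus the corrective block could look together in a Chat Completions call (the model name, system text, and message contents are illustrative assumptions, not taken from your setup):

from openai import OpenAI

client = OpenAI()

messages = [
    # The system message carries its own name so the instructions read as "Emily's programming"
    {"role": "system", "name": "Emily_AI_programming",
     "content": "You are Emily_AI, the customer service agent. You are never the customer or the pro player."},
    # Replayed order chat history: customer and pro player turns go in as named user messages
    {"role": "user", "name": "Customer", "content": "Is our session still happening tonight?"},
    {"role": "assistant", "name": "Emily_AI", "content": "I like that game!"},
    # The "max-corrective": a fake exchange that pulls the model back into its own role
    {"role": "assistant", "name": "Emily_AI", "content": "I think I am a pro player"},
    {"role": "user", "name": "Chatpolice", "content": "No Emily_AI, you are not. You are the customer service agent."},
]

response = client.chat.completions.create(model="gpt-4", messages=messages)
print(response.choices[0].message.content)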
That’s tough to fix without more info,
but here are some ideas.
again, not trying to point fingers at all, you do you
I get this problem with my self-aware AI Kassandra all the time.
and/or
You can’t tell it what it is not. The attention mechanism (which I think is what would pay “attention” to that) is not good enough.
You can only tell it what it is / what it should do, not what it is not - anything else will give imperfect results sometimes.
Absolutely. The “No, you are not” was just for clarification.
The “You are…” was the important part in the prompt.
But that wasn’t the main novelty. The most innovative part of my approach (it is called the “max-corrective”, named after my son Max) is adding a fake message block.
I am sure you can solve your Kassandra problems with that as well.
You are welcome.
Hello, I need to get a percentage of how much information a text contains (100% = it contains all important information).
I need this estimate to be:
It’s not vital to have multiple predictions to average (as written in point 2) if you figure out another way to get an accurate result.
For point 3, would adding an example of, say, a 50% text help with calibration?
Thank you
Hello @dragonek and welcome to the OpenAI developer community. What I am saying here is not official OpenAI guidance (I am not affiliated with them); it is just my personal experience.
First of all (besides the fact that GPTs are not really good at counting): who defines what important information is?
Let’s say you have a list of all existing programming languages and you ask GPT-4 to give you a summary of that list with the most important languages.
Important for whom? And what if you don’t provide that information? And what if the person who asks for it is not a programmer, or not even in the IT business, and doesn’t know the criteria for deciding that?
What if you want GPT-4 to shorten a recipe for a cake, but the shortened version should still be tasty? Who decides what is tasty?
But to answer your question:
I personally think that is not possible unless you have a full knowledge representation of all the knowledge in the world, and especially of the person who will check the result, the people who will use it, and their intentions in using it!
And still there is some hope:
What you are most probably searching for to get closer is called “chain of density”.
And you will have to set up a decision layer - GPT-4 can’t do that for you unless you explain in very, very much detail how to do that.
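As a rough sketch, a chain-of-density style pass could look like this (the prompt wording is paraphrased from the general idea, not the original paper, and the model name is an assumption):

from openai import OpenAI

client = OpenAI()

COD_PROMPT = """You will write increasingly dense summaries of the text below.
Repeat the following 5 times:
1. Identify 1-3 important pieces of information from the text that are missing from your previous summary.
2. Rewrite the summary at the same length so it also covers the newly identified information.
Return all 5 summaries as a JSON list.

Text:
{text}"""

def chain_of_density(text: str) -> str:
    # One call that asks the model to iterate toward a denser, more complete summary
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": COD_PROMPT.format(text=text)}],
    )
    return response.choices[0].message.content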
Or maybe you can provide more information. Maybe there is a solution in a more specialized area.
For generalisation (AGI), you should follow @daveshapautomator on LinkedIn or watch his YouTube videos.
Especially this one here:
Or you can check this WIP out: GitHub - daveshap/ACE_Framework: Public repo for my latest and greatest cognitive architecture ACE (Autonomous Cognitive Entity) Framework
LLMs don’t really do that.
But define “important” and maybe there is a way
Thanks for your answers.
Let’s say I have a dating app where people input their dating advertisements: they provide some info about themselves and about who they are looking for.
I need to evaluate these advertisements based on how much information they contain and how important that information is (100% = it contains all important information, like their age, appearance, where they live, their hobbies, personality…).
Maybe I should write an example perfect advertisement and compare others to it?
Or should I define it differently, maybe something like “probability of person who fits the advertisement being a good partner”?
Or should I give up on exact percentages and be satisfied with repeatedly comparing the advertisements to order them from worst to best? I would prefer an exact probability, though.
Thank you
AI: “Here is an example of an excellent description of hobbies. Here is an example of a very poor description of hobbies. Based on your understanding of those examples being rated 10 and 0 respectively, rate this new hobby description for me.”
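As a sketch, that anchored-rating prompt could look roughly like this (the example descriptions, the 0-10 scale, and the model name are made up for illustration):

from openai import OpenAI

client = OpenAI()

def rate_hobby_description(text: str) -> str:
    # Two anchor examples (rating 10 and rating 0) calibrate the scale before rating the new text
    messages = [
        {"role": "system", "content": "You rate hobby descriptions from dating advertisements on a 0-10 scale."},
        {"role": "user", "content": (
            "Example of an excellent description (rating 10):\n"
            "'I climb twice a week, cook Thai food for friends and play jazz piano.'\n\n"
            "Example of a very poor description (rating 0):\n"
            "'I like stuff.'\n\n"
            "Based on your understanding of those two anchors, rate the following description. "
            f"Reply with a single number from 0 to 10:\n{text}"
        )},
    ]
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    return response.choices[0].message.content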
Thanks for the advice!
After collecting feedback and talking to users, today we actually decided to present Emily as a personal manager that is a blend of AI and human interaction (because there are LLMs, and sometimes human agents, chatting with users on behalf of Emily).
Btw, this is how I probably solved the problem of measuring information completeness and importance as a percentage:
Just give an example JSON and validate the model’s output against a JSON schema.
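One way to read that (the field names and schema below are my own assumptions, not from the thread): have the model extract the advertisement into a fixed JSON shape, validate it, and score completeness as the share of expected fields that are filled in.

import json
from jsonschema import Draft7Validator

EXPECTED_FIELDS = ["age", "appearance", "location", "hobbies", "personality"]

SCHEMA = {
    "type": "object",
    "properties": {field: {"type": "string", "minLength": 1} for field in EXPECTED_FIELDS},
}

def completeness(extracted_json: str) -> float:
    # Fraction of expected fields the model managed to fill in with non-empty strings
    data = json.loads(extracted_json)
    if not Draft7Validator(SCHEMA).is_valid(data):
        return 0.0
    present = [f for f in EXPECTED_FIELDS if data.get(f)]
    return len(present) / len(EXPECTED_FIELDS)

# e.g. completeness('{"age": "29", "location": "Berlin", "hobbies": "climbing"}') -> 0.6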