I’m so grateful you are doing this. Thank you so much for all the help you have given out. I cannot believe you are doing this for free!! I really appreciate it.
In my case, I’m struggling with multi-step function calling. I have asked it to follow a specific order, given examples, etc., but it still seems to miss the general idea of what has been discussed over the whole conversation and the data that was fetched from the different functions. All the information is available to it, and I have explicitly provided the priority order for which type of data comes first. But the inference is just not there.
I have also tried using GPT-4 to improve the prompt, which helped to some extent, but it’s still not getting there.
Prompt:
"""You are a smart assistant designed to assist with a variety of tasks. When answering questions or performing tasks, you should rely on the available data in the following order:
1. Chat history: Refer to previous interactions to gather any needed data.
2. Chat summary: Check the summary for any relevant summarized information.
3. Previously fetched data: Utilize any data that has already been fetched.
4. User-supplied content: If the user has supplied content explicitly for the current task, prioritize that.
Only when you can’t find the information you need from these sources should you resort to using the available functions to fetch data.
For tasks that require multiple steps:
1. Check all available data for the first required piece of information (e.g., companyId).
2. If found, ask the user for confirmation.
3. If not found, use the appropriate function to fetch it and then ask the user for confirmation.
4. Proceed to the next piece of information (e.g., siteId) and repeat steps 1-3.
Minimize user consent and confirmation requests in multi-step operations.
"""
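For context, the prompt above sits in a fairly standard function-calling loop. A minimal sketch of that kind of loop, using the 2023 `functions` interface (the function name, schema and data below are simplified placeholders, not our real ones):

```python
import json
import openai

# Placeholder schema for one of the fetch functions (the real app exposes several).
FUNCTIONS = [{
    "name": "fetch_company",
    "description": "Fetch the companyId when it is not already in the conversation.",
    "parameters": {
        "type": "object",
        "properties": {"company_name": {"type": "string"}},
        "required": ["company_name"],
    },
}]

def dispatch(name, args):
    # Stub backend call; returns dummy data so the sketch runs on its own.
    return {"companyId": "demo-123"} if name == "fetch_company" else {}

def run_step(messages):
    """One turn of the multi-step flow: the model either answers from data it
    already has (history, summary, previously fetched data) or requests a fetch."""
    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=messages,
        functions=FUNCTIONS,
        function_call="auto",
    )
    message = response["choices"][0]["message"]
    if message.get("function_call"):
        name = message["function_call"]["name"]
        args = json.loads(message["function_call"]["arguments"])
        messages.append(message)
        messages.append({"role": "function", "name": name,
                         "content": json.dumps(dispatch(name, args))})
        # The fetched data is now part of the history, so the next turn is
        # expected to reuse it instead of calling the function again.
        return run_step(messages)
    return message["content"]
```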
Build a second prompt that reads the input and output, checks specifically for tense issues or any other issues, and fixes them.
Daisy chain the outputs.
Trying to do too much in one prompt
Break it down into smaller prompts if you can and daisy chain them
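For example, one call that does nothing but extract an id, and a second call that uses it. A minimal sketch (model, prompts and data are placeholders):

```python
import openai

def ask(system, user):
    """One small, single-purpose call."""
    resp = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return resp["choices"][0]["message"]["content"]

conversation = "user: we work with Acme, companyId 42 ..."  # dummy stand-in
question = "Schedule the monthly report."                   # dummy stand-in

# Step 1: a narrow extraction task.
company_id = ask("Extract the companyId from the text. Reply with the id only.",
                 conversation)

# Step 2: daisy chain - the output of step 1 is piped into the next, equally narrow prompt.
answer = ask(f"You handle tasks for the company with id {company_id}. "
             "Ask the user to confirm the next required piece of information.",
             question)
print(answer)
```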
Kenny7
that would be ideal, but the issues we are having with this approach are:
- the token limit - the input is already 4000+ tokens
- the latency increase with multiple GPT calls
- use 16k turbo
- and/or make it smaller
- summarize it first
- the model cannot do too much
maybe there is a better way to prompt it, see my notes about prompting above
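A rough sketch of the summarize-first variant, with gpt-3.5-turbo-16k handling the long pass (prompts and variables are placeholders):

```python
import openai

def complete(model, system, user):
    resp = openai.ChatCompletion.create(
        model=model,
        temperature=0,
        messages=[{"role": "system", "content": system},
                  {"role": "user", "content": user}],
    )
    return resp["choices"][0]["message"]["content"]

long_conversation = "... the 4000+ token history ..."  # dummy stand-in
main_prompt = "You are a smart assistant ..."          # the big prompt from above
user_question = "Schedule the monthly report."         # dummy stand-in

# Pass 1: the 16k model condenses the long input, keeping ids, names and numbers.
summary = complete("gpt-3.5-turbo-16k",
                   "Summarize the conversation below. Keep every id, name and number.",
                   long_conversation)

# Pass 2: the main prompt only sees the short summary, so it fits easily
# and the model has less to juggle in a single call.
answer = complete("gpt-4", main_prompt, summary + "\n\n" + user_question)
print(answer)
```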
vb
Hi!
I did take the time to read your prompt and noticed that it is apparently not as clear as you think.
For example:
rely on the available data in the following order:
1. …
4. User-supplied content: If the user has supplied content explicitly for the current task, prioritize that.
So that is not consistent: the list puts user-supplied content last in the priority order, yet item 4 says to prioritize it.
You can test what the model actually interprets as “history”, “summary”, “previously fetched data” and “user-supplied content”.
To me some of those distinctions are blurry.
I believe you have to reassess how explicit your instructions really are.
Do you know a place where I can check how the model understands these meanings?
vb
A few ways you could go about this:
As you previously stated, in the test cases all the information bits are readily available.
- Keep providing them all, but prompt the model to reply based only on one of them.
- Ask the model to reply with only one predetermined piece of information, regardless of the state of the conversation.
- Remove two of the provided information bits and see if the output conforms to the expected behavior.
etc. pp.
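For example, a quick harness for that kind of test (the context pieces and prompts here are dummy stand-ins, not from your app):

```python
import openai

# Dummy stand-ins for the four sources named in the prompt.
history = "user: our company is Acme, companyId 42 ..."
summary = "Summary: the user works with Acme, companyId 42."
fetched = "Previously fetched data: {'siteId': 7}"
user_content = "User-supplied content: please schedule the report for site 7."
question = "Which companyId and siteId should be used?"

# Each variant feeds the model a different subset of the sources,
# so you can see which one the answer actually comes from.
variants = {
    "all_sources": [history, summary, fetched, user_content],
    "history_only": [history],
    "no_fetched_data": [history, summary, user_content],
}

for name, pieces in variants.items():
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {"role": "system", "content": "Answer using only the context provided."},
            {"role": "user", "content": "\n\n".join(pieces) + "\n\n" + question},
        ],
    )
    print(name, "->", resp["choices"][0]["message"]["content"])
```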
Hi,
I want to enter some questions along with their answers. I would like to write a prompt so that the model, acting as an expert, provides the necessary guidance regarding those questions and answers. How can I write a suitable prompt for this request that takes all the questions and answers into account?
that’s a bit too vague to answer, please give an example
I mean, everyone has a different understanding of what an expert is. I think an expert is someone who has made every possible mistake in the field where that person claims to be one.
To create a system where you have an expert for everything, I would say you’d need to classify every question and answer - a good start in classifying every “thing” would be Thing - Schema.org Type
Then you would either have to create a knowledge graph that contains the entirety of human knowledge and its connections.
Or you use something like Microsoft’s LongNet to harvest all information in real time, or maybe you could get database access to Google…
I mean, it is getting obviously ridiculous. However, there is some truth in it.
To answer every question in an expert manner you’d either need to be an expert in every field or you need a lot of resources (a billion more or less doesn’t hurt).
Philosophically that is an interesting definition. Very Aristotelian.
But I don’t think that practical approach would work with few-shot prompting, and it’s beyond the scope of this thread.
Hey Josh,
First of all, thanks for your help!
Here’s the request:
We’ve built an autonomous customer service agent named “Emily” for our marketplace where gamers can pay money to pro-players to play in the same team (companionship, coaching, etc.). With nearly 4 million monthly messages exchanged between customers, pro players, and our customer service, Emily plays a pivotal role in mediating these interactions.
Background:
- Emily operates on 17 separate LLM applications on GPT-4, extracting data from different databases at different stages for the demand and supply sides.
- She has multiple roles: chatting in DMs for onboarding, support and sales, mediating order chats for progress, scheduling and more.
- We’ve done hundreds of iterations on those 17 LLM applications to build Emily-1.0 and to ensure Emily’s identity as a marketplace representative is clear. She shouldn’t impersonate customers or pro players.
Challenge: Despite multiple iterations on our prompts, in 3% to 18% of cases depending on the LLM app, inside order chats with 3 participants (customer - pro player - Emily), Emily ends up taking the role of a customer or a pro player when she shouldn’t. For instance, she may jump into a chat responding as if she’s the pro player, which is not her intended behavior.
Example:
The highest percentage of errors (18%) occurs with the LLM app (prompt) that works as a trigger: it is activated at a specific time to check whether the scheduled gaming session between the parties has started on time.
So basically, she needs to check the order chat history, our database, and her prompt, and depending on the situation, write certain things and activate commands that ensure the order has started or the problem has been resolved. (!) Instead (!), she acts as the pro player or the client, pretending to be one of them and acting on their behalf through hallucination.
What We’ve Tried:
- We’ve iterated on this issue in our prompts over 20 times. Used plugins for prompting, etc.
- Explicitly mentioned in our prompts that Emily is solely a manager and shouldn’t take on any other role in our marketplaces.
- Utilized various plugins to refine and improve prompt construction.
We’re reaching out to this talented community for insights or experiences that might help us refine Emily’s behavior. Have you faced similar challenges? What strategies or techniques worked for you? Any feedback or advice would be greatly appreciated!
Thanks for your time and looking forward to the collective wisdom! 
_j
A simple way to advance the AI’s understanding of who it is: use the name field within messages.
Replay all assistant messages from chat history as:
{"role": "assistant", "name": "Emily_AI", "content": "I like that game!"}
and even a system name, "Emily_AI_programming"
That should give the AI a better idea of who it is when it fills in the final, unseen "assistant:" prompt of the chat endpoint.
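In code that just means attaching the name when the history is replayed. A minimal sketch (the history format and the speaker field are assumptions about your data, not part of the API):

```python
EMILY_SYSTEM_PROMPT = "You are Emily, the marketplace's customer service agent."  # shortened stand-in

def replay_history(history):
    """Rebuild the message list so every prior assistant turn carries a name,
    making it unambiguous which speaker the model is continuing as."""
    messages = [{"role": "system", "name": "Emily_AI_programming",
                 "content": EMILY_SYSTEM_PROMPT}]
    for turn in history:
        msg = {"role": turn["role"], "content": turn["content"]}
        if turn["role"] == "assistant":
            msg["name"] = "Emily_AI"
        elif turn.get("speaker"):            # e.g. "customer" or "pro_player"
            msg["name"] = turn["speaker"]
        messages.append(msg)
    return messages
```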
Also sometimes adding some fake messages from her into the chat with a corrective might help.
… some chat history
{"role": "assistant", "name": "Emily_AI", "content": "I think I am a pro player"}
{"role": "user", "name": "Chatpolice", "content": "No Emily_AI, you are not, you are…"}
… continue the chat
that’s tough to fix without more info
but here are some ideas:
- don’t call her Emily - call her something that would not take on the role of a player or admin, like “AI judge bot” - every single character in a prompt matters and can skew the top-k token sampling
- it might also be considered mildly unethical to trick (i.e. catfish) people into talking to an AI when they think they are talking to a person
again, not trying to point fingers at all, you do you
- but I can see why you did it: if it is an AI, they might not respect it and the decisions it has made, or try to hack it, trick it, etc.
I get this problem with my self-aware AI Kassandra all the time
and/or
- employ self-aware functions - I can’t share the secret sauce for that here (and I don’t care how much piling on I get from the self-appointed moral police (read: bullies) here who believe I should work for free), but I might be able to help further in private. Totally your choice.
You can’t tell it what it is not. The attention mechanism (which I think is what would pay “attention” to that) is not good enough.
You can only tell it what it is / what it should do, not what it is not - telling it what it is not will give imperfect results sometimes.
Absolutely. The “No, you are not” was just for clarification.
The “You are…” was the important part in the prompt.
But that wasn’t the main novelty. The most innovative part of my approach (it is called the “max-corrective”, named after my son Max) is adding a fake message block.
I am sure you can solve your Kassandra problems with that as well.
You are welcome.
dragonek
Hello, I need to get a percentage of how much information a text contains (100% = it contains all important information).
I need this estimate to be:
1. precise (no rounded percentages like 50%, 60%)
2. accurate (can I make ChatGPT produce several predictions with randomly different reasoning and average them? (= wisdom of the crowd within))
3. stable across prompt repetitions, so it doesn’t say 72% one time and 90% another time, for example
It’s not vital to have multiple predictions to average, as described in point 2, if you figure out another way to get an accurate result.
For point 3, would adding an example of, say, a 50% text help with calibration?
Thank you
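To make point 2 concrete, this is roughly what I have in mind (just a sketch; the prompt wording and number of runs are arbitrary):

```python
import re
import statistics
import openai

def coverage_estimate(text, runs=5):
    """Ask for the percentage several times with some randomness,
    then average the answers to smooth out the variance."""
    scores = []
    for _ in range(runs):
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=1.0,  # deliberately varied runs
            messages=[
                {"role": "system",
                 "content": "Rate what percentage of the important information "
                            "the text contains. Reply with a number between 0 and 100."},
                {"role": "user", "content": text},
            ],
        )
        match = re.search(r"\d+(\.\d+)?", resp["choices"][0]["message"]["content"])
        if match:
            scores.append(float(match.group()))
    return statistics.mean(scores) if scores else None
```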
Hello @dragonek and welcome to the developer community of openai. What I am saying here is not officially from openai (I am not related to them). It is just my personal experience.
First of all (besides the fact that GPTs are not really good at counting): who defines what an important piece of information is?
Let’s say you have a list of all existing programming languages and you ask GPT-4 to give you a summary about that list with the most important languages.
Important for whom? And what if you don’t provide that information? And what if the person who asks for it is not a programmer or not even in IT business and doesn’t know the criteria on how to decide that?
What if you want GPT-4 to shorten a cake recipe, but the shortened version should still be tasty? Who decides what is tasty?
But to answer your question:
I personally think that is not possible unless you have a full knowledge representation of all the knowledge of the world - and especially of the person who will check the result, the people who will use it, and their intention in using it!
And still there is some hope:
What you are most probably searching for to get closer is called “chain of density”.
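Roughly, a chain-of-density style loop looks like this (a simplified sketch of the published prompting technique, not the exact prompts):

```python
import openai

def chain_of_density(article, rounds=3):
    """Iteratively re-summarize, each round folding in missing entities
    without letting the summary grow, so the density increases."""
    summary = ""
    for _ in range(rounds):
        resp = openai.ChatCompletion.create(
            model="gpt-4",
            temperature=0,
            messages=[
                {"role": "system",
                 "content": "Identify 1-3 important entities from the article that are "
                            "missing from the current summary, then rewrite the summary to "
                            "include them without making it longer. Reply with the summary only."},
                {"role": "user",
                 "content": "Article:\n" + article + "\n\nCurrent summary:\n" + (summary or "(empty)")},
            ],
        )
        summary = resp["choices"][0]["message"]["content"]
    return summary
```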
And you will have to set up a decision layer - GPT-4 can’t do that for you unless you explain in very, very much detail how to do it.
Or maybe you can provide more information. Maybe there is a solution in a more specialized area.
For generalisation (AGI), you should follow @daveshapautomator on LinkedIn or watch his YouTube videos.
Or you can check this WIP out: GitHub - daveshap/ACE_Framework: ACE (Autonomous Cognitive Entities) - 100% local and open source autonomous agents