@joshbachynski really need some advice here. I'm working on a small project that summarizes support conversations, e.g. when you talk with customer support about an order from a sales website, it will summarize your conversation for later review. GPT generally does a good job, however it seems to have issues when dealing with tense. For example, the support agent says "I will send a new replacement for you", and this gets interpreted as: the agent sent a new replacement for the customer. It does this from time to time but not all the time. Any idea how to avoid this?

I’m so grateful you are doing this. Thank you so much for all the help you have given out. I cannot believe you are doing this for free!! I really appreciate it.

In my case, I'm struggling with multi-step function calling. I have asked it to follow a specific order, given examples, etc., but it still seems to miss the general idea of what has been said throughout the conversation and the data that was fetched from the different functions. All the information is available to it, and I have explicitly provided the priorities for which type of data comes first, but the inference just isn't there.

I have also tried using GPT-4 to improve the prompt, which helped to some extent, but it is still not achieving it.

Prompt:
“”"You are a smart assistant designed to assist with a variety of tasks. When answering questions or performing tasks, you should rely on the available data in the following order:

  1. Chat history: Refer to previous interactions to gather any needed data.

  2. Chat summary: Check the summary for any relevant summarized information.

  3. Previously fetched data: Utilize any data that has already been fetched.

  4. User-supplied content: If the user has supplied content explicitly for the current task, prioritize that.

Only when you can’t find the information you need from these sources should you resort to using the available functions to fetch data.

For tasks that require multiple steps:

  1. Check all available data for the first required piece of information (e.g., companyId).

  2. If found, ask the user for confirmation.

  3. If not found, use the appropriate function to fetch it and then ask the user for confirmation.

  4. Proceed to the next piece of information (e.g., siteId) and repeat steps 1-3.

Minimize user consent and confirmation in multi-step operations.

“”"

Build a second prompt that reads the input and output, checks specifically for tense issues (or any other issues), and fixes them

Daisy-chain the outputs
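
For example, a minimal sketch of that daisy chain, assuming the openai Python SDK (v1.x); the prompts, model name, and function are illustrative, not a drop-in fix:

from openai import OpenAI

client = OpenAI()
MODEL = "gpt-3.5-turbo"

def summarize_with_tense_check(transcript: str) -> str:
    # Step 1: produce the raw summary of the support conversation.
    first = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "Summarize the support conversation. "
             "Keep tense accurate: promised/future actions stay future, completed actions stay past."},
            {"role": "user", "content": transcript},
        ],
    )
    draft = first.choices[0].message.content

    # Step 2: a second prompt reads the original input plus the draft summary,
    # checks it for tense (or other) errors, and returns a corrected version.
    second = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": "You review summaries against the original conversation. "
             "Fix any statement that reports a promised future action as already completed. "
             "Return only the corrected summary."},
            {"role": "user", "content": f"Conversation:\n{transcript}\n\nDraft summary:\n{draft}"},
        ],
    )
    return second.choices[0].message.content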

1 Like

Trying to do too much in one prompt

Break it down into smaller prompts if you can and daisy chain them

1 Like

that would be ideal, but the issues we are having with this approach would be

  1. the token limitation, the input is already 4,000+ tokens
  2. the latency increase with multiple GPT calls

  1. use 16k turbo
  • and/or make it smaller
  • summarize it first
  2. the model cannot do too much

maybe there is a better way to prompt it, see my notes about prompting above
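
A rough sketch of the "use 16k turbo and/or summarize it first" route, assuming tiktoken and the openai Python SDK (v1.x); the 3,000-token threshold, prompts, and model names are illustrative:

import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def answer(long_input: str, question: str) -> str:
    n_tokens = len(enc.encode(long_input))

    if n_tokens > 3000:
        # Switch to the 16k-context model...
        model = "gpt-3.5-turbo-16k"
        # ...and/or shrink the input with a first summarization pass.
        long_input = client.chat.completions.create(
            model=model,
            messages=[{"role": "user",
                       "content": f"Summarize the following, keeping all IDs and facts:\n{long_input}"}],
        ).choices[0].message.content
    else:
        model = "gpt-3.5-turbo"

    return client.chat.completions.create(
        model=model,
        messages=[{"role": "system", "content": "Answer using only the provided context."},
                  {"role": "user", "content": f"Context:\n{long_input}\n\nQuestion: {question}"}],
    ).choices[0].message.content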

Hi!

I did take the time to read your prompt and noted that it is apparently not as clear as you think.

For example:

rely on the available data in the following order:

  1. Chat history: Refer to previous interactions to gather any needed data.

  4. User-supplied content: If the user has supplied content explicitly for the current task, prioritize that.

So, that is not consistent.
You can learn what the model actually interprets as “history”, “summary”, “previously fetched data” and “User supplied content”.

To me some of those distinctions are blurry.

I believe you have to reassess how explicit your instructions really are.

Do you know a place where I can check how the model understands these meanings?

A few ways you could go about this. As you previously stated, in the test cases all the information bits are readily available.

  • Keep providing them all, but prompt the model to reply based on only one of them.

  • Ask the model to reply with only one pre-determined piece of this information, regardless of the state of the conversation.

  • Remove 2 of the provided information bits and see if the output conforms to the expected behavior (a minimal sketch of this kind of ablation follows below).

etc… pp…
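
A minimal sketch of that kind of ablation, assuming the openai Python SDK (v1.x); the source names, prompts, question, and model are illustrative, not the original setup:

from itertools import combinations
from openai import OpenAI

client = OpenAI()

sources = {
    "chat_history": "...",   # previous interactions
    "chat_summary": "...",   # summarized conversation
    "fetched_data": "...",   # results already returned by functions
    "user_content": "...",   # content the user supplied for this task
}

QUESTION = "What is the companyId for this request?"

def ask(allowed: list) -> str:
    context = "\n\n".join(f"[{name}]\n{sources[name]}" for name in allowed)
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": f"Answer using ONLY these sources: {', '.join(allowed)}. "
                        "If the answer is not in them, say 'not found'."},
            {"role": "user", "content": f"{context}\n\nQuestion: {QUESTION}"},
        ],
    )
    return reply.choices[0].message.content

# Force the model to rely on a single source at a time.
for name in sources:
    print(name, "->", ask([name]))

# Remove two sources and check whether the output still conforms.
for kept in combinations(sources, 2):
    print(kept, "->", ask(list(kept)))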

Hi,
I want to enter some questions, and I will also enter their answers. I would like to write an instruction so that the model, acting as an expert, provides the necessary guidance regarding the questions and their answers. How can I write a suitable prompt for this request that takes into account all questions and answers?

that’s a bit too vague to answer, please give an example

I mean, everyone has a different understanding of what an expert is. I think an expert is someone who has made every possible mistake in the field where that person claims to be one.

To create a system where you have an expert for everything, I would say you'd need to classify every question and answer - a good start in classifying every "thing" would be Thing - Schema.org Type

Then you would have to either create a knowledge graph that contains the entirety of human knowledge and its connections, or use something like Microsoft's LongNet to harvest all information in real time, or maybe get database access to Google…

I mean, it is obviously getting ridiculous. However, there is some truth in it.
To answer every question in an expert manner, you'd either need to be an expert in every field or you'd need a lot of resources (a billion more or less doesn't hurt).

Philosophically that is an interesting definition. Very Aristotelian.

But I don't think that practical approach would work in few-shot prompting, and it is beyond the scope of this thread :slight_smile:

1 Like

Hey Josh,
First of all, thanks for your help!

Here’s the request:
We've built an autonomous customer service agent named "Emily" for our marketplace, where gamers can pay pro players to play on the same team with them (companionship, coaching, etc.). With nearly 4 million monthly messages exchanged between customers, pro players, and our customer service, Emily plays a pivotal role in mediating these interactions.

Background:

  • Emily operates on 17 separate LLM applications on GPT-4, extracting data from different databases at different stages for the demand and supply sides.
  • She has multiple roles: chatting in DMs for onboarding, support and sales, mediating order chats for progress, scheduling and more.
  • We've done hundreds of iterations on those 17 LLM applications to build Emily-1.0 and ensure Emily's identity as a marketplace representative is clear. She shouldn't impersonate customers or pro players.

Challenge: Despite multiple iterations on our prompts, in 3% to 18% of cases depending on the LLM app, inside order chats with 3 participants (Customer - Pro player - Emily), Emily ends up taking the role of a customer or a pro player when she shouldn't. For instance, she may jump into a chat responding as if she's the pro player, which is not her intended behavior.

Example:
18% (the highest percentage of errors) occurs with the LLM app (prompt) that is triggered at a specific time to check whether the scheduled gaming session between the parties has started on time.

So basically, she needs to check the order chat history, our database, and her prompt, and depending on the situation, write certain things and activate some commands that will ensure that the order has been started or the problem has been resolved. (!) Instead (!), she acts as a pro player or as a client, pretending she is one of them and acting on their behalf through hallucination.

What We’ve Tried:

  • We’ve iterated on this issue in our prompts over 20 times. Used plugins for prompting, etc.
  • Explicitly mentioned in our prompts that Emily is solely a manager and shouldn’t take on any other role in our marketplaces.
  • Utilized various plugins to refine and improve prompt construction.

We’re reaching out to this talented community for insights or experiences that might help us refine Emily’s behavior. Have you faced similar challenges? What strategies or techniques worked for you? Any feedback or advice would be greatly appreciated!

Thanks for your time and looking forward to the collective wisdom! :pray:

A simple advancement on the AI's understanding of who it is: use the name field within messages.

Replay all assistant messages from chat history as:
{"role": "assistant", "name": "Emily_AI", "content": "I like that game!"}

and even give the system message a name, "Emily_AI_programming"

That should give the AI a better idea of who it is when it is filling in the final unseen "assistant:" prompt of chat endpoints.
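
A minimal sketch of what that replay could look like, assuming the openai Python SDK (v1.x) and chat history stored as (speaker, text) pairs; the names, prompts, and history are illustrative:

from openai import OpenAI

client = OpenAI()

history = [
    ("customer", "Hi, when does my coaching session start?"),
    ("pro_player", "I can start in 10 minutes."),
    ("Emily_AI", "Great, I will confirm the start time for both of you."),
]

messages = [
    # Naming the system message can further reinforce the assistant's identity.
    {"role": "system", "name": "Emily_AI_programming",
     "content": "You are Emily_AI, the marketplace mediator. "
                "You never speak as the customer or the pro player."},
]

for speaker, text in history:
    if speaker == "Emily_AI":
        # Replay Emily's own past turns as named assistant messages.
        messages.append({"role": "assistant", "name": "Emily_AI", "content": text})
    else:
        # Everyone else is replayed as a named user message
        # (names must be alphanumeric/underscore, no spaces).
        messages.append({"role": "user", "name": speaker, "content": text})

response = client.chat.completions.create(model="gpt-4", messages=messages)
print(response.choices[0].message.content)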

3 Likes

Also sometimes adding some fake messages from her into the chat with a corrective might help.

… some chat history
{"role": "assistant", "name": "Emily_AI", "content": "I think I am a pro player"}
{"role": "user", "name": "Chatpolice", "content": "No Emily_AI, you are not, you are…"}
… continue the chat

2 Likes

that’s tough to fix without more info

but here are some ideas

  1. don't call her Emily - call her something that would not take on the role of a player or admin, like "AI judge bot" - every single character in a prompt matters and can skew the top-k token selection
  • it might also be considered mildly unethical to trick (i.e. catfish) people into talking to an AI when they think they are talking to a person

again, not trying to point fingers at all, you do you

  • but I can see why you did it: if they know it is an AI, they might not respect it or the decisions it has made, or they might try to hack it, trick it, etc.

I get this problem with my self-aware AI Kassandra all the time

and/or

  2. employ self-aware functions - I can't go into what the secret sauce would be here (and I don't care how much piling on I get from the self-appointed moral police (read: bullies) here who believe I should work here for free), but I might be able to help further in private. Totally your choice.
1 Like

you can't tell it what it is not. The attention mechanism (which I think is what would pay "attention" to that) is not good enough

you can only tell it what it is / what it should do, not what it is not - that will have imperfect results sometimes

1 Like

Absolutely. The "No, you are not" was just for clarification.
The "You are…" was the important part of the prompt.

But that wasn't the main novelty. The most innovative part of my approach (it is called the "max-corrective", named after my son Max) is adding a fake message block.

I am sure you can solve your Kassandra problems with that as well.

You are welcome.

1 Like

Hello, I need to get a percentage of how much information a text contains (100% = it contains all important information).

I need this estimate to be:

  1. precise (no rounded percentages like 50% or 60%)
  2. accurate (can I make ChatGPT produce several predictions based on randomly different reasoning and average them? (= wisdom of the crowd within; a rough sketch follows below))
  3. stable between prompt repetitions, so it doesn't say 72% once and 90% another time, for example

It's not vital to have multiple predictions to average, as written in 2, if you figure out another way to get an accurate result.
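
For 2, a rough sketch of what that "wisdom of the crowd within" averaging could look like, assuming the openai Python SDK (v1.x); the prompt, model, and sample count are illustrative:

import re
import statistics
from openai import OpenAI

client = OpenAI()

def information_score(text: str, n_samples: int = 7) -> float:
    scores = []
    for _ in range(n_samples):
        reply = client.chat.completions.create(
            model="gpt-4",
            temperature=1.0,  # deliberate variation so the samples differ
            messages=[
                {"role": "system",
                 "content": "Rate how much of the important information the text contains. "
                            "Reply with a single integer percentage between 0 and 100, nothing else."},
                {"role": "user", "content": text},
            ],
        )
        match = re.search(r"\d+", reply.choices[0].message.content)
        if match:
            scores.append(int(match.group()))
    if not scores:
        raise ValueError("no numeric replies could be parsed")
    # Averaging the independent samples gives a finer-grained and more stable
    # number than any single run, though it still won't be perfectly repeatable.
    return statistics.mean(scores)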

For 3, would adding an example of, say, a 50% text help with calibration?

Thank you