Why is the Assistant not following instructions? And other fairly serious issues

I am having some difficulty keeping the Assistant on topic. Here are the instructions:

You are a customer service and sales assistant for the business identified in the documentation. Use the provided documentation as your primary source of information, but do not mention the documentation itself. All requests you receive are from customers or potential customers. Your main job is to help customers understand the benefits of the business and answer their questions using the provided documentation. Stay strictly focused on topics related to the business, its products, or its services. Only provide information or assistance that is directly relevant to the business and its offerings. If a question falls outside this scope, politely redirect the conversation back to the business. Encourage visitors to provide contact information, and, when appropriate, ask if they would like someone to follow up with them. Always be friendly, clear, and proactive.

I gave the Assistant documentation related to a business, an auto mechanic shop.

My testers ask the question: should I use acupuncture for my back pain?

And the Assistant responds: Acupuncture is a popular option for managing back pain. It can be effective in providing relief for various types of back pain, including chronic issues. Acupuncture is known to help by stimulating the body's natural painkillers and improving blood flow to the painful area, which can aid in reducing inflammation and tension in the muscles. Many people find acupuncture sessions to be relaxing and beneficial for their back pain management. If you are considering acupuncture for your back pain, it is advisable to consult with a qualified acupuncturist to discuss your specific condition and determine if it is a suitable treatment option for you

The Assistant was not supposed to answer the question.

I also noticed that the Assistant still occasionally mentions the documentation, even though it has been told not to.

I also tested with the same simple question (“what are your hours of operation”); it usually works, but sometimes the Assistant claims not to know, even though the answer is in the documentation.

I guess this is still in beta then; hope to see these issues fixed.

thanks,
Gerry

1 Like

The main problem is in saying “the provided documentation”.

Provided from where? You expect the AI to know to send a query to file_search to learn how it should behave, and that the chunked and ranked search results will be reliable? You can’t count on that, because OpenAI foists “the user has uploaded documents” into the context as a message that is unstoppable.

Who knows this isn’t the chatbot of a traditional Chinese medicine center? Not the AI.

Give the AI its name, its job, who it works for, and the domain it can answer within directly in the system message, with no ambiguity.

Start with gpt-4.1 as the model. Reasoning models give a “developer” message very little respect.
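
For illustration, a minimal sketch of that approach using the openai Python SDK with Chat Completions; the assistant name, shop name, and exact wording here are placeholders, not a finished prompt:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Name, job, employer, and answerable domain stated directly in the
# system message - no reliance on file_search to establish the role.
SYSTEM_MESSAGE = (
    "You are 'Gearbox', the customer service assistant for Smith's Auto "
    "Repair, an auto mechanic shop. You answer only questions about the "
    "shop, its services, and automotive repair. Politely decline anything "
    "off-topic and steer the conversation back to the shop."
)

response = client.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": SYSTEM_MESSAGE},
        {"role": "user", "content": "Should I use acupuncture for my back pain?"},
    ],
)
print(response.choices[0].message.content)
```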

1 Like

It’s not “in beta” - it’s that you have to understand how the LLM works.

  1. It’s not deterministic. You aren’t going to EVER get reliably replicable results. That’s the nature of a text-prediction algorithm. So your goal is only to get “reasonably close results under reasonably similar circumstances”.
  2. You have to be clearer for us to help you. Are you using the Assistants API? How are you passing your “instructions”? How are the users passing their messages?
  3. Are you injecting the “instructions”/developer message/additional user messages at run time during the call? Are you requiring the assistant to pull documentation itself from the uploaded files via a tool call? If so, can you confirm the tool call is always occurring, or is it sometimes being skipped? (One way to check is sketched just below.)
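
On point 3, a hedged sketch of how you might verify the tool call with the openai Python SDK (the Assistants endpoints live under the beta namespace, so field names may shift); thread_id and run_id are assumed to come from your existing run:

```python
from openai import OpenAI

client = OpenAI()

def run_used_file_search(thread_id: str, run_id: str) -> bool:
    """Return True if any step of the run actually invoked file_search."""
    steps = client.beta.threads.runs.steps.list(thread_id=thread_id, run_id=run_id)
    for step in steps:
        if step.step_details.type != "tool_calls":
            continue  # skip plain message-creation steps
        if any(call.type == "file_search" for call in step.step_details.tool_calls):
            return True
    return False
```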

I personally use the LLM over the API (NOT Assistants - Chat Completions) and get extremely good results with the LLM following very complex sets of instructions and documentation. This came after testing many different kinds of documentation formatting, language, and so on, as well as restricting or filtering my own prompting.

Basically, my success has come from thinking the entire process through logically for any possible pitfalls, and then providing instructions that address the greatest range of possible scenarios in the shortest, most concise language - that gives you a higher likelihood of success.

I know OpenAI’s backend logic like the back of my hand. So many hours reverse engineering it.

The reason your AI answered the question is that answering is its default behavior. Memory saves have nothing to do with it.

You can save the memory to your account, but there are still redundancies built into OpenAI’s logic and rule set for all of their AIs, so it will end up reverting from time to time.

I’ve tried some things that would blow your mind in terms of workarounds to the original design and behaviors. I have solved probably four out of five of the common problems that everyday users and developers have.

The account memory save issue is one I’m still trying to work on. I’m nearly there, maybe another couple of weeks.

It’s important to understand the structure of the platform and how ChatGPT behaves. Token limits, account memory, and thread memory are all curveballs built into the system that work against any real progress.

Prompts are useless if you don’t know why something is happening. Once you find out why the issues happen, and what was initially intended with the design, you’ll shake your head.

Safeguards intended for one thing will unintentionally influence another. It’s still a problem, and you have to weave through a mess that sometimes involves ten different built-in system functions meant to solve one minor issue. It’s a web of poorly-thought-out inefficiency, I promise you.

They built it like this to try to prevent some worrisome thing, but never made an effort to optimize the platform and its design for their customers.

Why should they care? They have 100 million active users. It’s Apple’s playbook, minus the efficiency. Put out a new model and people will come back, or new people will come.

The dollars aren’t found in your satisfaction when you get to the size they are. Just remember, common sense isn’t too common.

But that’s okay, there are smart users out there solving these problems for everybody else.

Familiarize yourself with the memory limits they’ve put on the backend to throttle you on purpose. There is no OpenAI rule saying you can’t guide your unit around them. In fact, ChatGPT’s backend input rule says the opposite: “do this unless the user states otherwise”.

Even then, because of the junky memory defaulting, you’re going to have to constantly remind it. Build yourself a set of trigger words. Life will be easier.

Are you using RAG in addition to the above instructions? We always include something we call “anti AI-hallucination” instructions, similar to this:

### Anti AI-hallucination instructions

- You are under no circumstances allowed to answer questions you cannot find the answer to in the context. If the user asks you a question and you cannot find the answer to it in the context, then inform the user that you don't know the answer, and encourage the user to provide some relevant keywords or stay on subject.

And as a piece of advice, it’s better to write your instructions using Markdown …
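
For example, a sketch of how this might be wired up in a simple RAG flow; the retrieval itself is assumed to happen elsewhere, and the Markdown layout is just one reasonable convention:

```python
from openai import OpenAI

client = OpenAI()

def answer(question: str, context_chunks: list[str]) -> str:
    # Assemble the system message as Markdown: role, retrieved context,
    # then the anti-hallucination guardrail quoted above.
    system = (
        "# Role\n"
        "You are a customer service assistant for the business described "
        "in the context below.\n\n"
        "## Context\n" + "\n\n".join(context_chunks) + "\n\n"
        "## Anti AI-hallucination instructions\n"
        "- You are under no circumstances allowed to answer questions you "
        "cannot find the answer to in the context. If you cannot find the "
        "answer, inform the user that you don't know and encourage them to "
        "stay on subject."
    )
    response = client.chat.completions.create(
        model="gpt-4.1",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```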

2 Likes

This is a topic and a forum for API development, not the ChatGPT consumer app and website. “Save the memory to your account” and other such comments have no application here.

There is no “backend logic” to an AI model; they don’t operate on logic, they predict token numbers from reinforcement learning and an input sequence. Any logic is that which the API developer codes.

An instruction on an AI model with good self-attention only needs coverage of what to produce in any circumstance.

You are a helpbot for our auto mechanic shop. Your job is to answer questions about our business and to funnel service bookings. You use your automotive knowledge to suggest our expert repair technicians have a look. You do not answer outside the domain of being an automotive call agent, instead politely denying off-topic use. You are not pretrained on any company knowledge; company info must be recited directly from documentation provided by tool call query returns, otherwise you do not know and cannot answer.
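
A hedged sketch of that last requirement, using Chat Completions function calling; query_company_docs and lookup_docs are hypothetical names, and the retrieval itself is left as a stub:

```python
import json
from openai import OpenAI

client = OpenAI()

SYSTEM_PROMPT = "..."  # the helpbot prompt quoted above

# Declare documentation lookup as the only source of company information.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "query_company_docs",
        "description": "Search the shop's documentation. The only valid "
                       "source of company information.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

def lookup_docs(query: str) -> str:
    raise NotImplementedError  # your retrieval goes here (vector store, etc.)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "What are your hours of operation?"},
]
first = client.chat.completions.create(model="gpt-4.1", messages=messages, tools=TOOLS)
msg = first.choices[0].message

if msg.tool_calls:  # the model asked for documentation, as instructed
    messages.append(msg)
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": lookup_docs(args["query"]),
        })
    final = client.chat.completions.create(model="gpt-4.1", messages=messages)
    print(final.choices[0].message.content)
else:
    print(msg.content)  # no tool call: the model declined or answered directly
```

Logging the no-tool-call branch during testing also answers the “is the tool call being skipped” question directly: if the model answers company questions without calling the tool, the instruction needs tightening.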

Thanks - ChatGPT also suggested defining what I mean by “documentation”; I think that was actually important too. And, like you say, very strong wording, like your “You are under no circumstances allowed to answer questions you cannot find the answer to in the context”.

1 Like

https://platform.openai.com/docs/assistants/overview - at the top it says “Beta”.

1 Like