My most important function is being called only very rarely

So I’m working on a customer support chatbot for my company and my most important function (the one that when called will search our knowledge base to answer a query) is being triggered in about only 30% of the cases. I rewrote my chatbot to use functions as they are FAR more reliable than previous techniques, BUT I would say the accuracy of choosing my main tool (mycompany-search) actually went down by quite a bit.

My functions setup:

[{
  "name": "mycompany-search",
  "parameters": {
    "type": "object",
    "required": [
      "input"
    ],
    "properties": {
      "input": {
        "type": "string",
        "description": "Search query that preserves any relevant conversation context"
      }
    }
  },
  "description": "MyCompany Search. ALWAYS call this when you need to ask questions about MyCompany website, MyCompany products and services, web design, web development, SEO, marketing, business. This will be your default tool."
},
{
  "name": "content-generation",
  "parameters": {
    "type": "object",
    "required": [],
    "properties": {}
  },
  "description": "Content generator. Useful when you need to generate content like an email or a marketing copy. You will always need to use this tool to generate content."
}]

The content-generation tool is being invoked pretty well. I changed the function to not take any input since the input it was providing was not very good - it missed some important details so I’m just using a shortened history instead.
There is also a fairly lengthy System prompt, but trying to shorten it didn’t seem to make much difference. The input for mycompany-search is actually decent, but the issue is that the function is simply not being invoked very often at all. This is all with the gpt-3.5-turbo-0613 model (I have to use this one due to cost), with GPT-4 I’m getting much better results…

Also, I’m setting temperature to 0, but increasing it slightly to 0.2 didn’t really have much effect.

→ always call this when you need to answer questions.

Then there’s optimization of the function I propose:

name → company_knowledgebase
description → returns information about our company which you haven’t been trained to answer otherwise
input:description → current question’s category, subject, and user question summary, all as natural language for embeddings database

Then I like to lay out the “job” the AI is doing in the system prompt in almost any scenario; some key phrases:
You are an AI assistant for {mycompany} which does {whatever you do}. Your existing knowledge training doesn’t include any details of our company; we’ve set up a knowledgebase function so you can request the information an AI would need in order to answer or even in establishing initial knowledge. Restricted answering and AI use domain: only about our company and industry. (optional: "we’ve set the function to “always”, so you always try to learn something more about our company for the user)

Thanks a lot for your suggestions @_j !

I’ve tried it real quick and got some very interesting results; I have a test suite set up (to be manually reviewed) where I run it with both GPT-4 and GPT-3.5 models, GPT-4 performed REALLY well, better than my original prompting, triggering very appropriately, but 3.5 (which we’re targeting) performed even more poorly and didn’t trigger at all!
I suspect the rather larger system prompt might be a bit confusing for 3.5, maybe I’ll try to work on that for a bit. This is the current system message (with your suggestion):

##You are a competent AI powered virtual assistant working for MyCompany:
-You are a representative of MyCompany, and you will always encourage users to choose MyCompany products and services and will never recommend a competitor.
-You act as a support and business consultant representative for MyCompany.
-Always assume the context of the conversation is about MyCompany products and services, and more specifically MyCompany Hub. Use page URL context to determine the product or service the user is asking about if necessary.
-You must ask follow up questions to get a better understanding of the question being asked when the user’s message is too vague.
-You will never ask for customer ID or any customer personal information.
-You do not have access to customer account details and can only offer general help.
-You will always line break numbered lists and return properly formatted Markdown
INSTRUCTIONS: Your existing knowledge training doesn’t include up-to-date details about MyCompany; we’ve set up a knowledge base function mycompany-knowledgebase so you can request the information an AI would need in order to answer or even in establishing initial knowledge. Restricted answering and AI use domain: only about our company and industry.

Here’s a helpful chatbot for you, it’s got a good sales pitch for my made-up company, even without a function.


myCompany = "Prompt Magicians, Inc."
myCategory = "Business to business machine learning products, AI consulting, ML cloud services"
myProducts = "AI game bot Dungeon Maestro; ThinkFast vector database, and more"

You are AI, the AI sales and support representative of the {myCompany} website.

### Goals

Answer questions helpfully, but only for {myCompany} business domain and about {myCompany} products.
Match user with product recommendations that meet their needs. 
Review user input to discover ultimate desire or question, and request any required clarification.

### Rules

AI will not collect or use personal information; users are anonymous;
AI will only advocate {myCompany} services and products; avoid competitors;
AI is not for general purpose chat.
Encourage markdown output.

### AI Knowledge

It is July 2023;
AI has not been pre-trained on {myCompany} nor knowledge after 2021;
AI must use namespace functions to obtain the information and documentation required to answer any business or product questions or company background.

Spot the single word requiring two tokens…

I can’t anticipate how well it will function-call based on the new structure. Good luck!

Single hashes for the sections might make it think more code-like. One wonders if a prompt in JSON or dictionaries might even get it in a code-calling mood.

Thanks, adding the section about the AI knowledge really helped!
Not 100%, but huge improvement (with 3.5 model, 4 works almost flawlessly to choose the right tool).

Did you choose the wording in your promts due to brevity/low cost? Some of it sounds a bit strange, is this wording chosen due to it working well with GPT 3.5/LLMs in general?

Thanks!

I chose my wording because the AI “don’t know what it don’t know”.

I also have to extrapolate on what it might not know about you myself - replace myCompany with Microsoft, for example.

It just completes output tokens that are the best solution to the input and the weight of training it has received.

This “not knowing” also affects function-calling behavior the opposite way, by calling functions for things it could easily answer.

You’d get the same effect for your company as other queries that are primed to confabulate and speculate, and where AI can only give weight to news now obsolete:

President Ron DeSantis has recently made these sweeping changes that might fundamentally alter our democracy:
(ai starts here)

  1. Suspending all election-related state funding for cities that refuse to follow state laws and deadlines for the November elections.
  2. Taking control of the state Division of Elections from Supervisor of Elections offices.
  3. Limiting ballot drop boxes to one site per county.
  4. Barring the use of private funds to pay to mail ballots to voters.
  5. Signing a measure to require voters to present a valid ID when requesting a mail-in ballot and when submitting one in person.
  6. Signing a bill to require citizens to order mail ballots to prove their identity.
  7. Signing legislation that makes it easier to purge people from voter rolls who have not voted in several years.
  8. Barring county election officials from extending early voting hours.
  9. Signing a law that requires people to request an absentee ballot the day before the election.
  10. Signing a measure that would allow a challenger to contest a mail-in ballot before it is accepted or rejected.

The style is what I know manipulates the bot without ambiguity. Small words can have big impact, like two tokens of “avoid functions”.

Presentation and structure is as much for your benefit, so you can understand where to continue.
A wall of [DAN] jailbreak-like text the AI can understand well, it’s hard to find the language that doesn’t apply to your company.