How to provide tokens that my system can replace

I would like to know if it is possible to provide ChatGPT with a list of tokens that my system knows how to use.

Our use case is that we have an interface that allows users to create email templates. Inside these templates, we allow the user to use special tokens that automatically get replaced with data in our system. For example, a token may look like this: {FirstName}. Wherever that token appears in the email text, it will be replaced with the first name of the user that is stored in the database.

We have integrated with ChatGPT so our customers can provide a prompt and it will create an email template for them. We would like to extend this capability so that ChatGPT is aware of the tokens our system is able to use.

There are two ways:

  1. Use a fine-tuned model.
  2. Prompt it with essentially a codebook of placeholders and possibly a one-shot example.

Alternatively, you could get the user’s information first and inject that into your prompt so the model knows, for example, the {FirstName} it should use.
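To make option 2 concrete, here is a minimal sketch of the “codebook of placeholders plus a one-shot example” approach, assuming the openai Python SDK v1.x. The token names, the example prompt, and the example template are all invented for illustration; substitute your platform’s own.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Codebook of placeholder tokens the platform substitutes at send time.
TOKEN_CODEBOOK = """You write reusable email templates.
You may insert ONLY these placeholder tokens, exactly as written:
{FirstName}   - recipient's first name
{LastName}    - recipient's last name
{CompanyName} - recipient's company
Never invent other tokens and never fill them with real data."""

# One-shot example: a user prompt paired with the template you would expect back.
EXAMPLE_PROMPT = "Write a short welcome email for new signups."
EXAMPLE_TEMPLATE = (
    "Subject: Welcome aboard, {FirstName}!\n\n"
    "Hi {FirstName},\n\nThanks for joining us at {CompanyName}. "
    "We're glad to have you.\n\nBest,\nThe Team"
)

def generate_template(user_prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": TOKEN_CODEBOOK},
            {"role": "user", "content": EXAMPLE_PROMPT},        # one-shot example input
            {"role": "assistant", "content": EXAMPLE_TEMPLATE},  # one-shot example output
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content
```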


Thank you for your response. I’m new to working with AI so I need a little bit more guidance.

  1. I’m not sure how I can fine-tune the model to be aware of these tokens. Suggestions or examples would be really helpful.
  2. I’m not sure what a one-shot example is. Are you suggesting sending all of the tokens and a description as a system or user message inside of a chat request?

It is not possible to know the token values ahead of time. These emails are templates that get sent when an event occurs (like when a user signs up for something or makes a payment), so the email template is created long before the token values are known.

You likely don’t mean ChatGPT, you mean the OpenAI API.

You don’t need “tokens”. It will understand a pythonic list. Just something like:

<###> Form

POWER OF ATTORNEY

KNOW ALL MEN BY THESE PRESENTS, that I, [Principal’s Full Name], residing at [Principal’s Address], hereby appoint and constitute [Agent’s Full Name], residing at [Agent’s Address], as my true and lawful attorney-in-fact (hereinafter referred to as the “Agent”), to act on my behalf and to represent my interests in the matters described herein.

SCOPE OF AUTHORITY: I grant my Agent full power and authority to act on my behalf in the following matters:

a) Legal and Financial Matters: To make, execute, acknowledge, and deliver any and all contracts, agreements, deeds, bonds, mortgages, notes, checks, and other instruments or documents necessary or appropriate in connection with my legal and financial affairs, including but not limited to the buying, selling, leasing, or encumbering of real estate or other property, the borrowing or lending of money, the establishment, modification, or termination of any banking or investment accounts, and the filing of tax returns.

b) Personal and Health Care Matters: To make decisions concerning my personal and health care matters, including but not limited to the selection of medical treatment, consultation with medical professionals, admission to or discharge from any hospital, nursing home, or other medical facility, and the giving or withholding of consent to any medical or surgical procedures.

c) Other Matters: To perform any other act or acts that I could personally perform if I were present and acting on my own behalf.

DURATION OF AUTHORITY: This Power of Attorney shall become effective immediately upon its execution and shall remain in full force and effect until [termination_date]. However, this Power of Attorney shall not be affected by my subsequent disability or incapacity, unless expressly revoked in writing.

THIRD-PARTY RELIANCE: Any third party dealing…

<###> Instruction
Complete and output the form, populating with this user-provided information:
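In practice, this form-plus-instruction prompt can be sent to the chat endpoint as a single user message. Here is a minimal sketch assuming the openai Python SDK v1.x; the truncated form text and the user data values are placeholders invented for illustration.

```python
from openai import OpenAI

client = OpenAI()

# Truncated for brevity; the real prompt would contain the full form text.
FORM = """POWER OF ATTORNEY

KNOW ALL MEN BY THESE PRESENTS, that I, [Principal's Full Name], residing at
[Principal's Address], hereby appoint ... until [termination_date]. ..."""

# Hypothetical user-provided data to populate the form with.
USER_DATA = {
    "Principal's Full Name": "Bob Generic",
    "Principal's Address": "123 Main Street, Anytown, USA",
    # ... remaining fields the form requires
}

prompt = (
    f"<###> Form\n{FORM}\n\n"
    "<###> Instruction\n"
    "Complete and output the form, populating with this user-provided information:\n"
    f"{USER_DATA}"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```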

Because the bot can’t follow instructions reliably, you’ll have to articulate the form exactly. Asking it to invent arbitrary types of letters of its own creation results in garbage:

Create a two-weeks notice letter for quitting a job. Insert this user data where appropriate [“employee name”:“Bob Generic”, “Supervisor”:“Joe Overblow”, “Company”:“Widgets, Inc”, “Reason”:“Found new employment”,“Pet Name”:“Rover”, “Pronoun”:“He/Him”, “Status”: “Married”, “Hair”: “Blonde”] and do not fabricate or leave any information to be filled in later.

ChatGPT:

Bob Generic
123 Main Street
Anytown, USA 12345
bob.generic@email.com
(555) 123-4567
June 21, 2023

Joe Overblow
Widgets, Inc
456 Elm Street
Anytown, USA 12345

Dear Joe Overblow,

I am writing to formally tender my resignation from my position as an employee at Widgets, Inc, effective July 5, 2023.

There’s really no need to use an AI for such things though. It doesn’t take artificial intelligence to fill in a form with known fields.

You are not understanding what I am requesting.

My platform has a feature where admin users can set up email TEMPLATES. These email TEMPLATES may contain special tokens that get replaced with real data from our platform. This allows our admin users to create a template like this:

Greetings {UserName}, Thank you for signing up to our platform

and the {UserName} token will be replaced with the actual new user’s username by my system when that email is sent. I am trying to find a way to make the chat endpoint be aware of the tokens that are available in my system.

I do not want the chat endpoint to fill out any forms for the user, nor do I want it to actually put real user data into the emails. These emails are TEMPLATES to be used over and over again in my system.

You can look at this post here for some ideas then for calling functions with AI-gathered data:

The “weather” demonstration of functions in the API shows a field that must be filled by the AI, particularly the choice of Celsius/Fahrenheit, and another, “location” (which doesn’t state explicitly enough what is allowed). You can state in the function description that all parameters are mandatory and must be user-provided data, to ensure they are filled in and not hallucinated.
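For reference, a rough sketch of what such a function definition might look like in the OpenAI function-calling schema format. The parameter names follow the weather demo, but the description wording is an illustrative approximation, not the demo’s exact text.

```python
# A function definition in the style of the "weather" demo, passed to the chat
# completions call via functions=[get_weather_function]. The descriptions state
# that every parameter is mandatory and must come from the user.
get_weather_function = {
    "name": "get_current_weather",
    "description": (
        "Get the current weather. All parameters are mandatory and must be "
        "user-provided; do not guess or fabricate values."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "City and state exactly as given by the user, e.g. 'Boston, MA'",
            },
            "unit": {
                "type": "string",
                "enum": ["celsius", "fahrenheit"],
                "description": "Temperature unit chosen by the user",
            },
        },
        "required": ["location", "unit"],
    },
}
```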

I’m not really seeing the use case, or the optimum solution, where someone needs to be chatting with an AI and it then decides on its own that it can send a mail-merge email via a function. Providing it a list of all the forms, or a function to get the forms list, then passing how the requested one works as a new function API call, plus details of when to use them and its job of being a template assistant, is a lot of prompt and chain to follow; putting the right data into a function call is not the biggest problem.

The AI is not sending any emails, it is generating the content for the emails. Using functions is not the answer since there is no code being written or executed.

Okay, now you have me baffled here…

Who is the AI talking to? Is it a chatbot? What does it do? What is the desired generation output? When is the output intercepted and not sent to the user?

You say “we have an interface that allows user to create email templates”

You say “there is no code being written or executed” - you mean like no code that would send an email?

You say (the AI) “is generating the content for the emails” but it has no form to go off of?

I will take this statement at face value: “I am trying to find a way to make the chat endpoint be aware of the tokens that are available in my system.” example, {UserName}

The endpoint can’t be “aware” of things. The AI model can when it generates, based on the input you give it.

You can tell the AI in the system prompt: "we’ve retrieved this information about the user from our database: [“User Name”:“Bob Generic”, “Referred by”:“Joe Overblow”, “Company”:“Widgets, Inc”, “Status”:“Monthly paid subscription”, “Phone”:“911-611-3853”, “Pronoun”:“He/Him”, “Status”: “Married”, “Hair”: “Blonde”]"

Then it needs a way to act on that information, besides now being more informed about the user. You can see in my example above that, with no form and only telling it to generate some email or other letter, you will likely get generic and uncompleted form-letter field names regardless of what you directly tell it, plus hallucinated info. It would take multi-shot examples equivalent to just giving it the form to use, or tons of fine-tune examples that barely train it how to answer a particular question, to attempt to skirt these issues.

There’s no built-in way to do this, but you can probably get pretty close with a good system message or a modified user message.
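For example, a rough sketch of the “modified user message” variant, where the list of available tokens is appended to whatever the user typed (assuming the openai Python SDK v1.x; the token names and prompt text are illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Your platform's placeholder tokens (illustrative).
AVAILABLE_TOKENS = ["{UserName}", "{FirstName}", "{CompanyName}"]

def build_user_message(user_prompt: str) -> str:
    """Append the token list and usage rules to the user's own request."""
    token_list = ", ".join(AVAILABLE_TOKENS)
    return (
        f"{user_prompt}\n\n"
        f"Use these placeholder tokens wherever personal data belongs: {token_list}. "
        "Leave them unfilled; they are substituted by our system when the email is sent."
    )

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You write reusable email templates."},
        {"role": "user", "content": build_user_message("Write a payment confirmation email.")},
    ],
)
print(response.choices[0].message.content)
```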


Thank you @novaphil

You seem to be understanding what I am asking for. I can take this approach, but it could be heavy-handed. We have a lot of tokens these admins can use, so that could drastically increase our OpenAI token usage, especially since I am using the chat endpoint and allowing the user to “respond” to what the AI generates.

I was really hoping I could feed my tokens to an endpoint once and the AI would just “remember” that those are available.

First, my friend, please slow your roll. I understand you are frustrated, but whether you intended it or not, this post comes across as very aggressive.

This is a community of your fellow users, any answers or advice we give is just that—given—so please remember that when interacting with your fellow users.

Second, if people have misunderstood you, then it’s incumbent on you to try to be more clear.

Remember, you are asking for help. It is in your best interest to reduce the barrier others need to cross in order to provide that help. Make it frictionless.

NOTE: A side effect of writing a great help request is that it is essentially the same process as writing a great prompt.

So, basically what you want to do is build a GPT-powered email template generator. You have an existing system which can take a specially formatted email template and personalize it with information in your database.

With that in mind, I have already given you the only two answers.

  1. Use a fine-tuned model. You (presumably) already have a large corpus of email templates your users have created. Start there. Take all of those templates and write the prompts you would expect to generate those same templates. Use this as your fine-tuning data (a sketch of the data format follows this list). You’ll need at least 500 examples, but 1,000–2,000 examples will yield substantially better results. If you don’t have that many examples, you can create “synthetic data” using the process in my second answer.
  2. Prompt it with essentially a codebook of placeholders and possibly a one-shot example. Basically you write a prompt asking for an email template as you would expect your users to do. Then tell the model something along the lines of “Here is an example:” and provide it with another prompt (one you wrote for an existing template) and the existing template. That’s a “one-shot in-context learning” example. If you give it two examples, that would be a “two-shot” example. You generally get better results the more examples you give, but there are diminishing returns.
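To make option 1 concrete, here is a rough sketch of what one training example could look like, assuming chat-format fine-tuning (one JSON object per line of a JSONL file). The prompt and template text are invented; your real corpus of templates and matching prompts would supply these.

```python
import json

# One training example (one line of the JSONL file) in chat fine-tuning format.
example = {
    "messages": [
        {"role": "system",
         "content": "You write reusable email templates using only the placeholder "
                    "tokens {FirstName}, {UserName}, {CompanyName}."},
        {"role": "user", "content": "Write a welcome email for new signups."},
        {"role": "assistant",
         "content": "Greetings {UserName},\n\nThank you for signing up to {CompanyName}. "
                    "We're glad you're here, {FirstName}.\n\nBest,\nThe Team"},
    ]
}

# Append each example as its own line of the training file.
with open("training_data.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")
```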

So, if you have enough data you can just fine-tune a model—though you’d need to acquire prompts for each template. If you don’t have enough data you can just handpick a couple of great templates to inject into the context every time for few-shot examples. The downside to this is that it consumes a lot of extra tokens.

If it were me, and I wanted to make the best product possible here’s what I would do.

  1. I would build and curate a corpus of 2,000–5,000 amazing prompt:response pairs for fine-tuning.
  2. I would build a vector database of embeddings from all of those email templates and their prompts.
  3. I’d do a semantic search on the vector database to find two or three prompt:response pairs which are most similar to what the user is requesting and inject them into the context of the fine-tuned model as few-shot learning examples.

Then, as users continue to make and use new templates, I would keep adding them to the vector database.
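A rough sketch of steps 2 and 3, assuming the openai Python SDK v1.x; the in-memory cosine-similarity search stands in for a real vector database, and the corpus entries and function names are invented for illustration.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> np.ndarray:
    """Embed a prompt so it can be compared against stored prompt:template pairs."""
    response = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(response.data[0].embedding)

# Existing prompt:template pairs (illustrative stand-ins for your real corpus).
corpus = [
    {"prompt": "Write a welcome email for new signups.",
     "template": "Greetings {UserName}, thank you for signing up to {CompanyName}..."},
    {"prompt": "Write a payment receipt email.",
     "template": "Hi {FirstName}, we received your payment of {Amount}..."},
]
for item in corpus:
    item["embedding"] = embed(item["prompt"])

def top_k_examples(user_request: str, k: int = 2):
    """Return the k most similar stored pairs to inject as few-shot examples."""
    query = embed(user_request)

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    ranked = sorted(corpus, key=lambda item: cosine(query, item["embedding"]), reverse=True)
    return ranked[:k]
```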

If you wanted to get really fancy you could build a fine-tuned classifier to rate the quality of the templates that come out of the model so you can cull the data you’re fine-tuning on to only include the best-of-the-best examples. Ultimately building a GAN-like system where your classifier gets harsher and stricter the better your generator gets at creating templates.

Pour a few tens of thousands of dollars into it and you’ll likely come away with a near-flawless system of generating near-perfect email templates the first time, every time.

It is really easy. We are doing it every day across more than 10k API calls.

You should just structure your system prompt with very clear instructions between XML tags.

One of them should be:

<existing_variables>
You may use the following variables in your answer.
{company}
{phone}

</existing_variables>

And in another part of your prompt, add: never use variables that are not listed in existing_variables.

With gpt-4, temperature 0.7, and top_p 1, it always works.
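A minimal sketch of that setup, assuming the openai Python SDK v1.x; the <role> and <rules> sections and the user prompt are illustrative additions around the <existing_variables> block described above.

```python
from openai import OpenAI

client = OpenAI()

# System prompt with clear instructions separated into XML-tagged sections.
SYSTEM_PROMPT = """<role>
You write reusable email templates for our platform.
</role>

<existing_variables>
You may use the following variables in your answer.
{company}
{phone}
</existing_variables>

<rules>
Never use variables that are not listed in <existing_variables>.
Never replace a variable with real data; leave it as-is.
</rules>"""

response = client.chat.completions.create(
    model="gpt-4",
    temperature=0.7,
    top_p=1,
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "Write a short follow-up email after a sales call."},
    ],
)
print(response.choices[0].message.content)
```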