Best approach to build a chatbot for specific content

What would be the best approach, in terms of model, API, and fine-tuning or other techniques, to use OpenAI for the following use case?
A chatbot embedded in an existing SaaS that answers specific questions about the customer's data.
For example:
User: can you share the logs of my account logins?
Answer: [the relevant log]

User: what is my subscription plan?
Answer: your subscription plan is basic, $20/month

etc…

My follow-up question: how can I tailor the model to read customer data in a generic way, and then adapt it for each specific customer?

Thanks!

So you have some tables, e.g. logs and users_subscriptionplans.

I think there are many ways to do it. Here is one idea:

User: what is my subscription plan?
Answer: your subscription plan is basic, $20/month

  1. You could fine-tune a model like davinci-003 or the new turbo model to do something like this:

Example: what is my subscription plan?
Output: users_subscriptionplans

Example: what kind of enrollment do I have?
Output: users_subscriptionplans

…basically, you would map each question to a table.

  2. Once you get back the answer, ‘users_subscriptionplans’, and since you know which user is logged in, you can look up that user's subscription plan in the table and return the content to the user.
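
The two steps above can be sketched roughly as follows. This is a minimal sketch, not the poster's actual implementation: `classify_table` is a hypothetical stand-in for the fine-tuned model call (here a keyword lookup so the example runs offline), and the table and column names are assumed for illustration. The point is that the model only picks the table, while the application scopes the query to the logged-in user.

```python
import sqlite3

# Whitelist of tables the chatbot may read, with the columns to return.
# Table and column names are assumptions for illustration.
ALLOWED_TABLES = {
    "users_subscriptionplans": ("plan", "price_usd"),
    "logs": ("logged_at", "ip"),
}

def classify_table(question: str) -> str:
    """Hypothetical stand-in for the fine-tuned model: question -> table name."""
    q = question.lower()
    if "plan" in q or "enrollment" in q:
        return "users_subscriptionplans"
    return "logs"

def fetch_rows(conn, table, user_id):
    # Never interpolate model output into SQL without checking it first.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table: {table}")
    columns = ", ".join(ALLOWED_TABLES[table])
    # The user id comes from the session, not from the model or the question.
    cur = conn.execute(f"SELECT {columns} FROM {table} WHERE user_id = ?", (user_id,))
    return cur.fetchall()

def answer(conn, question, user_id):
    return fetch_rows(conn, classify_table(question), user_id)

# Small in-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users_subscriptionplans (user_id INTEGER, plan TEXT, price_usd INTEGER)")
conn.execute("CREATE TABLE logs (user_id INTEGER, logged_at TEXT, ip TEXT)")
conn.execute("INSERT INTO users_subscriptionplans VALUES (1, 'basic', 20), (2, 'pro', 50)")
conn.execute("INSERT INTO logs VALUES (1, '2023-03-01T10:00:00', '10.0.0.1')")

rows = answer(conn, "what is my subscription plan?", user_id=1)
print(rows)
```

Checking the table name against a whitelist matters here because the model's output ends up in the SQL text; the user id is passed as a bound parameter instead.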

You cannot do this with fine-tuning alone. You need to integrate search and retrieval as well.
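
One way to picture "retrieval" here: the application fetches the user's rows first and pastes them into the prompt, so the model answers from that context rather than from its weights. A minimal sketch, assuming hypothetical row data and a hypothetical `build_prompt` helper; the actual model call is left out.

```python
def build_prompt(question: str, rows: list[dict]) -> str:
    """Format retrieved rows as context ahead of the user's question."""
    context = "\n".join(
        ", ".join(f"{key}={value}" for key, value in row.items()) for row in rows
    )
    return (
        "Answer the question using only the account data below.\n"
        "If the data does not contain the answer, say so.\n\n"
        f"Account data:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

# Hypothetical rows, as the application might fetch them for the logged-in user.
rows = [{"plan": "basic", "price_usd": 20}]
prompt = build_prompt("what is my subscription plan?", rows)
print(prompt)
```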

Turbo is not yet available for fine-tuning, and as far as I understand most of this would need to go through embeddings. I believe that is similar to what you have explained, but I am not sure whether you are using standard SQL queries to look up data in the table or doing it a different way.

Perhaps you didn’t understand the idea. I will post an example below. Also, if you have another way, it could be useful to provide it.

Ok so I’ve attached a picture so you can understand the flow.

I’ve applied this technique in real life: it lets me convert speech to text and programmatically create database entries with structured data.

I am attaching a playground example as well: OpenAI API

@pinardalec thanks, that's a great example.
To expand on this use case: how would you fine-tune the model so that it can answer based on entries in the database and not just the name of the variable? For example:
User: is there any user on my account under the name “John”?
Answer: there are 2 users under the name John, John Cena and John Travolta

User: did someone log in to my account in the past week?
Answer: Yes, total 5 login attempts in the last week

User: can you elaborate?
Answer: Sure, here is the log of all logins last week:
Time1 User1
Time2 User2
etc…

And in general, would you describe this approach as "fine-tuning a davinci model"? The part I am missing is how to feed the model with the data (such as logins and users) so that it can answer questions based on that data.

Hi,

I don’t have much time, so I drafted something quickly to give you an idea that you can improve on.

PREPARATION

  1. Open an account on Pinecone.io and create an index to store some information.
  2. Call the OpenAI ada-002 embedding model with the prompt of your model (e.g. ‘log’).
  3. Store the returned embedding (a vector) in your Pinecone index, including metadata (table + columns).
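
The preparation steps can be sketched as below. To keep the example runnable offline, it swaps in an in-memory list for the Pinecone index and a toy letter-count `embed` function for the ada-002 embedding call; both are stand-ins, not the real APIs, and the table and column metadata are assumed for illustration. The cosine-similarity lookup at the end is roughly the kind of query a vector index would serve.

```python
import math

def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: counts of each letter a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# In-memory stand-in for the vector index: (vector, metadata) pairs.
index = []

def upsert(prompt, table, columns):
    # Step 2: embed the prompt. Step 3: store the vector with its metadata.
    index.append((embed(prompt), {"prompt": prompt, "table": table, "columns": columns}))

def query(question):
    """Return the metadata of the stored prompt closest to the question."""
    q = embed(question)
    return max(index, key=lambda item: cosine(q, item[0]))[1]

upsert("log", "logs", ["logged_at", "ip"])
upsert("subscription plan", "users_subscriptionplans", ["plan", "price_usd"])

match = query("subscription plans")
print(match["table"], match["columns"])
```

The toy embedding only compares spelling, so it will not match paraphrases; a real embedding model is what would let differently worded questions land near the right stored prompt.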

PRODUCTION

Snippet playground: OpenAI API