I need help buildning a chatbot on my own data

pontus.lennartsson · August 24, 2023, 1:36pm

Hi!

Im buildning a chat bot that should answer question about articles.
So im finetuning the api with all the articles?
How should my fine tune json look like? I want the bot to only answer questions from data in the articles nothing else.

novaphil · August 24, 2023, 1:41pm

Chat over your own data is best handled with embeddings and a vector database. Fine-tuning is better used for changing the structure or way that GPT response.

See the Embeddings Guide and cookbooks for how to get started.

pontus.lennartsson · August 24, 2023, 1:50pm

There are a lot of articles/documents in my database.
I dont want to send the whole database everytime someone ask a question about any articles?

novaphil · August 24, 2023, 1:56pm

You don’t send for every request. First you go through and create chunks of your articles and create embeddings for each chunk. Those are stored in your database or in a vector database (with metadata of what article they belong to).

On user query, you create embedding of the query, run similarity search on your vector database (this is built-in function on any common vector database), and add the resulting chunks to your GPT prompt.

If an article is ever updated, you generate new embeddings and replace them in the vector database.

A vector database isn’t strictly required, you could store the embeddings in the database you currently use and write your own code to do the similarity search, or see if your database has a vector extension/plugin.

paul.redcell · August 24, 2023, 9:42pm

Hi Pontus,
Novaphil is exactly right. Read through the links he posted above and you will be on your way. Once you have the embeddings down and understand them, if your bot isn’t replying in the ‘manner’ you want, like the wrong tone or you want more on-point responses, then look to fine-tuning with the questions and answers from your documents.

You will have to send all the questions and responses back to the API each time but just for the current conversation. There are ways to handle that too if it gets to large.

Good luck and have fun with it.
Paul

Topic		Replies	Views
How do I make a client chatbot using fine-tuning with gpt-3? API fine-tuning , api	4	3564	December 17, 2023
What's better for the type of chatbot I am building? Fine tune or embedding? Community chatgpt , api	10	2260	August 20, 2023
Finu-tuning on my website data API fine-tuning	7	1740	October 17, 2023
Fine-tuning with 3.5 turbo or gpt4 API gpt-4	8	3008	May 24, 2023
Fine Tuning a Chatbot to provide answers from a specific dataset API	6	4128	December 17, 2023

I need help buildning a chatbot on my own data

Related topics