Specialized Chatbot with GPT-3

Hello,

I’d like to make a chatbot that specializes in a specific area (or, in other words, I want ChatGPT to act like an expert on something).

Needs:
1- The chatbot is going to have a name.
2- The chatbot is going to be restricted to a specific area. For example, the chatbot is about cars, so if I ask a question about coffee prices at Starbucks, it is not supposed to answer.
3- The chatbot is going to ask follow-up questions about the context.

I am planning to use LangChain and ChatGPT.

My Plan:
1- Apply embeddings to PDF documents.
2- Start with a greeting message.
3- Get a question from a user. Then, embed the question and check it against my PDF embeddings. Get the first 2-3 results and compare the similarity scores against a threshold (the purpose here is checking whether the question is about my documents’ context).
4- If the question is within the embedding context, pass it to the LLM (ChatGPT). If not, respond with a message like “Sorry, your question is out of my context.”
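Steps 3 and 4 above can be sketched in plain Python. This is only a minimal illustration with toy two-dimensional vectors; real embeddings have hundreds of dimensions, and the 0.8 threshold is a placeholder you would tune for your data:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_in_context(question_vec, doc_vecs, threshold=0.8):
    # Score the question embedding against every document embedding,
    # then accept the question only if the best match clears the threshold.
    scores = sorted((cosine_similarity(question_vec, d) for d in doc_vecs),
                    reverse=True)
    return scores[0] >= threshold
```

If `is_in_context` returns `False`, you skip the LLM call entirely and send the “out of my context” message instead.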

My Questions:
1- I do not know exactly how to restrict the LLM (ChatGPT).
2- I do not know how to make the chatbot ask follow-up questions about the context.
3- Can I do these things without embeddings?

Do you have any ideas to guide me?
Thank you

1 Like

Your goals are not satisfied by your plan.

Unless you are adding new knowledge, you can simply system prompt an API-based chatbot:

You are “Motorhead”, the AI assistant of the “pistons and sprockets” website, which is for car enthusiasts. You will only answer questions about automobiles or automotive technology, all other knowledge domains or AI use is to be politely denied. Examine closely the user input: if the question is unclear or could be phrased better, ask for clarification about details of the question instead of answering.

You can also use or supplement with what is called multi-shot training, where you also provide user/AI exchanges as if they had already occurred in the chat history:

user: what kind of fuels do jet engines take
assistant: I’m sorry, I only talk about cars. Do you have automotive questions?
user: Can I put jet fuel in my Mustang?
assistant: No, you cannot put jet fuel in your 2015 Ford Mustang. Jet fuel is a type of kerosene and is not suitable for use in a gasoline or diesel engine.
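Putting the system prompt and the multi-shot exchanges together, the request body is just an ordered list of messages. Here is a minimal sketch using the Chat Completions `messages` format (the prompt text is taken from the example above; the helper name `build_messages` is just for illustration):

```python
# System prompt that names the bot and restricts its domain.
system_prompt = (
    'You are "Motorhead", the AI assistant of the "pistons and sprockets" '
    "website, which is for car enthusiasts. You will only answer questions "
    "about automobiles or automotive technology; politely decline all other "
    "knowledge domains. If the question is unclear, ask for clarification."
)

# Multi-shot examples, written as if they already happened in the chat.
few_shot = [
    {"role": "user", "content": "what kind of fuels do jet engines take"},
    {"role": "assistant",
     "content": "I'm sorry, I only talk about cars. "
                "Do you have automotive questions?"},
]

def build_messages(user_input):
    # System message first, then the example exchanges, then the live input.
    return [{"role": "system", "content": system_prompt},
            *few_shot,
            {"role": "user", "content": user_input}]
```

The resulting list is what you would pass as the `messages` parameter of a chat completion request.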

@_j Thank you for your time to answer my question.
I’m actually a bit confused.

I have seen some videos that always use embeddings to find content related to the user input. That approach seems to me like querying something from a database; it is not chatting.

If I just use a prompt with the OpenAI chat API, I lose memory.

I don’t know where to start :sweat_smile:

A “chatbot” is defined by its ability to maintain situational and contextual awareness by its memory. Yes, a single API call has no memory.

“Does my car have fuel injectors” cannot be answered unless the AI knows what has been previously asked, like “I have a Tesla Model 3”. OpenAI AI models do not have their own memory system.

The way we make a chatbot, and the way ChatGPT works, is by having the software remember previous conversational exchanges, and by assembling a new API call that gives some of the most recent human/ai interactions before presenting the most recent user input, so the AI can know what you are talking about. This doesn’t take an external database unless you want features like saving the conversation for continuing a different day.
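The memory mechanism described above can be sketched as a small class that keeps the most recent exchanges and rebuilds the message list for each call. This is an illustrative sketch, not a library API; the class name and `max_turns` limit are assumptions:

```python
class ChatMemory:
    """Keeps the most recent user/assistant turns so each new API call
    carries enough context for the model to follow the conversation."""

    def __init__(self, max_turns=5):
        self.max_turns = max_turns   # how many user/assistant pairs to keep
        self.history = []

    def add_exchange(self, user_text, assistant_text):
        self.history.append({"role": "user", "content": user_text})
        self.history.append({"role": "assistant", "content": assistant_text})
        # Trim to the most recent turns (2 messages per turn).
        self.history = self.history[-2 * self.max_turns:]

    def build_messages(self, system_prompt, user_input):
        # Recent history goes between the system prompt and the new input.
        return ([{"role": "system", "content": system_prompt}]
                + self.history
                + [{"role": "user", "content": user_input}])
```

With this in place, “Does my car have fuel injectors” is sent together with the earlier “I have a Tesla Model 3” exchange, so the model can resolve the reference.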

If you find that the users are asking about specs for 2024 cars and the AI doesn’t know, you can then add a vector database with your own new knowledge, and can feed some relevant information to the AI in order for it to answer better. A “PDF” is not a native format; you’d use actual tagged text as an information source.
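The vector-database idea can be illustrated with a tiny in-memory stand-in: each chunk of extracted text is stored next to its embedding, and a query returns the top-k most similar chunks. A real deployment would use an embedding model and a vector store, but the mechanics are the same; the class and example vectors here are hypothetical:

```python
import math

def cosine(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a))
                  * math.sqrt(sum(x * x for x in b)))

class TinyVectorStore:
    # In-memory stand-in for a vector database: each chunk of tagged
    # text is stored alongside its embedding vector.
    def __init__(self):
        self.items = []   # list of (text, vector) pairs

    def add(self, text, vector):
        self.items.append((text, vector))

    def top_k(self, query_vector, k=3):
        # Rank all stored chunks by similarity to the query and
        # return the text of the best k matches.
        ranked = sorted(self.items,
                        key=lambda item: cosine(query_vector, item[1]),
                        reverse=True)
        return [text for text, _ in ranked[:k]]
```

The returned chunks are what you would feed to the model alongside the user’s question.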

1 Like

Best to look at the documentation here; there are even code examples there for this.

Depending on how big the PDF is, you may not even need to do the embedding and searching.
You can simply say: “You must only answer using the following information: [paste PDF text].”
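That “stuff everything into the prompt” approach can be sketched like this. The character limit is a rough stand-in for the model’s context window (which is actually measured in tokens), and the function name is just for illustration:

```python
def stuff_prompt(document_text, max_chars=12000):
    # If the extracted PDF text is small enough, skip retrieval entirely
    # and paste it straight into the system prompt.
    if len(document_text) > max_chars:
        raise ValueError(
            "Document too large; use embeddings-based retrieval instead.")
    return ("You must only answer using the following information:\n"
            + document_text)
```

This works well for a short PDF; once the document outgrows the context window, you fall back to the retrieval approach described earlier in the thread.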

1 Like

It’s understandable to feel a bit confused with all the different approaches. Let’s break it down.

If you want your chatbot to have memory and be able to provide contextually relevant responses, you can use a combination of system prompts and embeddings-based retrieval. The system prompt sets the behavior and context for the conversation, while embeddings-based retrieval allows you to supply relevant knowledge for the chatbot to draw on in different situations.

For example, you can start with a system prompt like this:

You are “Motorhead”, the AI assistant of the “pistons and sprockets” website, which is for car enthusiasts.
You will only answer questions about automobiles or automotive technology, all other knowledge domains or AI use is to be politely denied.
Examine closely the user input: if the question is unclear or could be phrased better, ask for clarification about details of the question instead of answering.

Then, when a user message arrives, you can run embeddings-based retrieval in the background:

User: What kind of fuels do automobile engines take?

A background process retrieves semantically similar content by embedding the user query and comparing it to existing content that was previously embedded.

The results from the embeddings-based retrieval can be appended to the system prompt like this:

Please utilize the following information in your response to the user:
>>>INSERT CONTENT from embeddings-based retrieval<<<

This will help ground your chatbot’s responses in the knowledge that’s part of your embeddings-based retrieval system.
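The grounding step described above amounts to concatenating the retrieved chunks onto the system prompt before each call. A minimal sketch (the function name is hypothetical; the wording follows the example above):

```python
def augment_system_prompt(base_prompt, retrieved_chunks):
    # Append retrieval results to the system prompt so the model
    # grounds its answer in the supplied material.
    context = "\n".join(retrieved_chunks)
    return (base_prompt
            + "\n\nPlease utilize the following information in your "
              "response to the user:\n" + context)
```

The augmented prompt then goes into the `system` message of the next chat completion request, with the user’s question as the final `user` message.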

Good luck with your chatbot,
Brian :palm_tree:

4 Likes

It would be best to create a chain with LangChain.
Then use LLMRouterChain to route between multiple prompt branches, and use embeddings if necessary.

Thank you all.

Embeddings are a good option for giving the system information and context.

When using the OpenAI API you have roles (system, user, assistant) for different things (see: OpenAI Platform). Mastering these features can also be useful for your project.

1 Like