Hi. My question is: I have an online course with multiple coaching Q&A calls. Many of the questions are repeated a lot, so I want to create a chatbot to answer these FAQs. The thing is, I'm trying to find out whether I can upload transcripts of the recordings of these calls in order to teach the chatbot to answer questions based on the Q&As from those recordings. Is it possible? Has anyone found a way to do it, or already done it?
Hi Aleksander, that’s a great question!
The Answers endpoint is perfect for this: it allows you to upload a JSONL file with the documents you want to query. You can then integrate this into a chatbot builder (or build your own bot) to create an FAQ-answering chatbot.
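To make the JSONL part concrete, here's a minimal sketch of preparing such a file. The snippets and metadata values are placeholders, but the shape (one JSON object per line with a `text` field and an optional `metadata` field) matches the Files format the legacy Answers endpoint expected:

```python
import json

# Hypothetical transcript snippets; in practice these would come from
# the call recordings, split into one Q&A per document.
snippets = [
    {"text": "Q: How do I access the bonus module? A: Log in and open the Bonuses tab.",
     "metadata": "call-2021-03-01"},
    {"text": "Q: Can I download the worksheets? A: Yes, each lesson has a PDF link.",
     "metadata": "call-2021-03-08"},
]

# Each line of the JSONL file is one standalone JSON document.
with open("faq_documents.jsonl", "w") as f:
    for doc in snippets:
        f.write(json.dumps(doc) + "\n")
```

The resulting file is what you'd upload via the Files endpoint before querying it.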
I hope that helps, and please let me know if you have any questions along the way.
I would counter that the Answers endpoint is imperfect for a chatbot. For each question, you would need to search for relevant documents with the Search endpoint, then pass the documents found to the Answers endpoint.
We could run into problems pretty quickly. Imagine a printer with many buttons. Holding down the Menu button for 3 seconds causes a test page to be printed, while pressing the Power button for 3 seconds causes the printer to reboot.
In order to set up for this, we might want to upload the documentation split into paragraphs - since on the call to the Answers endpoint, we’ll be charged for all the tokens in the documents referenced, the tokens in the question, and the tokens in the answer.
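A quick sketch of that paragraph-level split, since it's the step that controls billing. The manual text here is a made-up stand-in for the real documentation:

```python
# Placeholder documentation text; blank lines separate paragraphs.
manual = """Press the Menu button to enter the printer's menu.

Holding the Menu button for 3 seconds prints a test page.

Holding the Power button for 3 seconds reboots the printer."""

# One document per paragraph, so each Answers call is only billed for
# the tokens in the short passages it actually references.
documents = [p.strip() for p in manual.split("\n\n") if p.strip()]
```

Each entry in `documents` would become one line of the uploaded JSONL file.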
Here’s how I imagine the chat might go:
AI: Hello, how can I help you?
Customer: Which button do I use to enter the printer’s menu?
** Search call: “Which button do I use to enter the printer’s menu?”
** Search results: [array of documents]
** Answers call: [array of documents] “Which button do I use to enter the printer’s menu?”
AI: To enter the printer’s menu, you press the Menu button.
Customer: What happens if I hold it down for three seconds?
** Search call: “What happens if I hold it down for three seconds?”
** Search results: [array of documents]
** Answers call: [array of documents]
AI: Holding it down for three seconds will cause the printer to reboot or print a test page.
But why would I imagine that? The problem, as I see it, is that the second question doesn’t contain sufficient context, and context can’t be passed on from one question to the next. The second question is a completely fresh state as far as GPT-3 is concerned. “It” has no meaning in the new context, so the documents collected will not relate solely to the Menu button, but to anything that features “three seconds”, perhaps some that feature “seconds” and some that feature “it”.
I am always happy to be proven wrong, however.
To use the Answers endpoint, no additional endpoint is needed. In other words, you don’t need to use the Search endpoint and then the Answers endpoint. The Answers endpoint automatically ranks and sorts documents and then finds the answer in the highest ranked document.
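A sketch of what that single call looked like with the legacy `openai` Python library (v0.x era). The file ID, question, and example pair are placeholders, and the parameter names are my recollection of that era's Answers endpoint, so treat this as an illustration rather than a reference:

```python
# Hypothetical single-step Answers request; "file-abc123" is a placeholder
# for the ID returned when the JSONL file was uploaded.
request = {
    "search_model": "ada",    # cheaper model ranks the documents
    "model": "curie",         # completion model writes the answer
    "question": "Which button do I use to enter the printer's menu?",
    "file": "file-abc123",
    "examples_context": "Holding the Power button for 3 seconds reboots the printer.",
    "examples": [["How do I reboot the printer?",
                  "Hold the Power button for 3 seconds."]],
    "max_rank": 10,           # how many ranked documents to consider
    "max_tokens": 60,
    "stop": ["\n"],
}
# In a live script: import openai; answer = openai.Answer.create(**request)
```

Ranking and answering happen inside that one call, which is the point Joey is making.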
You could try providing examples and playing with the parameters, such as increasing the response length, to ensure that you at least get an answer that is more specific and not potentially misleading (e.g. instead: “Holding down the Menu button for 3 seconds causes a test page to be printed, while pressing the Power button for 3 seconds causes the printer to reboot.”)
Thanks for the input, Hugh. Have you tried implementing something like this and if so, what were the results?
By the way, my course has different coaches, usually with different opinions on the same topics, and I was wondering if I could set this up in a way that the chatbot provides multiple answers to one question, reflecting the different viewpoints of the coaches. I was thinking something like:
How to do XXX?
Coach [name 1] suggests yyy
Coach [name 2] suggests zzz
Coach [name 3] suggests xyz
Joey, has the behaviour changed? I was told I would need to upload the documents and search them with a max rank, then pass them separately to the Answers endpoint.
For Aleksander’s use case, your proposal may require rewriting the FAQs for accuracy, although grouping them by subject might work too - the integrator (OpenAI’s customer) just needs to be cognizant of the number of tokens they’re going to be billed for in that circumstance.
Regarding continuity, if one were to supply the previous 5 Questions & Answers and the new Question, would only the last Question receive an Answer?
Hi Hugh, the Answers endpoint is actually fairly new, and it’s basically a one-step process. There used to be a method of ranking documents and then finding an answer in separate steps, but that’s no longer needed, given the Answers endpoint.
You’ll want to provide one question at a time.
It seems that your example isn’t exactly “multiple answers,” it’s just a single, more in-depth answer. That’s an important distinction, because it means a single call to the endpoint. You could try increasing response_length and providing examples similar to the behaviour you’re describing.
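One way to sketch that in code: steer the answer style through the `examples_context` and `examples` parameters, so a single longer answer mentions each coach. The coach names and answers below are entirely made up for illustration:

```python
# Placeholder context and example pair demonstrating the desired
# "one answer, multiple coach viewpoints" style.
examples_context = (
    "Coach Anna suggests batching your emails. "
    "Coach Ben suggests checking email only twice a day."
)
examples = [[
    "How should I handle email overload?",
    "Coach Anna suggests batching your emails. "
    "Coach Ben suggests checking email only twice a day.",
]]
# These would be passed to the Answers call alongside a larger
# max_tokens so the single answer has room for every viewpoint.
```

The example answer's format is what nudges the model toward listing each coach by name.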