Hi: I am trying to put together a Q&A chatbot which can remember the “context” of the previous chat exchanges with a user, to provide relevant answers. Example:
User: Can you show me some Nike shoe models?
AI: Here you go! <Presents shoes 1, 2 and 3>
User: Cool. I like 1. How much does it cost?
AI: Shows the cost of Shoe 1
User: What colors does it come in?
Now, GPT-3 has to be told (or made to understand) that the “it” in the last question refers to Shoe 1. What would be a good way to achieve this? Is “examples_context” of the Answers endpoint the way to go? If so, how exactly should I populate it?
Thanks in advance!
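For reference, one common way to make the model resolve “it” is to prepend the last few exchanges to every prompt, so the pronoun has an antecedent in the text the model sees. A minimal sketch (the helper name, turn limit, and sample history are my own, not an official pattern):

```python
# Hypothetical sketch: carry recent turns into the prompt so the model
# can resolve pronouns like "it" against earlier answers.
def build_prompt(history, question, max_turns=3):
    """Prepend the last few Q&A exchanges so coreferences resolve."""
    recent = history[-max_turns:]
    lines = []
    for user_msg, ai_msg in recent:
        lines.append(f"User: {user_msg}")
        lines.append(f"AI: {ai_msg}")
    lines.append(f"User: {question}")
    lines.append("AI:")
    return "\n".join(lines)

# Made-up conversation matching the example above.
history = [
    ("Can you show me some Nike shoe models?", "Shoe 1, Shoe 2, Shoe 3"),
    ("How much does Shoe 1 cost?", "Shoe 1 costs $120."),
]
prompt = build_prompt(history, "What colors does it come in?")
```

Because “Shoe 1” appears in the carried-over turns, the model can ground “it” without any special endpoint feature; the trade-off is that every carried turn costs tokens.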
Thanks, that’s what I suspected. While using the answers endpoint, does it mean I need to keep updating the “examples” or “examples_context” with each query?
Here, since the chatbot is answering from an existing knowledge base, I felt it’d be better to use the Answers endpoint. So I cannot use a prompt.
It’ll be nice if a “context” can be “set” with each query, like how DialogFlow does. In DialogFlow, once you set a context, if you pass the context in the next query, the engine remembers the last few Q&A that belonged to that context.
Hi: some things to unpack from what you said:
Perhaps you meant the reverse, i.e. use the Completions endpoint for determining general intent, followed by the Answers endpoint for domain-specific Q&A? My reasoning is that the Completions endpoint has access to GPT-3’s vast repository of general knowledge, while the Answers endpoint doesn’t seem to have that: it’s restricted to searching within the documents/files supplied.
Regarding “searching the Nike site” (even though it’s just an example), are there any native GPT-3 utilities for this, or should we build our own scraper/parser? I am assuming it’s the latter.
No constraint as such, other than cost. DialogFlow has a per-message fee, while GPT-3’s pricing is content-based (tokens). But if I am going to go with DialogFlow for Q&A and the knowledge base is fixed, I might not need GPT-3 for that use case at all, because DialogFlow offers native utilities for PDF uploads. So, with a one-time training effort, it’ll be pretty self-sufficient at answering natural-language questions within an existing knowledge base.
We are internally debating point #3: for Q&A within a knowledge base, which is better, DialogFlow or GPT-3?
My initial thoughts are that GPT-3 is optimized more towards text generation and general-knowledge Q&A than towards knowledge-base Q&A where context memory is important. Even their pricing reflects that. Happy to be corrected.
You could try using the API to route Questions with a certain amount of confidence to the object context (Nike). Sort of like routing customer service tickets to departments. So if you ask about shoe colors, it is only considering answers in the Nike context, or department.
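A minimal sketch of that routing idea, using a toy keyword-overlap score in place of a real classifier (the context names, keyword sets, and threshold are all made up for illustration):

```python
# Hypothetical routing sketch: score a question against each context's
# keyword set and only answer within the best-matching context, like
# routing customer-service tickets to departments.
CONTEXTS = {
    "nike": {"nike", "shoe", "shoes", "sneaker", "color", "colors", "size"},
    "billing": {"invoice", "refund", "charge", "payment"},
}

def route(question, threshold=1):
    """Return the best-matching context, or None below the threshold."""
    words = set(question.lower().split())
    scores = {name: len(words & kw) for name, kw in CONTEXTS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

In practice you would replace the overlap score with a GPT-3 classification call or embedding similarity, but the shape is the same: classify first, then answer only within the winning context.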
Hi, I actually had the same issue and solved it this way:
- Build a .txt file with content, divided by ###
- Use ‘semantic search’ to find the relevant content for each query
- Copy the content and add it to your prompt as an inherent part
- Build a log of your last 3 interactions
- Add the last 3 interactions after every modified prompt, but only if it is still the same prompt.
- Whenever a new query has a high score in the semantic search, it means that the user has changed the subject. Delete the last three interactions and start again.
The results: you can talk with the bot on different topics, and it stays aligned. Another cool outcome is that it has quite big “memory,” long-term and short-term, saving quite a bit of tokens.
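The steps above can be sketched roughly like this, with a trivial word-overlap scorer standing in for real semantic search (the class name, threshold, and scoring function are my assumptions, not the poster’s actual implementation):

```python
def load_chunks(text):
    """Step 1: split the knowledge file on '###' separators."""
    return [c.strip() for c in text.split("###") if c.strip()]

def score(query, chunk):
    """Toy stand-in for semantic search: fraction of query words found
    in the chunk. A real system would use the Search endpoint or
    embeddings here."""
    q = set(query.lower().split())
    return len(q & set(chunk.lower().split())) / max(len(q), 1)

class Bot:
    def __init__(self, chunks, topic_threshold=0.5):
        self.chunks = chunks
        self.threshold = topic_threshold   # step 6: "high score" cutoff
        self.current = None                # chunk backing the current topic
        self.log = []                      # (question, answer) pairs

    def build_prompt(self, query):
        # Step 2: find the most relevant chunk for this query.
        best = max(self.chunks, key=lambda c: score(query, c))
        if best is not self.current and score(query, best) >= self.threshold:
            # Step 6: a *different* chunk scored high, so the user has
            # changed the subject; drop the short-term memory.
            self.current, self.log = best, []
        context = self.current if self.current is not None else best
        # Steps 3-5: retrieved content first, then the last 3 interactions.
        turns = "\n".join(f"Q: {q}\nA: {a}" for q, a in self.log[-3:])
        return f"{context}\n{turns}\nQ: {query}\nA:"
```

After each completion you would append the (question, answer) pair to `bot.log`, so the prompt carries at most three turns of short-term memory on top of the retrieved chunk.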
Thanks @NSY. I had a few follow-up questions:
Which endpoints did you use? In step 2, you mentioned “semantic search”, so I presume it’s the Search endpoint. If so, do you upload the text file in .jsonl format?
When you say “prompt”, once again, are you referring to the Search endpoint or Completions? Because the former has only “query” and “documents”, not prompts
Hi: will take a look at the tutorials and give it a try.
Thanks for the details!
Can you please elaborate on point 6, possibly with an example? Which scorer are you talking about, and what would be considered high?
I am currently in communication with the support team about this feature when using the new embedding API.
In the “old” semantic search/classification function, you could get a 1-100 score for the certainty level of the search. When it returned a number less than, say, 25, we dismissed the question with an “I don’t know” reply instead of providing an answer.
I haven’t observed a similar ability with the new embedding methodology, but that may be due to the nature of the solution.
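The old cutoff behavior described above amounts to something like this (25 is the illustrative threshold from the post, not a recommendation, and the function name is mine):

```python
# Sketch of the score-cutoff pattern: answer only when the search's
# certainty score clears a threshold, otherwise decline.
def answer_or_decline(score, answer, threshold=25):
    """Return the answer if the 1-100 certainty score is high enough."""
    return answer if score >= threshold else "I don't know"
```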
Tag me if you find out. I was looking at cosine values and they were a bit arbitrary: for something missing I saw a value as high as 0.91, but for something else that was definitely there, the value was 0.5.
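For what it’s worth, cosine similarity itself is easy to compute; the hard part is that its absolute value isn’t calibrated, so a fixed cutoff like 0.5 behaves differently across embedding spaces. A toy illustration with made-up vectors:

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product over the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Made-up low-dimensional "embeddings" for demonstration only.
query = [0.2, 0.7, 0.1]
doc_related = [0.25, 0.65, 0.05]   # genuinely relevant document
doc_unrelated = [0.9, 0.4, 0.3]    # off-topic, yet still far from zero

# Even the off-topic vector scores well above 0.5 here, which is why
# relative comparisons (e.g. top score vs. runner-up over your own
# corpus) tend to work better than an absolute cutoff.
```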