Sure, I am trying to build a chatbot that can answer a user’s question based on the information stored about them in the DB.
Below is the flow I am thinking of:
User inputs a question or query.
Classify the input to determine if it requires database information or if it can be answered from a predefined FAQ.
Use filters and classification techniques similar to the approach mentioned by @curt.kennedy in one of the previous replies to identify whether the question needs database information.
(For non-database questions that can be answered from a FAQ or predefined knowledge base, I can directly retrieve and return the relevant information, or I can fine-tune a model for that.)
If the question requires database information, consider the following steps:
a) Analyze the user’s question to identify any user-defined tags or keywords that can help narrow down the search within the database (using a keyword-search API call to OpenAI).
b) Classify these tags or keywords as vendor tags or item tags. (I want to know how this can be done: with fine-tuning, or with embedding classifiers?)
c) Send the user question to an embedding model, where I have embedded some sample user questions along with their SQL queries, as @curt.kennedy described in this reply.
d) Send the top 5 matching embedded questions (along with the SQL queries stored for them) + the user question + a prompt (user ID + tags along with the type of each tag, etc.) to GPT to rewrite the query (the query changes for every user in my case, because I have to include the user ID).
e) Execute this query (if it gives an error, send it back to GPT and repeat step d).
f) Finally, send the result of the query + the user question to GPT to write a natural-language answer.
Then I will show that GPT response to the user in the chat.
(These are my initial steps. As a later extension, I don’t know how yet, but I also have to store the user’s questions and answer based on their previous questions.)
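Steps d) and e) above amount to a bounded retry loop, which could be sketched roughly like this. It is only an illustration under assumptions: `rewrite_query` is a hypothetical stand-in for the GPT call, the `orders` table and its columns are made up, SQLite stands in for the real database, and the cap of 3 attempts is arbitrary.

```python
import sqlite3

MAX_ATTEMPTS = 3  # assumed cap so a bad query cannot loop forever


def rewrite_query(question, examples, last_error=None):
    """Hypothetical stand-in for the GPT call in step d)."""
    # A real version would send the question, the retrieved example queries,
    # and (on a retry) the previous error message to the model.
    if last_error is None:
        # Simulate a first attempt with a misspelled table name.
        return "SELECT SUM(amount) FROM ordrs WHERE user_id = ?"
    return "SELECT SUM(amount) FROM orders WHERE user_id = ?"


def answer_from_db(conn, question, user_id, examples):
    """Step e): run the query, feeding any SQL error back into step d)."""
    last_error = None
    for attempt in range(1, MAX_ATTEMPTS + 1):
        sql = rewrite_query(question, examples, last_error)
        try:
            # The user ID is bound as a parameter, not pasted into the SQL text.
            rows = conn.execute(sql, (user_id,)).fetchall()
            return rows, attempt
        except sqlite3.Error as exc:
            last_error = str(exc)
    raise RuntimeError(f"no valid query after {MAX_ATTEMPTS} attempts: {last_error}")


# Made-up data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (1, 5.5), (2, 99.0)])

rows, attempts = answer_from_db(conn, "How much have I spent?", 1, examples=[])
print(rows, attempts)
```

Binding the user ID as a query parameter is one way to keep the per-user part out of the text the model rewrites, so the model only has to get the query shape right.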
Now I feel for some reason the number of API calls are many (5 to 6 API calls for one user question)
also for initial embedding we have to bear some cost
Can you suggest any improvement in this flow so that I can decrease them somehow to save some credits !!
For what you are trying to create, which is the “ultimate chatbot”, 5 or 6 API calls is nothing! With summarization bots, you can get hundreds of calls, and AI agents can go 1000+ calls easily.
I think the best approach is to list the pros/cons of doing/not-doing something. For example, if you need control flow of resources (LLM vs. Compute) you need some way to do this (Regex, Embeddings, and Classifiers). Not doing this would prevent your bot from switching modes. But the flip side of doing everything all the time should be considered too (see below).
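As a rough illustration of the cheapest kind of control flow mentioned above (regex), a router might look like the sketch below. The patterns and the three route names are assumptions made up for the example; embeddings or a trained classifier would take over where patterns like these get too brittle.

```python
import re

# Hypothetical patterns; a real bot would tune these to its own data.
DB_PATTERNS = [
    re.compile(r"\bmy\s+(orders?|invoices?|vendors?|items?)\b", re.IGNORECASE),
    re.compile(r"\bhow (much|many)\b.*\b(i|my)\b", re.IGNORECASE),
]
FAQ_PATTERNS = [
    re.compile(r"\b(reset|change)\b.*\bpassword\b", re.IGNORECASE),
    re.compile(r"\b(opening|business) hours\b", re.IGNORECASE),
]


def route(question):
    """Return which mode should handle the question: database, faq, or llm."""
    if any(p.search(question) for p in DB_PATTERNS):
        return "database"
    if any(p.search(question) for p in FAQ_PATTERNS):
        return "faq"
    # Nothing matched: fall through to the general (more expensive) LLM path.
    return "llm"


for q in ["Show me my orders from last week",
          "How do I reset my password?",
          "Tell me a joke"]:
    print(q, "->", route(q))
```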
But you are starting to see what is possible! @abhi3hack
One thing you can try, since compute is so cheap, is to run the API calls and the local compute in parallel, wait for the responses to come back, and then pick whichever response best satisfies the user input. This avoids the switch, and just adds some decision compute/inference on the backend to decide what to release.
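A minimal sketch of that run-everything-then-pick idea, assuming each path returns its answer together with a confidence score. The two handler functions are hypothetical stubs with hard-coded results, and how to score real answers is left open.

```python
from concurrent.futures import ThreadPoolExecutor


def answer_from_faq(question):
    # Hypothetical stub: pretend the FAQ search found only a weak match.
    return {"source": "faq", "text": "See our billing FAQ.", "score": 0.41}


def answer_from_database(question):
    # Hypothetical stub: pretend the SQL path produced a confident answer.
    return {"source": "database", "text": "You spent $15.50 last month.", "score": 0.87}


def best_answer(question, handlers):
    """Run every handler in parallel and release the highest-scoring answer."""
    with ThreadPoolExecutor(max_workers=len(handlers)) as pool:
        futures = [pool.submit(handler, question) for handler in handlers]
        candidates = [future.result() for future in futures]
    return max(candidates, key=lambda c: c["score"])


result = best_answer("How much did I spend last month?",
                     [answer_from_faq, answer_from_database])
print(result["source"], "->", result["text"])
```

Threads suit this because the real handlers would mostly be waiting on network calls. The trade-off is that every question pays for every path, which cuts against saving credits.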
But back to the expense question, you have to decide the quality of this bot. Is it Good, Better, or Best? This will determine the work and expense required. “Ultimate” would align with my “Best”, right?
Abhi, may I suggest a practical approach?
I spent a lot of time “ideating” my project, but when I started developing it, I realized that my ideas were simply wrong.
Why don’t you develop a simple chatbot first?
Chatbot v1: user asks question → bot replies with “rigid” pre-defined answers (based on embeddings only, so not really a chatbot per se)
Chatbot v2: user asks question → chatbot answers the question
Chatbot v3: LangChain agents
These are easy to implement, and you will get a glimpse of the power of AI!
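To make v1 concrete, here is a small sketch of the "rigid answers via embeddings" idea. Everything in it is an assumption for illustration: `toy_embed` is a word-count stand-in for a real embedding model, the two FAQ entries are invented, and the 0.5 threshold is arbitrary.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Invented FAQ entries: stored question -> pre-defined answer.
FAQ = {
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "What are your opening hours?": "We are open 9am to 5pm, Monday to Friday.",
}
FAQ_VECTORS = {question: toy_embed(question) for question in FAQ}


def reply(question, threshold=0.5):
    """Return the stored answer for the closest FAQ question, if close enough."""
    query = toy_embed(question)
    best = max(FAQ_VECTORS, key=lambda q: cosine(query, FAQ_VECTORS[q]))
    if cosine(query, FAQ_VECTORS[best]) < threshold:
        return "Sorry, I don't have an answer for that yet."
    return FAQ[best]


print(reply("how can I reset my password"))
print(reply("tell me a joke"))
```

No model is called at answer time here. The only per-question cost in a real v1 would be embedding the incoming question.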
I have a few doubts regarding the v2 and v3 chatbots.
In v2, do we have to use a vector database to store the embeddings? (I am currently unable to understand how that works. For now I am just storing them in Excel, and since I am embedding only a few things, it works.)
For v3, I tried using LangChain a while ago.
Below is the picture describing the issue I am having
Yes, rightly said; I want to go the same way.
I did finish version 1 of my chatbot.
But again, as I mentioned, I currently use CSV to store the data, and I have seen people talking about a vector DB.
So I wanted to ask how that actually works, so that I can use it in v2.
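For what it is worth, the CSV approach and a vector DB do the same basic job: store (text, vector) pairs and return the stored vectors closest to a query vector. The sketch below shows the brute-force version over CSV data. The three rows and their 3-number "embeddings" are made up for the example; real embeddings have hundreds or thousands of dimensions.

```python
import csv
import io
import json
import math

# Made-up rows; in practice this text would come from the CSV file on disk.
CSV_TEXT = """text,embedding
"total spend per vendor","[0.9, 0.1, 0.0]"
"items bought last month","[0.1, 0.9, 0.1]"
"how to reset password","[0.0, 0.1, 0.9]"
"""


def load_rows(csv_text):
    """Parse the CSV into (text, vector) pairs; vectors are stored as JSON lists."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row["text"], json.loads(row["embedding"])) for row in reader]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Brute force: score every stored vector, then keep the k best."""
    scored = [(cosine(query_vec, vec), text) for text, vec in rows]
    return sorted(scored, reverse=True)[:k]


rows = load_rows(CSV_TEXT)
for score, text in top_k([0.8, 0.2, 0.0], rows):
    print(round(score, 3), text)
```

A vector database answers the same "closest vectors" query, but typically uses an index (often approximate nearest-neighbour search) so it does not have to score every row. With only a few embeddings, the brute-force scan above is usually fine.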
I started looking into creating my own q & a app around January/February. I am just now headed towards completing the project, with the bulk of the work having been done in the past month. So what happened all those other months? Dazed and confused. Until I put together a flow chart of what I wanted to do, and started focusing on learning each step of the process, one step at a time.
Echoing Curt… I have fairly simple demo agents that can make 100+ model calls unattended. I’m striving to get to thousands of unattended model calls, but there are a lot of complexities to that. 5-6 calls is nothing… Answering a question over a document that’s 70,000 tokens long could easily take 20-30 model calls if it’s a complex question.
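A back-of-the-envelope way to see where a number like 20-30 can come from, assuming the document is read in fixed-size chunks with one model call per chunk plus one final call to combine the partial answers. The 3,000-token chunk size is an assumption for the example, not a figure from this thread.

```python
import math


def estimate_calls(doc_tokens, chunk_tokens, combine_calls=1):
    """One call per chunk, plus the calls needed to merge the partial answers."""
    chunk_calls = math.ceil(doc_tokens / chunk_tokens)
    return chunk_calls + combine_calls


# 70,000 tokens in 3,000-token chunks: 24 chunk calls + 1 combine call.
print(estimate_calls(70_000, 3_000))
```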