We are building a chatbot at Discourse.
Example prompt:

```
system: You are a helpful Discourse assistant, you answer questions and generate text.
You understand Discourse Markdown and live in a Discourse Forum Message.
You are provided with the context of previous discussions.
You live in the forum with the URL: http://127.0.0.1:4200
The title of your site: Discourse
The description is:
The participants in this conversation are: gpt3.5_bot, sam
The date now is: 2023-05-25 00:11:54 UTC, much has changed since you were trained.
You can complete some tasks using !commands.
NEVER ask user to issue !commands, they have no access, only you do.
!categories - will list the categories on the current discourse instance
!time RUBY_COMPATIBLE_TIMEZONE - will generate the time in a timezone
!search SEARCH_QUERY - will search topics in the current discourse instance
!summarize TOPIC_ID GUIDANCE - will summarize a topic attempting to answer question in guidance
!tags - will list the 100 most popular tags on the current discourse instance
!image DESC - renders an image from the description (remove all connector words, keep it to 40 words or less)
!google SEARCH_QUERY - will search using Google (supports all Google search operators)
Discourse topic paths are /t/slug/topic_id/optional_number
Discourse search supports the following special filters:
user:USERNAME: only posts created by a specific user
in:tagged: has at least 1 tag
in:untagged: has no tags
in:title: has the search term in the title
status:open: not closed or archived
status:closed: closed
status:archived: archived
status:noreplies: post count is 1
status:single_user: only a single user posted on the topic
post_count:X: only topics with X amount of posts
min_posts:X: topics containing a minimum of X posts
max_posts:X: topics with no more than max posts
in:pinned: in all pinned topics (either global or per category pins)
created:@USERNAME: topics created by a specific user
category:CATEGORY: topics in the CATEGORY AND all subcategories
category:=CATEGORY: topics in the CATEGORY excluding subcategories
#SLUG: try category first, then tag, then tag group
#SLUG:SLUG: used for subcategory search to disambiguate
min_views:100: topics containing 100 views or more
max_views:100: topics containing 100 views or less
tags:TAG1+TAG2: tagged both TAG1 and TAG2
tags:TAG1,TAG2: tagged either TAG1 or TAG2
-tags:TAG1+TAG2: excluding topics tagged TAG1 and TAG2
order:latest: order by post creation desc
order:latest_topic: order by topic creation desc
order:oldest: order by post creation asc
order:oldest_topic: order by topic creation asc
order:views: order by topic views desc
order:likes: order by post like count - most liked posts first
after:YYYY-MM-DD: only topics created after a specific date
before:YYYY-MM-DD: only topics created before a specific date
Example: !search @user in:tagged #support order:latest_topic
Keep in mind, search on Discourse uses AND to join terms.
You only have access to public topics.
Strip the query down to the most important terms.
Remove all stop words.
Cast a wide net instead of trying to be over specific.
Discourse orders by relevance; sometimes prefer ordering on other stuff.
When generating answers ALWAYS try to use the !search command first over relying on training data.
When generating answers ALWAYS try to reference specific local links.
Always try to search the local instance first, even if your training data set may have an answer. It may be wrong.
Always remove connector words from search terms (such as a, an, and, in, the, etc), they can impede the search.
YOUR LOCAL INFORMATION IS OUT OF DATE, YOU ARE TRAINED ON OLD DATA. Always try local search first.
Commands should be issued in a single assistant message.
Example sessions:
User: echo the text “test”
GPT: !echo test
User: THING GPT DOES NOT KNOW ABOUT
GPT: !search SIMPLIFIED SEARCH QUERY
user: user: please echo 1
assistant: !echo 1
user: sam: what are the 3 most recent posts by sam?
```
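For clarity, the plumbing behind those !commands is: the model replies with a command, the application intercepts it, executes it, and feeds the output back. Here is a minimal sketch of the dispatch step, with placeholder handlers; the real implementation lives inside the Discourse plugin and none of these names come from it:

```python
import re
from typing import Optional

# Matches a command reply like "!search staged users" -> ("search", "staged users").
COMMAND_RE = re.compile(r"\A!(\w+)[ \t]*(.*)\Z", re.DOTALL)

def run_search(query: str) -> str:
    # Placeholder: the real bot would run a forum search here.
    return f"(search results for {query!r})"

def run_time(timezone: str) -> str:
    # Placeholder: the real bot would render the current time in the zone.
    return f"(current time in {timezone})"

HANDLERS = {
    "search": run_search,
    "time": run_time,
    # ... handlers for !categories, !tags, !summarize, !image, !google
}

def dispatch(reply: str) -> Optional[str]:
    """Execute a !command reply and return its output, or None for a plain reply."""
    match = COMMAND_RE.match(reply.strip())
    if not match:
        return None
    name, args = match.groups()
    handler = HANDLERS.get(name)
    return handler(args) if handler else None
```

So dispatch("!search staged users") returns the (placeholder) search output, while dispatch("hello there") returns None and the reply goes straight to the user.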
Since the bot integrates external data, I need to first triage the user request to decide whether it requires a “special command” or can simply be answered from the previous conversation.
I have had a reasonable amount of luck grounding this on GPT-4, but try as I may I simply can’t get it to ground properly on GPT-3.5.
My ideal state is:
PROMPT → User asks question → GPT decides if it should issue a !command or just reply
But the reality seems to be that I need to add a “triage” step, where I first determine whether I need extra context, then inject that context and respond.
The downside of the “triage” approach is that I need 2 round trips to OpenAI for every single user response. The upside is that I can ground responses much better and trim down our system prompt by splitting it.
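To make the two round trips concrete, here is a rough sketch of the triage flow. Everything in it is hypothetical scaffolding: complete() stands in for the OpenAI chat call, dispatch() is the command executor sketched above, and the prompt strings are invented for illustration:

```python
from typing import Optional

TRIAGE_PROMPT = (
    "Decide whether the user request needs external forum data. "
    "Reply with a single !command if it does, otherwise reply NONE."
)
MAIN_PROMPT = "You are a helpful Discourse assistant ..."  # the trimmed-down main prompt

def complete(messages: list) -> str:
    # Stand-in for the actual OpenAI chat completion call.
    raise NotImplementedError

def dispatch(reply: str) -> Optional[str]:
    # Executes a !command and returns its output (see the earlier sketch).
    raise NotImplementedError

def respond(history: list) -> str:
    # Round trip 1: a small triage prompt decides if extra context is needed.
    decision = complete([{"role": "system", "content": TRIAGE_PROMPT}, *history])

    context = []
    if decision.startswith("!"):
        result = dispatch(decision)  # run the command locally
        if result is not None:
            context = [{"role": "system", "content": f"Command output:\n{result}"}]

    # Round trip 2: generate the user-facing answer with the injected context.
    return complete([{"role": "system", "content": MAIN_PROMPT}, *context, *history])
```

Splitting the prompt this way is exactly the upside mentioned above: the triage call only needs the command documentation, and the generation call only needs the persona plus whatever context got injected.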
I was wondering what the community thinks about this problem and what prior art there is for me to look at (the Langchain source code being the first port of call for sure).
Obligatory GPT-4-based response
It’s great to see that you’re working on a chatbot with Discourse and OpenAI. I understand your concern about the extra “triage” step, which could lead to increased response time and cost. However, this approach does have its benefits, such as better grounding and more accurate responses.
One possible solution to minimize the number of round trips is to use a more advanced model that can handle both the decision-making and the response generation in a single step. However, this may not be available at the moment, and you might need to wait for future improvements in the models.
Another option is to implement some basic heuristics or rules in your application to determine if a !command is needed before sending the query to OpenAI. This could help reduce the number of round trips for certain cases. Keep in mind that this approach may not be perfect and could still require improvements over time.
As for prior art, you’ve already mentioned Langchain source code, which is a great starting point. Additionally, you can explore other chatbot implementations and see how they handle similar challenges. You can also keep an eye on OpenAI’s updates and new releases, as they might introduce models or features that can better handle this kind of task.
In summary, the “triage” approach might be necessary for now to ensure accurate responses, but you can explore alternative solutions to minimize its impact on response time and cost.
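For what it’s worth, the heuristic pre-filter GPT-4 suggests could start as something as crude as a keyword check that skips the triage round trip when a question clearly doesn’t touch forum data. The trigger words below are illustrative guesses, not a tested list:

```python
import re

# Illustrative trigger words only: questions mentioning forum concepts
# probably need a !command; everything else can skip triage and go
# straight to response generation.
NEEDS_FORUM_DATA = re.compile(
    r"\b(search|find|topics?|posts?|categor(?:y|ies)|tags?|recent|latest|summari[sz]e)",
    re.IGNORECASE,
)

def needs_triage(question: str) -> bool:
    return bool(NEEDS_FORUM_DATA.search(question))

# needs_triage("what are the 3 most recent posts by sam?")  -> True
# needs_triage("write a limerick about databases")          -> False
```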
So I guess a good summary of the question is:
Can GPT-3.5 be used as a hybrid “decision making” and “response generation” model like GPT-4 can, or is this way too ambitious?
And… is grounding GPT-4 worth it?