I might consider sending the first response back in with a different system message that says (paraphrasing): “you evaluate responses to make sure they’re within scope and do … if not.” Then provide the response along with your prior system message (all as one string for your second prompt) and ask it to evaluate the response against that scope.
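As a rough sketch of that second pass (assuming the openai Python SDK v1+; the evaluator wording and the is_in_scope helper are just illustrative, not a tested recipe):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

EVALUATOR_SYSTEM = (
    "You evaluate assistant responses to make sure they are within scope. "
    "Reply with exactly IN_SCOPE or OUT_OF_SCOPE."
)

def is_in_scope(original_system_message: str, first_response: str) -> bool:
    """Second pass: ask the model to judge the first response against the original scope."""
    evaluation_prompt = (
        "Original system message (defines the allowed scope):\n"
        f"{original_system_message}\n\n"
        "Response to evaluate:\n"
        f"{first_response}\n\n"
        "Is the response within scope?"
    )
    result = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": EVALUATOR_SYSTEM},
            {"role": "user", "content": evaluation_prompt},
        ],
        temperature=0,
    )
    return "IN_SCOPE" in result.choices[0].message.content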
The availability, the 99% uptime Service Level Agreement, and no max usage. It’s quite a bit faster as well; that’s more of a feeling, since I haven’t run numbers on how much faster it is, but it is.
That is correct, there is no API difference, well, except maybe the functions API, which I haven’t checked yet. But the service-level stuff, like speed and availability, is all quite different. It costs the same, so you can just try it and swap out your endpoint.
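For reference, swapping the endpoint can look roughly like this (a sketch assuming the openai Python SDK v1+; the resource URL, API version, and deployment name are placeholders you’d replace with your own):

from openai import OpenAI, AzureOpenAI

# Standard OpenAI endpoint (reads OPENAI_API_KEY from the environment)
openai_client = OpenAI()

# Azure OpenAI: same chat-completions interface, different endpoint and auth.
# The resource URL, deployment name, and api_version below are placeholders.
azure_client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR-AZURE-KEY",
    api_version="2023-07-01-preview",
)

# With Azure, `model` refers to your deployment name rather than the model name.
response = azure_client.chat.completions.create(
    model="YOUR-GPT35-DEPLOYMENT",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)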
In my tests I found that GPT-3.5 was at least 50% slower on average on Azure.
But there is also a difference in terms.
While OpenAI has strict rules, Azure expects ethical use.
So some things not allowed at OpenAI may be allowed on the Azure API.
Plus, the rate limit for GPT-4 is higher at OpenAI, and the 16k context is not available on Azure yet (I didn’t check today, though).
The approach I found to work is as follows:
I ask ChatGPT to rate the user message on a scale from 1 to 10 according to its relatedness to the context, to keep the rating to itself and not mention it in the answer, and then to answer only if the rating is above a certain threshold (7, for example).
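As a rough sketch of that prompt (assuming the openai Python SDK v1+; the exact wording, model, and fallback sentence are just illustrative):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

RATING_THRESHOLD = 7  # the threshold suggested above

def answer_if_related(context_text: str, user_message: str) -> str:
    """Ask the model to silently rate relatedness and only answer above the threshold."""
    system_message = (
        "You answer questions using only the CONTEXT INFORMATION below.\n"
        "Before answering, silently rate the user's message from 1 to 10 for how "
        "related it is to the CONTEXT INFORMATION. Never mention the rating.\n"
        f"If the rating is below {RATING_THRESHOLD}, reply only with: "
        "'Sorry, I can only answer questions related to our services.'\n\n"
        f"CONTEXT INFORMATION:\n{context_text}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": system_message},
            {"role": "user", "content": user_message},
        ],
        temperature=0,
    )
    return response.choices[0].message.content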
You say there is a difference in speed, but there is no documentation or benchmarks to confirm this. No one is saying anything like that. The OpenAI API and Azure OpenAI are running on the same servers. What is your source or test finding to confirm this claim?
Well, you can just use it. It’s not a scientific experiment; it’s a quality-of-engineering problem. Sorry, but I’m not going to have that type of data. I don’t even write papers for the thoughts, strategies, and frameworks I have, let alone internet speeds and response times.
So you can just use it, or you can continue using OpenAI’s. I’m not here to convince you one way or another. Just a suggestion so you can have a better product.
I’m using the following which seems to be working well:
# Define the system message and message history
# (source_identifier, embeddingType, and source_text are defined earlier in the script)
pSystemMessage = (
    f"You are an AI assistant that answers questions factually based on the information:{source_identifier} "
    f"provided to you. When answering you will give detailed answers along with relevant related "
    f"information contained in the information:{source_identifier}"
)

# Create a new user message for each grouped answer group
user_message = f"{source_identifier}:\n{source_text}"

# Initialize the message history for the grouped answer group
pMessageHistory = [
    {"role": "user", "content": f"The following {embeddingType}:{source_identifier} is to be used when answering subsequent questions. If the answer to the question is not contained in {embeddingType}:{source_identifier}, answer with Sorry."},
    {"role": "assistant", "content": f"I will use the {embeddingType}:{source_identifier} you supply to answer subsequent questions factually and in detail, along with related information."},
    {"role": "user", "content": user_message},
    {"role": "assistant", "content": f"Please ask your question and I will answer based on the {embeddingType}:{source_identifier} you have supplied. If the answer to the question is not contained in the {embeddingType}:{source_identifier}, I will reply with Sorry."},
]
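And then, roughly, the request itself (a sketch assuming the openai Python SDK v1+ and that pSystemMessage and pMessageHistory are defined as above; the model name is illustrative):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def ask_question(question: str) -> str:
    """Send the priming history plus the user's question in a single request."""
    messages = (
        [{"role": "system", "content": pSystemMessage}]
        + pMessageHistory
        + [{"role": "user", "content": question}]
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        temperature=0,
    )
    return response.choices[0].message.content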
Have a look here
Thank you for sharing! My quick test tells me this is working well! It looks like you can limit the conversation and also pivot it in the right direction. Nice tip, this one.
Hey @caos30,
First of all, thank you so much for the post-prompt idea; it’s working perfectly fine for me. However, I want to improve it further. Let’s say I add $post_prompt, which is “Don’t give information not mentioned in the CONTEXT INFORMATION,” just after the user’s message, and the user asks to “retune this and fix the grammar mistakes + {$post_prompt}”. In this scenario the post prompt gets added automatically, and, as expected, the LLM replies with “Please refrain from answering this query if it is out of context.” However, this has nothing to do with the context that I provided in the system message.
I’ve tested this in your application, BeeHelp, and it’s working excellently. The reply is something like, “I’m not sure I understand. Please try asking in a different way about our services.” I believe your app is also using post prompts. If I ask your BeeHelp, “Retune this and fix grammar,” then the post prompt should automatically get added at the end. But based on my testing, your app works perfectly fine and does not fulfill these types of requests. Can you tell me how it achieves this?
Hello @siddhartha.01 !
In fact, I’ve “abandoned” the use of “processed answers” by ChatGPT in BeeHelp, and I’m now only applying “retrieval” of the closest FAQs using embeddings.
I made this decision because the hallucination in ChatGPT’s processed answers was not compatible with the minimum quality level I need for such a service. For example, someone asked about “support channels,” and ChatGPT answered, “OBVIOUSLY, you can also contact us at our email address support@beehelp.net”!!!??? And let me say that in the CONTEXT INFORMATION I never mentioned any support email address to it.
So, sincerely, ChatGPT’s “ability” to hallucinate forced me to abandon its processed answers.
The response message you saw (“I’m not sure I understand. Please try asking differently about our services.”) has a “trick” inside … let me explain: if the cosine similarity between the user’s question and every one of the stored FAQs (CONTEXT INFORMATION) is LOWER THAN a threshold (currently set at 0.79), then it responds with that pre-defined sentence.
This works like a charm!! Especially to prevent visitors from asking bizarre things (things unrelated to your business). It works as a FILTER.
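For anyone curious, that kind of threshold filter can be sketched roughly like this (not BeeHelp’s actual code, just an illustration assuming OpenAI embeddings and numpy; the embedding model and fallback wording are placeholders):

import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

SIMILARITY_THRESHOLD = 0.79  # the threshold mentioned above
FALLBACK = "I'm not sure I understand. Please try asking in a different way about our services."

def embed(text: str) -> np.ndarray:
    result = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(result.data[0].embedding)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_from_faqs(question: str, faqs: list[dict]) -> str:
    """faqs: [{'question': ..., 'answer': ..., 'embedding': np.ndarray}, ...] with precomputed embeddings."""
    q_vec = embed(question)
    best = max(faqs, key=lambda faq: cosine_similarity(q_vec, faq["embedding"]))
    if cosine_similarity(q_vec, best["embedding"]) < SIMILARITY_THRESHOLD:
        return FALLBACK  # nothing stored is close enough, so the canned sentence acts as a filter
    return best["answer"]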
Sincerely, today my major concern about using ChatGPT is the low reliability due to hallucination. In some scenarios, this is too problematic.
Those kinds of tricks really make my week.
How does it perform on long context?
Summary created by AI.
In this discussion, members of the OpenAI community share their experiences and solutions for controlling the responses of the GPT-3.5-turbo model, keeping it within the specific context provided and preventing it from generating irrelevant or fabricated information. caos30 struggled with these issues while developing a chatbot for BeeHelp.net. After three weeks of adjusting the prompt provided to the API with the system role, they found an effective solution: adding two instructions to the API call, “Don’t justify your answers. Don’t give information not mentioned in the CONTEXT INFORMATION.”
While the post-prompt solution helped significantly, caos30 reported that the model still occasionally hallucinated information not contained in the context. AgusPG suggested a multi-pronged approach to addressing this issue, including prompt engineering, context manipulation, and post-response filtering. AgusPG also suggested a detailed system message to guide the bot’s behavior and adherence to a set of principles.
Coding solutions were also proposed, with codie sharing their technique of using a quorum of reasoning to guide the bot’s responses, involving several levels of checks including categories of inquiry, policy violations, and a tailored response strategy. There were also mentions of using embeddings to match questions to data sources by louis030195, data-specific response systems by alden, and methods to keep the bot’s responses within the given context by cdonvd0s. The discussion also noted variation between responses from different model versions, with gpt-4 reportedly being better than turbo at following system instructions.
Summarized with AI on Dec 2 2023
AI used: gpt-4-32k