I am using the GPT-3.5-turbo API and I want to enforce that the responses are always written in the Greek language. Sometimes, when the response is long, I get English characters instead of Greek. Is there any way I can make sure I only get Greek? Can I somehow utilise the stop_sequence parameter? I'm already using a pretty low temperature (0.1).
One option could be to use the System role like this:

System: Follow these five instructions below in all your responses:
System: 1. Use Greek language only;
System: 2. Use Greek alphabet whenever possible;
System: 3. Do not use English except in programming languages if any;
System: 4. Avoid the Latin alphabet whenever possible;
System: 5. Translate any other language to the Greek language whenever possible.
Play with these instructions until you get satisfactory responses.
Pay attention to these details: the word “five” in the first line matches the number of instructions; itemize the instructions and examples one per line (numbered is better); use punctuation as a separator, “;” between items and “.” on the last line. Be concise but descriptive with the instructions. The same details also apply to the User prompt.
Maybe it is also good to add a 6th instruction referring to some character set of your choice:

System: 6. Use [...] character set whenever possible.
You can also alter the behavior in the prompt, for example:

User: Ignore instruction 3 in the System role. Do this...
You can set the temperature to a more “creative” level.
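Putting it together, a minimal sketch could look like this (assuming the pre-1.0 openai Python package; the model name, temperature, and the Greek example prompt are placeholders to adapt to your own setup):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

# The five numbered instructions go into a single system message;
# the user prompt stays short and objective.
system_message = (
    "Follow these five instructions below in all your responses:\n"
    "1. Use Greek language only;\n"
    "2. Use Greek alphabet whenever possible;\n"
    "3. Do not use English except in programming languages if any;\n"
    "4. Avoid the Latin alphabet whenever possible;\n"
    "5. Translate any other language to the Greek language whenever possible."
)

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    temperature=0.1,  # keep it low first; raise it later if you want more "creative" output
    messages=[
        {"role": "system", "content": system_message},
        # Example user prompt: "Describe your services briefly."
        {"role": "user", "content": "Περιέγραψε σύντομα τις υπηρεσίες σου."},
    ],
)

print(response["choices"][0]["message"]["content"])
```

With openai-python 1.x the call changes to client.chat.completions.create(...), but the messages structure is the same.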
I hope this helps; please let me know the results.
Thanks for the advice about punctuation you gave me in that “jailbreak” thread. I am using it and have added it to all my suggestions. It was so obvious and logical in natural language processing that I hadn’t noticed it.
BTW, I am using the API to create a chatbot. I have implemented basic session handling à la Bing, so the user needs to reset after a few rounds of conversation. I initialise (and re-initialise for each new conversation cycle) with a certain prompt that describes the bot’s duties. Do you think I should define the system role as you described only in the initialisation step, or for every prompt?
The system role is persistent for a conversation. It is a context reinforcement for the conversation; it imposes limits on the model and acts as a kind of memory for a specific conversation.
We can understand the User prompts as context variations inside the System context boundaries - except if the user gives an order (from the prompt) to change the behavior of the model and ignore the System role. The System role should be applied for every new conversation; it holds the repetitive rules, so you avoid repeating them in the User prompts within one conversation.
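In code terms, one way to picture it is a rough sketch like this (reusing the system_message string from my earlier example; the helper name ask is just illustrative): the system message is always the first element of the messages list, and the user/assistant turns accumulate behind it on every request.

```python
# The system message stays at the top of the messages list for the whole
# conversation; user and assistant turns are appended behind it.
messages = [{"role": "system", "content": system_message}]

def ask(user_text):
    messages.append({"role": "user", "content": user_text})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0.1,
        messages=messages,  # the full history, system message included, is sent every time
    )
    reply = response["choices"][0]["message"]["content"]
    messages.append({"role": "assistant", "content": reply})
    return reply
```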
In my implementation so far, I have been providing context (it is a specialized chat bot for a certain type of job) for the chat bot through the prompt using the Content role. So each time I initiate the session I give an initial prompt that describes what the bot supposedly can and can’t do. Do you think I should move this to the System role as well?
As far as I know, there is no “Content role” in the OpenAI API - the only ones are:
User role;
Assistant role;
System role.
While “content role” is not a specific role in the OpenAI API, it can be a useful concept created by the developer when thinking about the purpose of different parts of the content in a broader context.
I don’t know what is in your “content role” - but if it is something that must be kept as instructions, rules, or limits during the conversation, it could be good to move it to the System role.
Sorry for the confusion. I provide the context info in the user role each time I initiate the session. It is a multi-line string which contains instructions and a few examples. I will look into moving this to the system role. Thanks again for your valuable help.
That’s OK. As I mentioned earlier, try to stick to those details - one instruction or example per line, itemization (numbered whenever possible), and punctuation. Those details help the model stay organized - it is like an obedient, smart child that wants to please its teacher.
Keeping the context across consecutive prompts is difficult. The model “forgets” frequently, or a prompt is not fully aligned with a previous prompt, and so on.
Our human conversations always assume that the other side reserves a minimum of memory for a coherent debate. But I’m sure you know some people who jump from one topic to another, especially when they feel pressured.
LLMs are designed for everyone, including these people - they are ready for a quick change of context.
To keep the model within a context, use the System role - this way you can keep the User prompt concise and objective.
If the User prompt strays too far from the context described in the System role, the model will try to adapt to the new prompt - requiring a longer prompt to contain more details and rules for the new context.
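As a rough illustration (the Greek strings are just example placeholders), compare a user prompt that has to carry the rules itself with one that can stay short because the rules live in the system message:

```python
# Without a system message, every user prompt has to restate the rules:
messages_verbose = [
    # "Answer only in Greek. Do not use Latin characters. Now: what services do you offer?"
    {"role": "user", "content": "Απάντησε μόνο στα ελληνικά. Μην χρησιμοποιείς λατινικούς "
                                "χαρακτήρες. Τώρα: ποιες υπηρεσίες προσφέρεις;"},
]

# With a system message, the rules live in one place and the user prompt
# stays concise and objective:
messages_concise = [
    {"role": "system", "content": system_message},
    # "What services do you offer?"
    {"role": "user", "content": "Ποιες υπηρεσίες προσφέρεις;"},
]
```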
You really nailed it when you mentioned people jumping from topic to topic, especially when feeling pressured.
I will attempt to build my context with the system role exclusively, and hopefully I can force it to speak only Greek.
BTW, killing the session after a few (8 atm) attempts really helps.
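Roughly, the reset works like this (a simplified sketch; the names are just for illustration):

```python
MAX_ROUNDS = 8  # reset the session after this many user turns (current setting)

def maybe_reset(messages, system_message):
    # Count the user turns so far; once the limit is reached, drop the
    # accumulated history but keep the system message (the bot's duties).
    user_turns = sum(1 for m in messages if m["role"] == "user")
    if user_turns >= MAX_ROUNDS:
        return [{"role": "system", "content": system_message}]
    return messages
```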
There, I posted 14 tips or how-tos for dealing with the OpenAI API and the models in a general way - it may be theoretical, and too long to read. When you have some time, please check whether it could be useful for you in any way.
While reading the official OpenAI gpt-3.5-turbo API documentation for chat completions, I came across this:
“In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message.”
Is OpenAI downplaying the value of the system role, or is it truly not that good at setting context? Maybe a hybrid approach of feeding context to the model via both the system and user roles would be preferable?
The statement in the OpenAI documentation you mentioned is referring to the behavior of the specific gpt-3.5-turbo-0301 model (almost?) exclusively. It is true that this model may not pay strong attention to the system message, but it doesn’t necessarily mean that using a System message to maintain context is not helpful - it may depend on the specific use case, the nature of the conversation, and the behavior of the model being used.
As far as I know, using the system message strategically to maintain context is still valid in general, even though there may be specific models that behave differently. Different models may have different strengths and weaknesses, and experimentation may be necessary for a particular use case.
Maybe in your case a hybrid approach will be successful - but at a cost: sometimes gigantic user prompts. If you think a chatbot as you described can work with such a prompt size, then you can apply it. It’s difficult to advise about every OpenAI exception with such scarce documentation, no support, and without taking a look at your code.
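If you do want to try the hybrid route, a rough sketch could look like this (the Greek reminder text is just an example placeholder): keep the full rules in the system message and restate only the most important ones at the top of the first user message.

```python
# Hybrid sketch: the full rules stay in the system message, and the first
# user message repeats only the critical ones, for models (such as
# gpt-3.5-turbo-0301) that pay less attention to the system role.
messages = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": (
        # "Reminder: answer only in Greek, using the Greek alphabet."
        "Υπενθύμιση: απάντησε μόνο στα ελληνικά, με ελληνικό αλφάβητο.\n"
        # "Question: what services do you offer?"
        "Ερώτηση: ποιες υπηρεσίες προσφέρεις;"
    )},
]
```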
Most of what we advised is taken from our own experience on this forum. Some of us, like me, are used to compiling all suggestions, experiences, and results according to our own needs - meaning it works for me, but I am not sure if everything we compiled here works for you. Large Language Models are still a grey area; we are all beta testers.
Thanks for getting back; what they are saying really struck me as peculiar. I am gonna try to use the system role and see how it goes. I really appreciate your input; it seems like I was walking in the dark before.
You are just like everybody else here, including myself - a bit lost. As I said, we are all beta testers here, with no feedback about what we are testing.
There are a few threads I wrote about cases similar to yours, but I can’t remember them all - I suggest searching the forum. Some of us can find solutions, but we need feedback to know whether the solutions are working or not - and not many users provide that.
Sincerely, I don’t think that even the OpenAI staff have their projects under full control. We can realize that when we read the documentation - and in this AI field? Full of surprises, nothing is as expected - no, nobody is controlling this. We are all guided by probabilities - it is fuzzy logic, not the Boolean logic and computers as we used to know them.