task = "Help me summarize the article abstract I gave along with keywords"
text = "In this paper, .... "
content1 = [{"role": "system", "content": task},{"role": "user", "content": text}]
content2 = [{"role": "system", "content": ""},{"role": "user", "content": task + "\n###\n" + text + "\n###"}]
Do these two prompting methods have the same effect?
Or will GPT give more weight to my task when it is written in the system message, as in content1?
What is the mechanism behind GPT's focus on my task?
As far as I know at the moment, the user/assistant messages carry more weight than the system message, but that's not guaranteed to always be the case, as ChatML is changing/evolving/improving.
I would try both maybe a half dozen times and compare/contrast to see the difference, keeping the other parameters the same.
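If it helps, here's a minimal sketch of that comparison, assuming the pre-1.0 openai Python package (the ChatCompletion interface) and the content1/content2 lists from the question; the sample helper and its defaults are made up for illustration:

```python
import openai  # assumes OPENAI_API_KEY is set in the environment

def sample(messages, n=6, model="gpt-3.5-turbo"):
    """Run the same messages n times and collect the replies."""
    replies = []
    for _ in range(n):
        response = openai.ChatCompletion.create(
            model=model,
            messages=messages,
            temperature=1.0,  # keep sampling parameters identical for both layouts
        )
        replies.append(response["choices"][0]["message"]["content"])
    return replies

# Compare the two layouts side by side
for name, messages in [("system+user", content1), ("user-only", content2)]:
    print(f"--- {name} ---")
    for reply in sample(messages):
        print(reply)
```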
You can also try adding it this way:
SYSTEM MESSAGE
USER: Made up user example
ASSISTANT: Made up assistant example
USER: The user's input
etc., etc. Experimentation is useful for tools this new.
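For reference, that layout as an API messages list might look like the sketch below; the example strings are placeholders, not tested prompts:

```python
messages = [
    {"role": "system", "content": task},
    # a made-up one-shot example pair, showing the model the expected format
    {"role": "user", "content": "Example abstract: We propose ..."},
    {"role": "assistant", "content": "Summary: ...\nKeywords: ..."},
    # the real input goes last
    {"role": "user", "content": text},
]
```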
I find that when I enter a particular character personality into the system message, GPT becomes more oblivious to the character I gave it as the conversation gets longer and longer.
I can't quite understand the significance the "system" role is playing in my conversation; maybe I haven't used it right?
I remember a thread from someone who was trying to keep the assistant in character, and how they got better results using davinci than gpt-3.5-turbo. They later realized that message order apparently mattered: if the personality was always appended in a system message as the very last message, the model would mostly adhere to it. Maybe that helps.
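A minimal sketch of that re-appending trick, assuming history is the running list of user/assistant messages and persona is the character prompt (both names are hypothetical):

```python
def build_messages(history, persona):
    """Append the persona as the final system message on every call,
    so it always sits at the end of the context."""
    return history + [{"role": "system", "content": persona}]

# used before each request:
# response = openai.ChatCompletion.create(model="gpt-3.5-turbo",
#                                         messages=build_messages(history, persona))
```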
Although the user message is "heavier" than the system message, it's good practice to use the system message for a couple of reasons:
Reverse engineering
- A user can retrieve your database and understand your process easily by simply asking what was just said. If it's sent through the system message, GPT is more reluctant to admit that it even exists.
Good Practice
- As mentioned, they are continuously training GPT to give more power to the system message. It's better to adapt to a "weaker" version and make it work, knowing that it will continuously get better, rather than focus on "hacking" it in the user role and deal with progressively worse results.
Modularity
- It's much easier to navigate and manipulate the conversation with business logic when your injected messages can be identified by their system role, rather than bundled in with the user role.
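As a rough illustration of that last point, assuming injected instructions are the only system messages in the list (strip_injected and replace_instructions are made-up helpers):

```python
def strip_injected(messages):
    """Drop injected instructions, keeping only the real dialogue."""
    return [m for m in messages if m["role"] != "system"]

def replace_instructions(messages, new_instructions):
    """Swap in fresh instructions without touching what the user typed."""
    return [{"role": "system", "content": new_instructions}] + strip_injected(messages)
```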
I don't know how to categorize the last one, or have any proof, but GPT recognizes system messages as truthful logs and insights. Considering that we are usually roleplaying with GPT to a certain extent, having natural, structured, non-hacky roles (imo) must benefit the cohesiveness of the conversation.
I have never had any issues with using the system message.