Currently, I am running a prompt where the goal is to generate queries based on the user’s natural language question. I am using a prompt template chain from LangChain and I’m not specifying any role for this chain. I am doing it as follows:
```python
prompt_template = """
You are a SQL expert with strong knowledge of financial market terminology. Your task is to generate SQL queries based on user questions written in natural language.
…
"""
```
I need to understand whether there is any difference between specifying the 'system' role separately from the prompt I'm running and putting the instructions together in the same prompt, considering that this is not something intended to generate a conversation.
Roles are there to prevent prompt injection, so your users can't get your LLM to give outputs or run functions outside of normal operation.
You probably don’t need LangChain for this. You can pull this off with simple prompting and I definitely wouldn’t add bells and whistles like that until you feel confident in traditional prompts.
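For example, something like this plain call is usually all you need. This is only a rough sketch using the current openai Python SDK; the model name, instruction text, and example question are placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Static instructions go in the system message; the actual question goes in the user message.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whichever model you are targeting
    messages=[
        {"role": "system",
         "content": "You are a SQL expert with strong knowledge of financial market terminology. "
                    "Generate SQL queries based on user questions written in natural language."},
        {"role": "user",
         "content": "What was the average daily volume of AAPL last month?"},
    ],
)

print(response.choices[0].message.content)
```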
Thanks for the explanation! Just to confirm — is preventing prompt injection the only difference between specifying a system role and including the same instructions directly in the prompt text? Or are there other behavioral or functional differences I should be aware of?
Additionally, considering that my goal is not to generate a conversation between the user and the prompt, will adding a system role make much difference?
System scopes the context. For example, you can ask the model to “act like a Math professor” and then, when users key in their prompt, they do not have to repeat that. The output will be scoped so that it reads like something a Math professional would say: the lingo, the concepts, etc.
Just thinking aloud from my experience and understanding. Please correct me if I am wrong.
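Roughly, what I mean is something like the sketch below, using the openai Python client; the professor wording, model name, and questions are just examples. The scoping lives in the system message once, and each user prompt is just the question itself:

```python
from openai import OpenAI

client = OpenAI()

system_msg = "Act like a Math professor. Use precise terminology and explain concepts step by step."

# The system message is reused verbatim; users never have to restate it in their own prompts.
for question in ["What is a derivative?", "Why does the harmonic series diverge?"]:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system_msg},
            {"role": "user", "content": question},
        ],
    )
    print(reply.choices[0].message.content)
```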
Can I then conclude that, in the case of a prompt where there is no conversation, it makes no difference whether I set a system role beforehand or put everything into a single prompt?
Well, there is a difference. The way I see it, if users have to tell the model to “act in a certain manner” inside their own prompt, then typing that repeatedly with every prompt expends extra “input tokens”.
That’s my thoughts. I may be wrong. But at least for me, that feels logical.
Hope to hear and learn from the others.
The AI models are post-trained by OpenAI on following the chat message format. The attention and prediction are attuned to that pattern.
That means having some system message in the style of “You are ChatGPT, an AI assistant”, at a minimum.
Having an AI entity that can answer questions is a bit unnatural for a completions engine, and is its own type of “instruct” prompt engineering and tuning to make language generation behave as such.
The chat models can thus be called “chat trained”. They will work best with the identity, purpose, domain of tasks, and response output all specified in the system message, and then the user message barks the commands at it.
Neither turn-based tasks in the system message nor permanent behaviors in the user message are ideal.
Usually for data processing and non-interactive prompting, the system instructions are the things that stay the same throughout your project, e.g. instructions and output format. Then, the dynamic content you want processed goes in the user message. The intended use is usually to have both a system and a user message, so this is a common way to go about that.
You can also get away with placing your entire prompt in the system or user message. I notice that at least with earlier models, skipping the system message made the model more likely to ignore your output format, add extra dialogue, and turn into ChatGPT, which wasn’t desirable. Later models should be better about this, but you may still observe some nuance.
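For your SQL case, that split might look something like the sketch below, assuming LangChain's ChatPromptTemplate and the langchain-openai chat model wrapper; the instruction text, model name, and example question are placeholders:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Static, project-wide instructions go in the system message;
# the dynamic natural-language question is filled into the user (human) message.
prompt = ChatPromptTemplate.from_messages([
    ("system",
     "You are a SQL expert with strong knowledge of financial market terminology. "
     "Generate SQL queries based on user questions written in natural language. "
     "Return only the SQL, with no extra commentary."),
    ("human", "{question}"),
])

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)  # placeholder model name
chain = prompt | llm

result = chain.invoke({"question": "What was the closing price of AAPL yesterday?"})
print(result.content)
```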
“You are a SQL expert with strong knowledge of financial market terminology. Your task is to generate SQL queries based on user questions written in natural language.”
It’s more complicated than just getting the prompt template right.
Are you providing the answers or is the GPT supposed to search the web? Which dialect of SQL?
If you are targeting a specific dialect, then provide basic syntax the GPT can reference, baked into the manifest.
Do you have a dictionary of financial market terms that users will use? Ditto: bake it in so the GPT can reference it.
How are you going to validate that the GPT gives the correct answer?
You can give the prompt template instructions such as:
- Prioritize completeness over speed.
- Provide source citations for your answers.
Then you need some way to verify the answers. I'm using a relational database for the verification side so that I can run unit tests in which I supply prompts to the GPT for which I already know the answer.
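As a rough sketch of that kind of unit test (using sqlite3 and pytest here; generate_sql() is a hypothetical wrapper around the model call, and the schema, question, and data are made up):

```python
import sqlite3

def generate_sql(question: str) -> str:
    """Hypothetical wrapper that sends the question to the model and returns the SQL it generated."""
    raise NotImplementedError

def test_average_close():
    # A reference database with known contents, so the correct answer is known up front.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE prices (symbol TEXT, close REAL)")
    conn.executemany("INSERT INTO prices VALUES (?, ?)",
                     [("AAPL", 100.0), ("AAPL", 110.0), ("MSFT", 50.0)])

    # Run the model-generated SQL and compare it against the ground truth computed directly.
    sql = generate_sql("What is the average closing price of AAPL?")
    got = conn.execute(sql).fetchone()[0]
    expected = conn.execute("SELECT AVG(close) FROM prices WHERE symbol = 'AAPL'").fetchone()[0]
    assert abs(got - expected) < 1e-9
```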