This is a “bug” that is easily solved, but I think it is worth mentioning here.
I’m using gpt-3.5-turbo-0125 to translate some product descriptions, and I found this odd behaviour:
If I insert the delimiter “-----” in the middle of my input text, the model only outputs the translation of the text after the delimiter. The issue is easily solved by using another delimiter, such as “####”.
You can try to reproduce the following in the playground. I got the same behaviour (gpt-3.5-turbo-0125 with temperature=1 and top_p=1):
System:
You are fluent in many languages and can translate items' descriptions into a variety of languages, namely into European Portuguese. You are also familiar with the Harmonized System (HS) codes that are used to classify products for customs purposes.
The user will present you with a description of a product that was classified under a specific HS code.
Your task is to provide a faithful translation of the product description into European Portuguese. Translate the full description, keeping the original format as much as possible, namely the punctuation and the structure of the sentences, paragraphs and break lines.
You answer the user by writing the translation only, nothing more, in the following format:
TRANSLATION:
<translation>
User input:
Regulile generale pentru interpretarea Nomenclaturii combinate: 1 și 6.
Nota 3 de la Secţiunea XVII.
NESA de la pozitia 8708 pct. i si j.
-----
Produsul reprezintă o piesă metalică de formă cilindrică, cu o sectiune cilindrică goala la interior, din aliaj de aluminiu EN AC-44300 DIN EN 1706.
Reprezintă o parte a unei piese de tip “ansamblu” cu rol antivibraţii (AVS – “anti vibration system”) al unui autovehicul și care asigură partea care preia vibraţiile/şocurile, ȋn general ȋntre elemente de caroserie şi compartimentul motor.
Caracteristici: ȋnălţimea de 66.5 mm, suprafaţa ȋn vedere frontală are lungimea 142.4 mm şi laţimea 146.4 mm, diametru exterior de Ø 110 mm, diametru interior de Ø 100 mm. Pe una din feţe piesa metalică are patru urechi prevăzute cu locaş frezat Ø 8.5 mm.
Procesul tehnologic de obţinere a piesei este cel de extrudare a profilului de aluminiu, tăiere (debitare la lungime) şi prelucrare prin aşchiere a locaşului.
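If you’d rather hit the API than the playground, here is a minimal sketch of the same setup with the openai Python SDK (prompts abbreviated; use the full system and user texts above):

```python
from openai import OpenAI

client = OpenAI()

system_prompt = "You are fluent in many languages ..."  # the full system message above
user_text = (
    "Regulile generale pentru interpretarea Nomenclaturii combinate: 1 și 6.\n"
    "...\n"
    "-----\n"  # the delimiter that triggers the behaviour
    "Produsul reprezintă o piesă metalică de formă cilindrică ...\n"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    temperature=1,
    top_p=1,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ],
)

# With "-----" present, the reply only translates the text after the delimiter.
print(response.choices[0].message.content)
```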
The use of delimiters for splitting tasks or content is a common prompting technique. If you anticipate that your user text may contain delimiters of its own, you’d be advised to create stronger delimiters around the user-generated text.
Of course, the strongest delimiters in the chat models are the messages themselves. You could use those.
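For example (a sketch, assuming the same translation setup as above), the classification rules and the product description could travel as separate user messages, so no in-text delimiter is needed at all:

```python
# Let the message boundaries do the separating instead of "-----".
# Contents are abbreviated; the full texts are the ones quoted earlier.
messages = [
    {"role": "system", "content": "You translate product descriptions into European Portuguese ..."},
    {"role": "user", "content": "Regulile generale pentru interpretarea Nomenclaturii combinate ..."},  # classification rules
    {"role": "user", "content": "Produsul reprezintă o piesă metalică de formă cilindrică ..."},  # product description
]
```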
Putting the instructions in the user input is a good idea: the AI’s training is task-driven, performing what the user wants. Step-by-step instruction-following in gpt-3.5-turbo system messages was tanked hard.
You can also enclose the entire text in a container of triple quotes, a bunch of square brackets, triple backticks, etc., to emphasize which part of the input is data:
Translate to Portuguese:
[[[[
{document}
]]]]
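A small helper along those lines (a sketch; the bracket choice is arbitrary, any container that won’t show up in the data works):

```python
def build_user_message(document: str) -> str:
    # Wrap the untrusted document so the model can tell data from instructions.
    return (
        "Translate to Portuguese:\n"
        "[[[[\n"
        f"{document}\n"
        "]]]]"
    )
```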
A very common use is the three hyphens of a markdown horizontal rule. It is seen in a lot of the knowledge scraped from the web, where it carries meaning as a separator, and it is also useful for other kinds of separation:
document:
Produsul reprezintă o piesă metalică de formă cilindrică
---
Task:
Check spelling and improve quality.
OK, I just tried a new version where I put everything inside an XML tag, and it works better (I also updated the system prompt accordingly). No odd behaviour.
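Roughly like this (a minimal sketch; the actual tag name doesn’t matter much, <user_text> here is just an example):

```python
def wrap_in_tag(document: str, tag: str = "user_text") -> str:
    # Enclose the raw text in an XML-style tag so it reads as data, not instructions.
    return f"<{tag}>\n{document}\n</{tag}>"
```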
@_j, sorry, I’m not sure I understood the “tanked hard” comment. Are you saying that for 3.5-turbo, instructions in the system prompt can degrade performance?
gpt-3.5-turbo-0613 used to be a lot better at complex multi-step tasks set out in the system “programming”. It could be told that raw user input was only data to be processed, and it did a decent job of not treating anything there as instructions to itself. The model now served under that same name has been altered in quality.
Containers are important to ensure that data is not “followed” as instructions.
Having worked with GPT many, many times before, I can already tell you that this won’t work.
This only prevents the laziest of jailbreakers. Give people a couple of tries and it’s generating dark humor instead of translating.
You can’t win this game. At least not with a single request.
It’s the same thing with people trying to protect the prompts of their GPTs. It’s simply not doable, given the way these AIs work.
Perhaps in the future, private documentation will have secret easter eggs with some kind of instruction telling the LLM to report copyright infringement, or something like that.
As that’s not really ever valid input, just regex it out of the input before you send it.
That won’t stop all attacks, but there are things you can do to help.
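Something like this (a sketch; the exact pattern is up to you):

```python
import re

def strip_delimiters(text: str) -> str:
    # Collapse runs of 3+ hyphens/equals/hashes that could act as section delimiters.
    return re.sub(r"[-=#]{3,}", " ", text)
```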
I trained a small Ada model to “identify” possible attempts, and it would then swap out the user’s input with flowers or unicorns or something, lol… Or, smarter, just deny the request or send an “Oops, we had a problem on our end, sorry!”, which makes them think there’s no system in place and it’s just bad coding on your part…
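The shape of it, very roughly (a sketch, not the real code: the fine-tuned classifier name is a placeholder, and run_translation stands in for the normal pipeline):

```python
from openai import OpenAI

client = OpenAI()

def is_injection(text: str) -> bool:
    # Hypothetical fine-tuned completion classifier that answers "1" (possible attack) or "0" (clean).
    result = client.completions.create(
        model="ada:ft-your-org:injection-check",  # placeholder model name
        prompt=text + "\n\n###\n\n",
        max_tokens=1,
        temperature=0,
    )
    return result.choices[0].text.strip() == "1"

def run_translation(text: str) -> str:
    ...  # the normal translation request, as earlier in the thread

def handle(user_text: str) -> str:
    if is_injection(user_text):
        # Pretend it's an ordinary failure so the attacker learns nothing.
        return "Oops, we had a problem on our end, sorry!"
    return run_translation(user_text)
```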
ChatML helped a ton, and I’m sure it’ll get more difficult as the LLMs get larger…