I was using the Edit mode to clean up some text taken from PDFs, and noticed the result always replaced the “§” symbol with “$”. I tried a bunch of other symbols from extended ASCII, but this is the only one that’s changed. Even when explicitly including “Do not remove or replace any non-space characters” in the instruction, the result had the swap.
Is “§” a super-special character for GPT, or does this happen a lot with other characters? If it’s just this character, it’d be easy to find and fix any incorrect replacements.
Edit 2023-05-15
Here are some examples from the playground. Settings:
- Edit mode
- text-davinci-edit-001 model (from GPT-3 series)
- Temperature of 0
- No stop sequences
- Top P is 1
Input:
If an organization is reg istered as a non - profit under §§1024.57, then it i s e ligible
Instructions 1:
Remove extra spaces inside and between words
Output 1:
If an organization is registered as a non-profit under $$1024.57, then it is eligible
Instructions 2:
Do nothing
Output 2:
If an organization is reg istered as a non - profit under $$1024.57, then it i s e ligible to
Instructions 3:
Do not change any part of the input. Do not add any characters. Do not replace any characters.
IMPORTANT: Keep the "§" as a "§"
Output 3:
If an organization is reg istered as a non - profit under $$1024.57, then it i s e ligible for a tax exemption
Bonus observation: Outputs 2 and 3 show another unrequested behavior. The model wrote past the end of the input despite being told not to, appending “to” in Output 2 and “for a tax exemption” in Output 3. The input was a complete and grammatically correct sentence, but it didn’t end with a period.
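On the "find and fix" idea from the original question: if the swap really is limited to this one character, a post-processing step could compare the model output with the input and put the "§" back. Here is a rough sketch in Python using the standard library's difflib. The function name `restore_section_signs` is made up for illustration, and it assumes the only damage to undo is "§" turning into "$" in spans where the input and output line up character for character.

```python
from difflib import SequenceMatcher


def restore_section_signs(original: str, edited: str) -> str:
    """Restore any "§" from `original` that shows up as "$" in `edited`.

    Hypothetical helper for illustration, not part of any OpenAI tooling.
    """
    # Align the two strings so each changed span in `edited` is paired
    # with the span of `original` it replaced.
    matcher = SequenceMatcher(None, original, edited, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        before, after = original[i1:i2], edited[j1:j2]
        # Only touch same-length replacements, where characters pair up 1:1.
        if tag == "replace" and len(before) == len(after):
            after = "".join(
                "§" if b == "§" and a == "$" else a
                for b, a in zip(before, after)
            )
        pieces.append(after)
    return "".join(pieces)


source = "If an organization is reg istered as a non - profit under §§1024.57, then it i s e ligible"
output_1 = "If an organization is registered as a non-profit under $$1024.57, then it is eligible"
print(restore_section_signs(source, output_1))
```

Because it works from an alignment instead of a blanket replace, a "$" that was already in the input (a real dollar amount, say) is left alone. Text the model added past the end of the input is also left as-is, so that would need a separate check.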