Fine Tuning with escaped strings

I am trying to fine-tune a model, but all the fine-tuning data is written as JSON strings, for example “Hello”. What if I want to include a " character in the fine-tuning data? Is my only option to escape it inside the JSON string? Do I have to use ", and does the same apply when I want to fine-tune on line breaks and so on? gpt-3.5-turbo is a fine-tuned version of gpt-3, so OpenAI must have done it some other way, without escaping the JSON string.

You have cleverly outsmarted the forum by not putting your escaped text within backticks to make it preformatted_text.

{ "tip": "Here's the \"truth\":\nYou can and must escape quotes (\") and triples(\"\"\")" }

Does the fine-tune preparation automatically do that or do I have to do it myself?

You will have to escape quotes within strings as well as linefeeds. A quote that is not escaped in a JSON will have the effect of closing the string at that point. The text that follows would then be viewed as invalid JSON.
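If the training file is generated by a script rather than typed by hand, the escaping does not have to be done manually. Here is a minimal sketch, assuming Python and its standard json module; the reply text is a made-up example with quotes and a line break in it.

```python
import json

# Hypothetical training text containing double quotes and a line break.
reply = 'Around 384,400 kilometers.\n"Give or take a few," like that really matters.'

example = {
    "messages": [
        {"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."},
        {"role": "user", "content": "How far is the Moon from Earth?"},
        {"role": "assistant", "content": reply},
    ]
}

# json.dumps writes each quote as \" and the line break as \n,
# so the whole example stays on one physical line, as JSONL requires.
line = json.dumps(example)
print(line)

# Round trip: parsing the line gives back the original text unchanged.
assert json.loads(line)["messages"][2]["content"] == reply
```

The escapes only exist in the file. Once the line is parsed, the content is the plain text again, with real quotes and a real line break.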

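You can also paste the whole set of JSONL (JSON Lines: one JSON object per line, not a Python list) into a Python script and run it. It will produce no errors if every string is well-formed, but it will throw a syntax error on a bad string. This works when the values are all strings, as in chat training data; the JSON literals true, false, and null are not Python names.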

PS C:\Users\user\Documents\chat> .\teststrings.py
  File "C:\Users\user\Documents\chat\teststrings.py", line 3
    {"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers." Give or take a few, like that really matters."}]}
                                                                                                                                                                                                                                    ^
SyntaxError: invalid syntax
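
A stricter check than running the file as Python is to parse each line as JSON. This is only a sketch, assuming Python's standard json module; `find_bad_lines` is a hypothetical helper name and the two sample lines are made up.

```python
import json

def find_bad_lines(lines):
    """Return (line_number, message) for every line that is not valid JSON."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"{err.msg} at column {err.colno}"))
    return problems

# Made-up samples: the first escapes its inner quotes, the second does not.
good = '{"messages": [{"role": "assistant", "content": "He said \\"hi\\"."}]}'
bad = '{"messages": [{"role": "assistant", "content": "He said "hi"."}]}'

print(find_bad_lines([good, bad]))
```

To check a real file, pass an open file object instead of the list, since iterating over it yields one line at a time.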