Creating a JSONL file from a doc file

Hello, I want to create a JSONL file for my dataset. I will be using gpt-3.5-turbo, and I want to upload my file for fine-tuning. I have a doc file that contains data in this format.
Example:
{"messages": [{"role": "system", "content": "Hello, This is a Test Chatbot"},
{"role": "user", "content": "Hi, What is the capital of Germany?"},
{"role": "assistant", "content": "Berlin"}]}

{"messages": [{"role": "system", "content": "Hello, This is a Test Chatbot"},
{"role": "user", "content": "What is the capital of India"},
{"role": "assistant", "content": "New Delhi"}]}

My question is: how can I create a JSONL file from this doc file? Is there an online converter that I can use? I tried one, but it doesn't work well. Also, why do I need to convert it at all? Couldn't I just save the file with a .jsonl extension, since it already contains the data in JSON format?

Another question: the OpenAI documentation says the JSONL format looks like the example below, but that is for the babbage-002 and davinci-002 models. In other threads in this forum, I also saw people generating files in this format.
{"prompt": "", "completion": ""}

But what is the format for gpt-3.5-turbo? As far as I can tell, it is this:
{"messages": [{"role": "system", "content": "Hello, This is a Test Chatbot"},
{"role": "user", "content": "Hi, What is the capital of Germany?"},
{"role": "assistant", "content": "Berlin"}]}

Welcome to the Community!

The general structure of your data is correct. This is the official format given in the documentation, and it applies to chat completion models such as gpt-3.5-turbo:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

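In the final JSONL file, every training example must occupy exactly one line, because each line is read as a separate JSON object. Currently, each of your examples is spread over several lines, with a blank line in between. This will cause issues, so you need to remove these line breaks before saving the data as a JSONL file. This is also why simply renaming the file to .jsonl is not enough.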
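If there are too many examples to fix by hand, the line-break removal described above can be scripted. The following is a minimal Python sketch, not an official tool. It assumes the contents of the doc file have first been copied or exported as plain text. The function name `text_to_jsonl` and the quote-replacement step are my own additions: word processors often substitute typographic quotes for straight ones, and those are not valid JSON delimiters.

```python
import json

# Assumption: typographic quotes in the text are stand-ins for straight quotes.
# This blanket replacement would break examples whose content legitimately
# contains curly double quotes inside a string.
QUOTE_MAP = str.maketrans({
    "\u201c": '"', "\u201d": '"',   # curly double quotes
    "\u2018": "'", "\u2019": "'",   # curly single quotes / apostrophes
})


def text_to_jsonl(text: str) -> str:
    """Parse consecutive JSON objects from text; return one object per line."""
    text = text.translate(QUOTE_MAP)
    decoder = json.JSONDecoder()
    lines = []
    pos = 0
    while pos < len(text):
        # Skip whitespace (including blank lines) between objects.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode parses one JSON value starting at pos and returns
        # the value plus the index just past it, ignoring inner line breaks.
        obj, pos = decoder.raw_decode(text, pos)
        # json.dumps emits a single line by default.
        lines.append(json.dumps(obj, ensure_ascii=False))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = (
        '{"messages": [{"role": "user", "content": "What is the capital of India"},\n'
        '{"role": "assistant", "content": "New Delhi"}]}\n'
    )
    print(text_to_jsonl(sample), end="")
```

To use it on a real file, read the exported text into a string, pass it to `text_to_jsonl`, and write the result to a file with the .jsonl extension. If any example is malformed, `raw_decode` raises `json.JSONDecodeError`, which points you to the position that needs fixing.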

To create the JSONL file itself, you can use a code editor such as Visual Studio Code. Once the line breaks are removed, paste the data into a new file and save it with the .jsonl extension.
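Before uploading, it can help to confirm that the saved file really has one valid example per line. Here is a rough Python sketch of such a check. The function name `check_jsonl` and the specific rules are assumptions based only on the format shown above (a `messages` list whose entries have a `role` of system, user, or assistant and a string `content`). It does not replicate whatever validation OpenAI performs on upload.

```python
import json

# Roles that appear in the example format above.
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_jsonl(text: str) -> list:
    """Return a list of human-readable problems found in JSONL text."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            problems.append(f"line {number}: blank line")
            continue
        try:
            example = json.loads(line)
        except json.JSONDecodeError as err:
            # Typical symptom of an example that is split across lines.
            problems.append(f"line {number}: not valid JSON ({err.msg})")
            continue
        messages = example.get("messages") if isinstance(example, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append(f"line {number}: missing or empty 'messages' list")
            continue
        for message in messages:
            role = message.get("role") if isinstance(message, dict) else None
            content = message.get("content") if isinstance(message, dict) else None
            if not (isinstance(role, str) and role in ALLOWED_ROLES):
                problems.append(f"line {number}: unexpected role {role!r}")
            if not isinstance(content, str):
                problems.append(f"line {number}: content is not a string")
    return problems
```

An empty list means every line passed these basic checks. A message such as "not valid JSON" on two consecutive lines usually means an example still contains a line break.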