Hello, I want to create a JSONL file for my dataset. I will be using gpt-3.5-turbo. and I want to upload my file for fine tuning. I have a doc file which contains data in this format -
Example -
{“messages”: [{“role”: “system”, “content”: “Hello, This is a Test Chatbot”},
{“role”: “user”, “content”: “Hi, What is the capital of Germany?”},
{“role”: “assistant”, “content”: “Berlin”}]}
{“messages”: [{“role”: “system”, “content”: “Hello, This is a Test Chatbot”},
{“role”: “user”, “content”: “What is the capital of India”},
{“role”: “assistant”, “content”: “New Delhi”}]}
My question is, how can I create JSONL file from this doc file. Is there any online converter that I can use?I tried one but it doesnt work well. Also, Why do I need to convert it? I can also just save it as jsonl extension because it already contains the data in json format.
Another question is, in the OpenAI documentation, It says the JSONL format is like below but its for babbage-002
and davinci-002
models. Also in other chats in this forum I saw other people are generating file in this format.
{“prompt”: “”, “completion”: “”}
but what is the format for gpt-3.5-turbo? In my opinion, this is the format for gpt-3.5-turbo
**{“messages”: [{“role”: “system”, “content”: “Hello, This is a Test Chatbot”}, **
**{“role”: “user”, “content”: “Hi, What is the capital of Germany?”}, **
{“role”: “assistant”, “content”: “Berlin”}]}