Can someone help me (with fine-tuning)

mattrosine · November 10, 2023, 1:44am

I’ve had no success fine-tuning a model since the outage. Right now I’m trying the fine-tuning web interface again but when I load both the training and validation files, I get this error in red:

There was an error uploading the file: Unexpected file format, expected either prompt/completion pairs or chat messages.

I am using a correctly formatted and prepared JSONL file, so why is it saying it expected prompt/completion pairs or chat messages? That's not how the newest OpenAI documentation says data should be prepared, which is with messages using the system/user/assistant roles.

Please, for the love of god, can someone help me. I’ve now spent two days on this.

boyko11 · November 10, 2023, 2:45pm

Just guessing here, since I just started reading the docs, but I remembered this piece: "The conversational chat format is required to fine-tune gpt-3.5-turbo. For babbage-002 and davinci-002, you can follow the prompt completion pair format used for legacy fine-tuning". Maybe it's the type of model?
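To make the distinction in that quote concrete, here is a minimal sketch that classifies a single JSONL line as chat format or prompt/completion format. The function name `detect_format` is hypothetical, and the checks are only an approximation of what the uploader might look for, not its actual validation logic. It assumes the legacy format uses lowercase `prompt` and `completion` keys, as in the documented examples.

```python
import json


def detect_format(line: str) -> str:
    """Classify one JSONL line as 'chat', 'prompt_completion', or 'unknown'."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return "unknown"
    if not isinstance(record, dict):
        return "unknown"

    # Chat format: a non-empty "messages" list of {"role": ..., "content": ...} objects.
    messages = record.get("messages")
    if isinstance(messages, list) and messages and all(
        isinstance(m, dict) and "role" in m and "content" in m for m in messages
    ):
        return "chat"

    # Legacy format: string "prompt" and "completion" keys (lowercase is assumed).
    if isinstance(record.get("prompt"), str) and isinstance(record.get("completion"), str):
        return "prompt_completion"

    return "unknown"
```

Running this over every line of a training file and checking that all lines return the same, expected label is a quick way to see whether the file matches the format required by the chosen model.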

mattrosine · November 13, 2023, 3:32pm

I’m fine-tuning gpt-3.5-turbo and, according to the documentation, the data is supposed to be in this format, which is what I’ve done:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

I cannot get any help on this anywhere.

Foxalabs · November 13, 2023, 3:38pm

mattrosine:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "Who wrote 'Romeo and Juliet'?"}, {"role": "assistant", "content": "Oh, just some guy named William Shakespeare. Ever heard of him?"}]}
{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "How far is the Moon from Earth?"}, {"role": "assistant", "content": "Around 384,400 kilometers. Give or take a few, like that really matters."}]}

What version of the OpenAI api library are you using?

nicole.silverman · November 21, 2023, 6:39pm

Mattrosine

Have you figured it out yet? If so please tell me and others in the future how to fix this issue.

a1045684012 · December 8, 2023, 4:29am

Perhaps you could share some samples of the dataset for everyone to see.

elleoleonvalencia · December 15, 2023, 4:47am

You can't use several conversations in the same file. Try one conversation per file, for example:

{"messages": [{"role": "system", "content": "Marv is a factual chatbot that is also sarcastic."}, {"role": "user", "content": "What's the capital of France?"}, {"role": "assistant", "content": "Paris, as if everyone doesn't know that already."}]}

in one file.

edward4 · February 25, 2024, 1:28pm

Here are 3 rows in a JSONL file using the Prompt/Completion pair format, but it just will not accept my file. I even generated a simple sample file from ChatGPT and it won’t take.

{“Prompt”:“Compare deforestation trends over the past decade.”,“Completion”:“The answer is {A}”}
{“Prompt”:“Explore flexible pricing models for emerging businesses.”,“Completion”:“The answer is {B}”}
{“Prompt”:“Showcase success stories from businesses that have subscribed to GreenAnt.”,“Completion”:“The answer is {C}”}
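As rendered above, the rows contain typographic quotes (“ ”) and capitalized keys. The quotes may have been introduced by the forum software rather than being in the actual file, so this is only a sketch of how one could check. The helper `diagnose_line` is hypothetical: it reports typographic quotes, JSON parse failures, and keys that are not lowercase. Whether the uploader is case-sensitive about key names is an assumption; the documented legacy examples use lowercase `prompt` and `completion`.

```python
import json

# Typographic quote characters that a strict JSON parser does not accept as delimiters.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"


def diagnose_line(line: str) -> list[str]:
    """Return a list of likely problems found in one JSONL line."""
    problems = []
    if any(ch in SMART_QUOTES for ch in line):
        problems.append("contains typographic quotes")
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        problems.append(f"not valid JSON: {exc.msg}")
        return problems
    if isinstance(record, dict):
        for key in record:
            # Assumption: the expected keys are all lowercase.
            if key != key.lower():
                problems.append(f"key {key!r} is not lowercase")
    return problems
```

An empty list means none of these particular problems were found; it does not guarantee the file will be accepted.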

jr.2509 · February 25, 2024, 1:54pm

Hi @edward4 - which model are you trying to fine-tune?

romaticbb · April 5, 2024, 6:35pm

Hello, I am trying to create a JSON file with rows expressing
name of cinema name
name of movie movie
time of Shows times
and so on, one per row.
When I try to upload it I get “There was an error uploading the file: Unexpected file format, expected either prompt/completion pairs or chat messages.” Does anybody know why, and what I have to do?
Is there a tool to validate the JSON file? It was generated with the help of an AI, and I don't know why it is not working for fine-tuning.

jr.2509 · April 5, 2024, 8:09pm

Hi - could you share an actual example of your training data set? Have you ensured that the structure is consistent with the example provided here?

romaticbb · April 5, 2024, 8:15pm

I have the same problem. When uploading a training file I get an error about how the file is formatted, but I did it as explained, one per row. It is a very simple file; essentially it is:

cine name name of cinema
show name of movie
price price
and so on

jr.2509 · April 5, 2024, 8:17pm

As indicated, it would be easiest to troubleshoot if you provided an actual example rather than just the logic.

Is your assistant message/output a JSON object?

kzainul22 · April 6, 2024, 10:32am

@mattrosine It should be in .jsonl format. Check for line breaks within a single object; each object must sit on one line. You must also have at least 10 examples for fine-tuning.
Note: only files in .jsonl format are allowed for now.
https://platform.openai.com/docs/api-reference/fine-tuning/create#fine-tuning-create-training_file
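The two checks mentioned in this post (one JSON object per line, and at least 10 examples) can be sketched as a small pre-upload check. The function name `check_jsonl` and the error messages are hypothetical, and treating blank lines as errors is a strictness assumption rather than documented behavior.

```python
import json

MIN_EXAMPLES = 10  # minimum number of examples mentioned in the post above


def check_jsonl(text: str) -> list[str]:
    """Return a list of problems found in the contents of a JSONL file."""
    errors = []
    count = 0
    # Split on "\n" only, so the check mirrors the one-object-per-line rule.
    for number, line in enumerate(text.rstrip("\n").split("\n"), start=1):
        if not line.strip():
            errors.append(f"line {number}: blank line")
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            # An object broken across several lines ends up here, once per fragment.
            errors.append(f"line {number}: {exc.msg}")
            continue
        if not isinstance(record, dict):
            errors.append(f"line {number}: expected a JSON object")
            continue
        count += 1
    if count < MIN_EXAMPLES:
        errors.append(f"only {count} valid examples, need at least {MIN_EXAMPLES}")
    return errors
```

For a file on disk, the contents could be passed in with something like `check_jsonl(open("train.jsonl", encoding="utf-8").read())`, where `train.jsonl` is a placeholder name. A pretty-printed JSON file (one object spread over many lines) fails this check on every line, which matches the line-break problem described above.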
