Fine-tuning API for a domain-specific use case with slightly larger prompts

I am planning to fine-tune a base gpt-3.5 model for the following research-based use case. My prompts and completions can get long, and it is not a Q&A-based fine-tuning use case.

A sample from my data might look like the following:

```python
data = [{
    "prompt": "Write a detailed plan using the following information: {information about a person, e.g. age, likes and dislikes, plus more depending on availability}",
    "completion": "A human-expert-generated plan tailored specifically to meet the requirements in the provided prompt."
}]
```
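Since the prompts and completions can get long, I would also want to check per-example token counts against the model's context limit before training. A minimal sketch using tiktoken, assuming `data` is the list above:

```python
# Minimal sketch: count tokens per training example with tiktoken.
# The allowable per-example limit depends on the model snapshot you fine-tune.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

for i, example in enumerate(data):
    n_tokens = len(enc.encode(example["prompt"])) + len(enc.encode(example["completion"]))
    print(f"example {i}: {n_tokens} tokens")
```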

My question is about the training set size. As an ML engineer, I know that more data is highly desirable for better performance. But in my use case I am not able to collect and capture all of the variation in the data that would be used for fine-tuning. I might be able to get 200 to 300 samples, but those would be specific to certain age groups, likes, dislikes, etc. Capturing all of the variation in the data is impractical for now, and I am afraid the model might not generalize well to other age groups and person-specific attributes.
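One way I am thinking of probing this generalization worry before collecting more data: hold out entire attribute groups (e.g. one age band) from training and evaluate on them. A minimal sketch, where `age_group` is a hypothetical field I would attach to each record:

```python
# Minimal sketch: hold out whole attribute groups to test generalization
# to unseen person-specific attributes. `age_group` is a hypothetical field;
# substitute whatever attributes you actually track per record.
def split_by_group(records, held_out_groups):
    train, held_out = [], []
    for r in records:
        (held_out if r["age_group"] in held_out_groups else train).append(r)
    return train, held_out

train_set, eval_set = split_by_group(data, held_out_groups={"18-25"})
print(len(train_set), "train /", len(eval_set), "held-out")
```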

Before I proceed to experiment with fine-tuning the model, I would love to hear if someone has achieved something similar. Any intuition would be very helpful in guiding my data collection. Thanks in advance, and I look forward to hearing some good suggestions from developers. :slight_smile:

P.S.: The data above is shown in the prompt/completion format, but I would definitely use the Chat Completions format for fine-tuning.
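For reference, here is a sketch of how I would convert the records above into chat-format JSONL and start a job with the official `openai` Python SDK (v1+). The system message wording is my own assumption:

```python
# Sketch: convert prompt/completion records to chat-format JSONL and
# launch a fine-tuning job. The system message text is an assumption.
import json
from openai import OpenAI

with open("training_data.jsonl", "w") as f:
    for ex in data:
        record = {
            "messages": [
                {"role": "system", "content": "You are an expert planner."},  # assumed wording
                {"role": "user", "content": ex["prompt"]},
                {"role": "assistant", "content": ex["completion"]},
            ]
        }
        f.write(json.dumps(record) + "\n")

client = OpenAI()
upload = client.files.create(file=open("training_data.jsonl", "rb"), purpose="fine-tune")
job = client.fine_tuning.jobs.create(training_file=upload.id, model="gpt-3.5-turbo")
print(job.id)
```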