Under the section “Preparing Your Dataset,” subsection “Uploading a training file,” the code shown is “Create a fine-tuning job with DPO.” This seems to be an error: the snippet largely repeats code that appears in a later section, and code showing how to actually upload a file would make more sense here.
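Something like the following minimal upload sketch is presumably what that subsection should show instead (the filename marv_training.jsonl is just a placeholder); the file ID it returns is what a later job-creation call takes as training_file:

from openai import OpenAI

client = OpenAI()

# Upload the prepared JSONL so the fine-tuning endpoint can reference it by ID
upload = client.files.create(
    file=open("marv_training.jsonl", "rb"),  # placeholder local path
    purpose="fine-tune",
)
print(upload.id)  # "file-..." -- pass this as training_file when creating the job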
Agreed. Both the Python and Node snippets there show dpo parameters that have not been mentioned anywhere earlier in the document, and training on the “Marv” examples (a system message that already does the job on its own) doesn’t naturally lead into that code sample.
One also discovers that the top-level hyperparameters argument is now deprecated; hyperparameters are instead nested inside the method parameter.
The API reference demonstrates code for both.
Straightforward examples of both would be better for the documentation, spelling out all of the hyperparameters and their defaults. Such as:
from openai import OpenAI
client = OpenAI()
job_response = client.fine_tuning.jobs.create(
    training_file="file-my_jsonl_id",
    validation_file="file-my_held_out_set_id",  # optional
    model="gpt-4o-mini",
    suffix="marv2",  # optional
    seed=123456,
    method={
        "type": "supervised",  # vs "dpo", direct preference optimization
        "supervised": {
            "hyperparameters": {
                "n_epochs": 2,  # number of cycles through the file; a cost multiplier
                "learning_rate_multiplier": 0.9,  # scalar away from the default 1.0
                "batch_size": 10,  # examples per batch
            }
        }
    },
)
job_id = job_response.id
job_status = job_response.status
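And for the dpo side, a parallel sketch with the same call shape. The beta hyperparameter is specific to DPO, the training file has to be in the preference-pair format rather than the chat format used above, and the exact defaults and model availability here are assumptions to verify against the API reference:

from openai import OpenAI

client = OpenAI()

dpo_job = client.fine_tuning.jobs.create(
    training_file="file-my_preference_pairs_id",  # preference-format JSONL, not the chat format above
    model="gpt-4o-mini",   # check the docs for which model snapshots accept DPO
    suffix="marv2-dpo",    # optional
    method={
        "type": "dpo",
        "dpo": {
            "hyperparameters": {
                "beta": 0.1,  # DPO-only: weight of the penalty between policy and reference model, or "auto"
                "n_epochs": 2,
                "learning_rate_multiplier": 0.9,
                "batch_size": 10,
            }
        }
    },
)
print(dpo_job.id, dpo_job.status)

In either case, the returned job can then be polled with client.fine_tuning.jobs.retrieve(job_id) until its status reaches succeeded (or failed).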