Evaluating a finetuned model

Hello,

I finetuned my first ever model and now I need to test it against the test dataset I prepared, but I have zero idea how. Could someone point me in the right direction?

Hi,

how do you use the API in general?

In your interface, you should have the list of your finetuned models. Each model will have a unique name. Just copy-paste it and tell the API to use it. As an example, in Python, you would call it like this:

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="ft:gpt-3.5-turbo:my-org:custom_suffix:id",  # the name of your finetuned model
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
)

As you can see, it is sufficient to pass the finetuned model's name as ‘model’ in your request.

By the way, that code was taken from the link at the end. Be sure to read it:
https://platform.openai.com/docs/guides/fine-tuning/use-a-fine-tuned-model

Sorry if I was not clear. I meant scores like:

Test Set            Dev Set
weighted Acc (%)    weighted Acc (%)

Is there a way to obtain scores like those? AFAIK it used to be supported by wandb, but it doesn't seem to be anymore.

Oh, I get what you’re asking.

If you want to do it with new data, you have to devise a test: a way to tell whether an output from your model is right or wrong (usually by knowing what the output should be in known cases).
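For example, here is a minimal sketch of such a test, assuming your test set is a JSONL file in the chat fine-tuning format and that an exact match against the reference assistant message counts as correct. The file name, model name, and matching criterion are all placeholders you would adapt to your own task:

import json
from openai import OpenAI

client = OpenAI()

correct = 0
total = 0
with open("test.jsonl") as f:  # hypothetical path to your prepared test set
  for line in f:
    example = json.loads(line)
    messages = example["messages"]
    expected = messages[-1]["content"]  # the reference assistant answer
    response = client.chat.completions.create(
      model="ft:gpt-3.5-turbo:my-org:custom_suffix:id",  # your finetuned model
      messages=messages[:-1]  # the prompt, without the reference answer
    )
    predicted = response.choices[0].message.content
    correct += int(predicted.strip() == expected.strip())
    total += 1

print(f"Accuracy: {correct / total:.2%} on {total} examples")

Exact match only really makes sense for classification-style outputs; for free-form answers you would swap in whatever scoring fits your task.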

If I remember correctly, while finetuning your model (and also after), you can follow the accuracy on the train set from the same page of the OpenAI website. In the fine-tuning interface, when you select a model you can track its training loss. If you also uploaded a test/validation set, you can see the loss on that set as well. Unfortunately I don't know how OpenAI defines its loss functions, so I can't help you with reproducing that.
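If you want those same numbers programmatically rather than from the website, a rough sketch along these lines should work. The job id is a placeholder, and the exact column names in the results file are whatever OpenAI writes there, so inspect the CSV before relying on any particular field:

from openai import OpenAI

client = OpenAI()

# Fetch the fine-tuning job and its result file(s) with the per-step metrics.
job = client.fine_tuning.jobs.retrieve("ftjob-abc123")  # replace with your job id
for file_id in job.result_files:
  metrics_csv = client.files.content(file_id).text
  print(metrics_csv[:500])  # CSV with step-level train/validation loss columns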

If you mean uploading a new dataset through the fine-tuning interface and getting the loss on that dataset, while it would be an interesting feature, I'm afraid it isn't possible with the interface we have today.