How to see the contents of OpenAI fine-tuned model results in Python using the OpenAI API

I have fine-tuned a GPT-3 model for classification in Python. The model status still shows "pending". However, the model events show "Fine-tune succeeded", and I even get an event showing the uploaded file ID. When I download the results file using the following code and print its content, I see the following result:

content = openai.File.download("file-coumSxxxxxxxxxxxxxxxx")
contents = content.decode()


# Print the contents of the file
print(content)

Results:

b'{\n  "object": "file",\n  "id": "file-coumSxMyxxxxxxxxxxxxxxxx",\n  "purpose": "fine-tune-results",\n  "filename": "compiled_results.csv",\n  "bytes": 168364,\n  "created_at": 1673135506,\n  "status": "processed",\n  "status_details": null\n}\n'

Can somebody recommend how I can see the contents of the compiled_results.csv file?

I am also stuck on the same problem. I have gone through the OpenAI documentation twice and followed all the links related to the topic. I also tried ChatGPT, with no answers. But I am sure that one month ago openai.File.download() downloaded the fine-tuning results directly. OpenAI might have recently made some changes to that API.

Hi,
I think I found the answer. As far as I know, there are a few ways to get the fine-tune results CSV: 1. the Python function (like the one you used), 2. Bash commands, 3. direct API calls.
Using the Python function and the Bash commands, I was not able to download the CSV file. Then I used a direct API call and it worked.
I use Postman for APIs: GET https://api.openai.com/v1/files/file-xxxxxxxx/content.
Put your auth in the headers. Hope it works for you.

Hello, to get the compiled_results.csv file I used a GET request in Postman. For me, it's the best and easiest way to get the file.

Use the 'https://api.openai.com/v1/files/file-xxxxxxxx/content' endpoint.
In place of 'file-xxxxxxxx', put the ID of your fine-tune's results file, which can be obtained through this command (in Colab, a Jupyter notebook, or Bash):

!openai api fine_tunes.results -i <model_fine_tuned_name>

For auth: take your API key and place it in the Postman authorization tab at Authorization → Bearer Token → Token.
After sending the request you will see the raw file in the body of the response; to save it on your machine, just click 'Save Response'.
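For anyone who prefers a script to Postman, the same GET request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_content_request and download_file_content are made up for illustration, and the file ID is the same placeholder used above.

```python
import urllib.request

API_BASE = "https://api.openai.com/v1"


def build_content_request(file_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a file's raw content."""
    url = f"{API_BASE}/files/{file_id}/content"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def download_file_content(file_id: str, api_key: str, out_path: str) -> int:
    """Send the request and save the response body; returns the bytes written."""
    request = build_content_request(file_id, api_key)
    with urllib.request.urlopen(request) as response:
        body = response.read()
    with open(out_path, "wb") as f:
        f.write(body)
    return len(body)


# Example call (hypothetical placeholder ID; substitute your own results file ID):
# import os
# download_file_content("file-xxxxxxxx", os.environ["OPENAI_API_KEY"], "compiled_results.csv")
```

The request is built separately from the download so the URL and header can be inspected without touching the network.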


Hi, as of July 2nd, I can use your code to retrieve the content.

content = openai.File.download("file-coumSxxxxxxxxxxxxxxxx")
ans = content.decode()
with open('results.csv', 'w') as f:
    f.write(ans)

It works fine.

How do I see the content with the newer version of the API?

content = client.files.content(result_file_id)

<openai._legacy_response.HttpxBinaryResponseContent object at 0x000002AA9FA7E960>

You can use one of the methods of the return object:

file = client.files.content("file-8WxbhkRXPaPKsQNw4EwdQ3be")
file.stream_to_file("myfilename.txt")

If you want to dump out binary bytes, you can use file.content, or file.text for a string.

You can only retrieve file contents on results files, such as those with purpose=‘assistants_output’. A list file call can get the file name the AI created.

Thanks for your reply. But the content I got looks like this:

c3RlcCx0cmFpbl9sb3NzLHRyYWluX2FjY3VyYWN5LHZhbGlkX2xvc3MsdmFsaWRfbWVhbl90b2tlbl9hY2N1cmFjeQoxLDEuMDYxMDgsMC43NjE1OSwsCjIsMC41MDA5MiwwLjg3MjA1LCwKMywwLjkwODEzLDAuNzY5NDgsLAo0LDAuNzAzMDMsMC44MTA3MywsCjUsMC42MjIwNywwLjg0MDYyLCwKNiwwLjg0MTc3LDAuODA3NDMsLAo3LDAuNjA4MTQsMC44NDgzLCwKOCwwLjcyMDgzLDAuODQ4OTksLAo5LDAuNDg4OTEsMC44NzgwNSwsCjEwLDAuNzIzMzMsMC43OCwsCjExLDAuNTczNDksMC44NDc3NSwsCjEyLDAuNTcwOSwwLjgyODEyLCwKMTMsMC40ODExOSwwLjg3MDc1LCwKMTQsMC41MjY5OCwwLjg0NzY4LCwKMTUsMC41NDI2LDAuODQwODQsLAoxNiwwLjU1MTQ5LDAuODM2NzMsLAoxNywwLjQ5MTA2LDAuODcwMzEsLAoxOCwwLjU0NjU1LDAuODMyNzYsLAoxOSwwLjYxMTgsMC44Mzc5MiwsCjIwLDAuNjAyMjEsMC44MzA1NiwsCjIxLDAuNTExOTIsMC44NDczMSwsCjIyLDAuNTM2MTIsMC44NTYyMywsCjIzLDAuNTY0NTIsMC44MjkxMSwsCjI0LDAuNjU3MDcsMC44MjA4NSwsCjI1LDAuNTY3ODcsMC44NDE2MSwsCjI2LDAuNTQ1MTgsMC44NDA1MywsCjI3LDAuNjg4MjgsMC43OTg3LCwKMjgsMC43MDc1OCwwLjgwNDY0LCwKMjksMC40NTk1OCwwLjg4MjM1LCwKMzAsMC41MDkzMywwLjg1NDQzLCwKMzEsMC4zNjkzOCwwLjkyMDczLCwKMzIsMC40OTEzMiwwLjg0MjI3LCwKMzMsMC40NjIyMiwwLjg2NDg2LCwKMzQsMC41NTU4NSwwLjg1OTMzLCwKMzUsMC4zNzk1MywwLjg5MTE2LCwKMzYsMC40MjcyOSwwLjg5NDA0LCwKMzcsMC40MzA2NiwwLjg4ODU0LCwKMzgsMC40MTg0NCwwLjg5MjIyLCwKMzksMC41NjAyNiwwLjgyNjY3LCwKNDAsMC40NzI4NSwwLjg3MjIsLAo0MSwwLjQyOTAzLDAuODkzNzUsLAo0MiwwLjMzNzQsMC45MzYwMywsCjQzLDAuNTkxNDEsMC44MzA2MiwsCjQ0LDAuNTkwNDIsMC44NzE2MiwsCjQ1LDAuNTU3MjEsMC44NzI0OCwsCjQ2LDAuNDQ2NzksMC44ODczNywsCjQ3LDAuNTA4OTcsMC44NTQwNCwsCjQ4LDAuNDAxOTksMC45MDEwMiwsCjQ5LDAuNDQ0NjksMC44ODA5NSwsCjUwLDAuNDMyOTcsMC44ODQzNywsCg==

It is not a meaningful training-loss file.

Here is my complete code after fine-tuning:
from openai import OpenAI
client = OpenAI()

job_id = 'ftjob-b9S1AK4BHhBCYJv0I4LBZauM'
job = client.fine_tuning.jobs.retrieve(job_id)

# Print the job events, oldest first
response = client.fine_tuning.jobs.list_events(job_id)
events = response.data
events.reverse()
for event in events:
    print(event.message)

# Look up the first results file attached to the job
result_file = job.result_files[0]
file = client.files.retrieve(result_file)
print(file)

# Download the file content and save it to disk
content = client.files.content(file.id)
content.stream_to_file('myfilename.txt')

It has meaning, just not one obvious to those that don’t recognize the pattern:

step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy
1,1.06108,0.76159,,
2,0.50092,0.87205,,
3,0.90813,0.76948,,
4,0.70303,0.81073,,
5,0.62207,0.84062,,
6,0.84177,0.80743,,
7,0.60814,0.8483,,
8,0.72083,0.84899,,
9,0.48891,0.87805,,
10,0.72333,0.78,,
11,0.57349,0.84775,,
12,0.5709,0.82812,,
13,0.48119,0.87075,,
14,0.52698,0.84768,,
15,0.5426,0.84084,,
16,0.55149,0.83673,,
17,0.49106,0.87031,,
18,0.54655,0.83276,,
19,0.6118,0.83792,,
20,0.60221,0.83056,,
21,0.51192,0.84731,,
22,0.53612,0.85623,,
23,0.56452,0.82911,,
24,0.65707,0.82085,,
25,0.56787,0.84161,,
26,0.54518,0.84053,,
27,0.68828,0.7987,,
28,0.70758,0.80464,,
29,0.45958,0.88235,,
30,0.50933,0.85443,,
31,0.36938,0.92073,,
32,0.49132,0.84227,,
33,0.46222,0.86486,,
34,0.55585,0.85933,,
35,0.37953,0.89116,,
36,0.42729,0.89404,,
37,0.43066,0.88854,,
38,0.41844,0.89222,,
39,0.56026,0.82667,,
40,0.47285,0.8722,,
41,0.42903,0.89375,,
42,0.3374,0.93603,,
43,0.59141,0.83062,,
44,0.59042,0.87162,,
45,0.55721,0.87248,,
46,0.44679,0.88737,,
47,0.50897,0.85404,,
48,0.40199,0.90102,,
49,0.44469,0.88095,,
50,0.43297,0.88437,,

The method you used returned a base64-encoded response.

Once decoded, it can be used to obtain more visual data.

Regular pointed me in the right direction. I was stuck for so long. The output is in base64 format and you need to decode it:

import base64

fine_tune_results = client.fine_tuning.jobs.retrieve(fine_tuning_job.id).result_files
result_file = client.files.retrieve(fine_tune_results[0])
content = client.files.content(result_file.id)
# Decode the base64 payload into the raw CSV bytes
decoded = base64.b64decode(content.text.encode("utf-8"))

# To save the content to a file
with open("result.csv", "wb") as f:
    f.write(decoded)

You can also get the output using a GET request:
import os
import requests

my_openai_key = os.environ.get("OPENAI_API_KEY")
auth_headers = {
    "Authorization": f"Bearer {my_openai_key}"
}
# 'URL' stands for the file content endpoint,
# e.g. https://api.openai.com/v1/files/file-xxxxxxxx/content
x = requests.get('URL', headers=auth_headers)
print(x.status_code)

I hope it works for everyone!! Just decode that base64 output!
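Since earlier posts in this thread got plain CSV back and later ones got base64, a small helper that accepts either may save some breakage. This is only a sketch under that assumption: decode_results is a hypothetical name, and it relies on the fact that plain CSV contains commas and dots, which are outside the base64 alphabet.

```python
import base64
import binascii


def decode_results(raw: bytes) -> str:
    """Return CSV text whether `raw` is base64-encoded or already plain CSV."""
    # Drop whitespace and line breaks that copy/paste may have introduced
    compact = b"".join(raw.split())
    try:
        # validate=True rejects characters outside the base64 alphabet,
        # so plain CSV (commas, dots) falls through to the except branch
        return base64.b64decode(compact, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError):
        return raw.decode("utf-8")
```

With the new client, something like decode_results(content.content) should work on the bytes returned by client.files.content(...), but treat that as untested against the live API.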


When was this change implemented? We had built a bunch of things on top of it, and all of a sudden things broke.

Hello, I downloaded the file manually from OpenAI, and when I try to read it I get the same kind of output:
Decoded DataFrame:
Empty DataFrame
Columns: [c3RlcCx0cmFpbl9sb3NzLHRyYWluX2FjY3VyYWN5LHZhbGlkX2xvc3MsdmFsaWRfbWVhbl90b2tlbl9hY2N1cmFjeSx0cmFpbl9tZWFuX3Jld2FyZCxmdWxsX3ZhbGlkYXRpb25fbWVhbl9yZXdhcmQKMSwyLjE0NzQxLDAuNTE1MDksLCwsCjIsNC4yNjI0NiwwLjQ1MTYxLCwsLAozLDEuNDM1NTEsMC43Mzg5NywsLCwKNCwxLjkzODI2LDAuNTQyMTQsLCwsCjUsMi43MjA4OSwwLjU3NDA3LCwsLAo2LDIuMDk0ODMsMC40Nzc3NiwsLCwKNywyLjE3ODc2LDAuNDk1MjUsLCwsCjgsNC42NDU1MywwLjUxODUyLCwsLAo5LDEuOTE1MTEsMC41NjEyMiwsLCwKMTAsMi41NTk2NiwwLjUyMTc0LDEuOTUwNzgsMC41ODY2NywsCjExLDIuMDc3MjUsMC40OTkwOSwsLCwKMTIsMi40NTgxNywwLjQ4MTAxLCwsLAoxMywyLjA4MDk5LDAuNTI3NDcsLCwsCjE0LDIuNDg5NzIsMC41MTIyLCwsLAoxNSwyLjE0Mjk5LDAuNTIyNzMsLCwsCjE2LDEuNzg3NzcsMC41MjYzMiwsLCwKMTcsMS42MTE3LDAuNTkyMDksLCwsCjE4LDEuOTc1MDksMC41Mzc5MywsLCwKMTksMi4wNzk4NCwwLjU3MTQzLCwsLAoyMCwxLjcyNjg0LDAuNTc4NjIsMS43NjQxLDAuNTI1NzEsLAoyMSwxLjk3NjEyLDAuNTI0MzEsLCwsCjIyLDEuNjE2NjEsMC42MDg4OSwsLCwKMjMsMS42NjQ4NSwwLjU1LCwsLAoyNCwxLjk2NjE3LDAuNTE2MTMsLCwsCjI1LDEuNjY2MTQsMC41NTA4NSwsLCwKMjYsMS45NDI1NiwwLjU0NjMsLCwsCjI3LDEuNzU0MjIsMC41ODQ0NCwsLCwKMjgsMS45MzI3LDAuNTM1MzMsLCwsCjI5LDIuMDk4NzUsMC40NDY0MywsLCwKMzAsMS43OTM0NSwwLjUxOTIzLDIuMTExNDEsMC40NzMxMiwsCjMxLDIuNjI0MjYsMC40Mjg1NywsLCwKMzIsMS40MjY1LDAuNTc1LCwsLAozMywxLjQ5Nzc2LDAuNjEzNzksLCwsCjM0LDEuOTcyNDEsMC40NjY2NywsLCwKMzUsMS4zNDc1MiwwLjcyOTE3LCwsLAozNiwxLjc5NTczLDAuNTI0NTEsLCwsCjM3LDEuNTM2NSwwLjU3MjAzLCwsLAozOCwxLjM2NDczLDAuNjA5MjQsLCwsCjM5LDEuODE1MywwLjUxMjYxLCwsLAo0MCwxLjU4MTU0LDAuNTg5MjksMi41MjQ4NCwwLjM4NDYyLCwKNDEsMS45NDA0LDAuNTA4OTMsLCwsCjQyLDIuMDUyNzUsMC41MTcxNCwsLCwKNDMsMS4zMTY5NSwwLjY2MDEsLCwsCjQ0LDEuNzMzOTEsMC41NTU1NiwsLCwKNDUsMS42OTk3NywwLjU3NDE2LCwsLAo0NiwxLjc1MjY0LDAuNTUxMjgsLCwsCjQ3LDEuNDQzMjMsMC42MDY0OCwsLCwKNDgsMS43OTQ0MSwwLjU1OTMyLCwsLAo0OSwxLjY3OTMsMC43MTQyOSwsLCwKNTAsMi4wMDIyNiwwLjU4OTc0LDEuNzYyMzEsMC41NTQ1NSwsCjUxLDEuODA4NTIsMC41NTc0LCwsLAo1MiwxLjcwNjM0LDAuNTcyMjgsLCwsCjUzLDEuODQyMDYsMC41NTg4MiwsLCwKNTQsMS40MTkzMSwwLjY1NTc0LCwsLAo1NSwxLjgxMzY4LDAuNDgyNTIsLCwsCjU2LDEuODcwNjEsMC41NDI0NSwsLCwKNTcsMS42MzEwOSwwLjU1Njk2LCwsLAo1OCwxLjU5MDIsMC41OTQ3NywsLCwKNT
ksMS42MzYwMSwwLjU0NzIsLCwsCjYwLDIuMDk4NjMsMC40OTE3NywyLjQzMzE5LDAuNDQ4MjgsLAo2MSwxLjYzODcyLDAuNTA5NzEsLCwsCjYyLDIuMzY0MzksMC41NTMxOSwsLCwKNjMsMS42OTkxNiwwLjU3OTE3LCwsLAo2NCwxLjkzNjQ1LDAuNTI4OCwsLCwKNjUsMi4wNjEzMSwwLjUxNDUzLCwsLAo2NiwxLjYzNDM0LDAuNTg1MDgsLCwsCjY3LDEuMzg1MzQsMC42Mjg0LCwsLAo2OCwxLjczMTcxLDAuNTUyMjQsLCwsCjY5LDEuNTYzMTEsMC42MjUsLCwsCjcwLDEuNzMxNDksMC41NTc2OSwxLjIzODA0LDAuNjYwODIsLAo3MSwxLjMyNjYyLDAuNjM1MjksLCwsCjcyLDEuNTc4MjMsMC41NjE3LCwsLAo3MywxLjkzMzQ0LDAuNTI5ODUsLCwsCjc0LDEuNTExNjMsMC41ODI4MiwsLCwKNzUsMi40MTUzNywwLjQ2OTcsLCwsCjc2LDEuOTg3MzEsMC40NzkwMiwsLCwKNzcsMS41MjAzNiwwLjU5NjEyLCwsLAo3OCwxLjgwMzE0LDAuNTczMTcsLCwsCjc5LDEuMjEyNzgsMC42Nzc5NywsLC

How can I make it readable, and how can I create this plot?

Your file's contents look like this:

step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy,train_mean_reward,full_validation_mean_reward
1,2.14741,0.51509,,,,
2,4.26246,0.45161,,,,
3,1.43551,0.73897,,,,
4,1.93826,0.54214,,,,
5,2.72089,0.57407,,,,
6,2.09483,0.47776,,,,
7,2.17876,0.49525,,,,
8,4.64553,0.51852,,,,
9,1.91511,0.56122,,,,
10,2.55966,0.52174,1.95078,0.58667,,
11,2.07725,0.49909,,,,
12,2.45817,0.48101,,,,
13,2.08099,0.52747,,,,
14,2.48972,0.5122,,,,
15,2.14299,0.52273,,,,
16,1.78777,0.52632,,,,
17,1.6117,0.59209,,,,
18,1.97509,0.53793,,,,
19,2.07984,0.57143,,,,
20,1.72684,0.57862,1.7641,0.52571,,
21,1.97612,0.52431,,,,
22,1.61661,0.60889,,,,
23,1.66485,0.55,,,,
24,1.96617,0.51613,,,,
25,1.66614,0.55085,,,,
26,1.94256,0.5463,,,,
27,1.75422,0.58444,,,,
28,1.9327,0.53533,,,,
29,2.09875,0.44643,,,,
30,1.79345,0.51923,2.11141,0.47312,,
31,2.62426,0.42857,,,,
32,1.4265,0.575,,,,
33,1.49776,0.61379,,,,
34,1.97241,0.46667,,,,
35,1.34752,0.72917,,,,
36,1.79573,0.52451,,,,
37,1.5365,0.57203,,,,
38,1.36473,0.60924,,,,
39,1.8153,0.51261,,,,
40,1.58154,0.58929,2.52484,0.38462,,
41,1.9404,0.50893,,,,
42,2.05275,0.51714,,,,
43,1.31695,0.6601,,,,
44,1.73391,0.55556,,,,
45,1.69977,0.57416,,,,
46,1.75264,0.55128,,,,
47,1.44323,0.60648,,,,
48,1.79441,0.55932,,,,
49,1.6793,0.71429,,,,
50,2.00226,0.58974,1.76231,0.55455,,
51,1.80852,0.5574,,,,
52,1.70634,0.57228,,,,
53,1.84206,0.55882,,,,
54,1.41931,0.65574,,,,
55,1.81368,0.48252,,,,
56,1.87061,0.54245,,,,
57,1.63109,0.55696,,,,
58,1.5902,0.59477,,,,
59,1.63601,0.5472,,,,
60,2.09863,0.49177,2.43319,0.44828,,
61,1.63872,0.50971,,,,
62,2.36439,0.55319,,,,
63,1.69916,0.57917,,,,
64,1.93645,0.5288,,,,
65,2.06131,0.51453,,,,
66,1.63434,0.58508,,,,
67,1.38534,0.6284,,,,
68,1.73171,0.55224,,,,
69,1.56311,0.625,,,,
70,1.73149,0.55769,1.23804,0.66082,,
71,1.32662,0.63529,,,,
72,1.57823,0.5617,,,,
73,1.93344,0.52985,,,,
74,1.51163,0.58282,,,,
75,2.41537,0.4697,,,,
76,1.98731,0.47902,,,,
77,1.52036,0.59612,,,,
78,1.80314,0.57317,,,,
79,1.21278,0.67797,,,

The longer lines are where validation steps were performed.
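To pick those validation rows out programmatically, the decoded text can be run through Python's csv module. A minimal sketch, assuming the column names shown in the header above; validation_rows is a hypothetical helper name and SAMPLE is three rows copied from the decoded output.

```python
import csv
import io

# Header plus three rows copied from the decoded file above
SAMPLE = """step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy,train_mean_reward,full_validation_mean_reward
9,1.91511,0.56122,,,,
10,2.55966,0.52174,1.95078,0.58667,,
11,2.07725,0.49909,,,,
"""


def validation_rows(csv_text: str) -> list[dict]:
    """Return only the rows whose valid_loss column is filled in."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # `or ""` also covers truncated rows, where DictReader fills missing cells with None
    return [row for row in reader if (row.get("valid_loss") or "").strip()]


for row in validation_rows(SAMPLE):
    print(row["step"], row["valid_loss"], row["valid_mean_token_accuracy"])
```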

That was obtained simply by pasting it into https://www.base64decode.org/

You can also use Python on your local system or in a notebook to work on a downloaded file. Here’s a script to decode a hard-coded file name:

import base64

def decode_base64_file(input_filename, output_filename):
    """Decodes a base64-encoded file and saves the output to a new file.

    Args:
        input_filename: The path to the input file containing base64 data.
        output_filename: The path to the output file where decoded text will be saved.
    """
    try:
        with open(input_filename, 'r') as f_in:
            encoded_data = f_in.read()
    except FileNotFoundError:
        print(f"Error: Input file '{input_filename}' not found.")
        return
    except Exception as e:
        print(f"Error reading input file: {e}")
        return

    try:
        decoded_data = base64.b64decode(encoded_data).decode('utf-8')
    except base64.binascii.Error:
        print("Error: Invalid base64 data in the input file.")
        return
    except Exception as e:
        print(f"Error decoding base64 data: {e}")
        return

    try:
        with open(output_filename, 'w') as f_out:
            f_out.write(decoded_data)
        print(f"Successfully decoded and saved to '{output_filename}'.")
    except Exception as e:
        print(f"Error writing to output file: {e}")

if __name__ == "__main__":
    input_file = "results.txt"
    output_file = "results-output.txt"
    decode_base64_file(input_file, output_file)

Explanation:

  1. Import base64: Imports the necessary library for base64 encoding/decoding.
  2. decode_base64_file function:
    • Takes input_filename and output_filename as arguments.
    • File Reading (with error handling):
      • Uses a try-except block to handle potential FileNotFoundError if the input file doesn’t exist.
      • Uses another except block to catch any other errors during file reading.
      • Reads the entire content of the input file into encoded_data.
    • Base64 Decoding (with error handling):
      • Uses a try-except block to handle base64.binascii.Error which is raised if the data is not valid base64.
      • Uses another except block to catch any other decoding errors.
      • base64.b64decode(encoded_data) decodes the base64 string.
      • .decode('utf-8') converts the decoded bytes to a UTF-8 string (assuming the original text was UTF-8 encoded).
    • File Writing (with error handling):
      • Uses a try-except block to handle potential errors during writing to the output file.
      • Writes the decoded_data to the output file.
      • Prints a success message.
  3. if __name__ == "__main__": block:
    • Ensures the code inside this block is executed only when the script is run directly (not imported as a module).
    • Sets the input and output filenames.
    • Calls the decode_base64_file function to perform the decoding.

How can I plot the graph?

ChatGPT with code interpreter will be a great help in producing a graphic. It also could do the input file processing for you. You can just ask nicely.

Use your python tool to plot this (file/data) for me, where the rows of increasing “step” are x axis, and the columns of train_loss, train_accuracy, and the occasional valid_loss are plotted as lines on the Y axis.

Note: the CSV data of a file may be base64-encoded, needing decoding first

Ensure that valid_loss has visible points represented on the graph when present, as it is not continuously available in the CSV data.

The output shall be 1024x768 pixels image.

Note: CSV columns are [step,train_loss,train_accuracy,valid_loss,valid_mean_token_accuracy,train_mean_reward,full_validation_mean_reward]
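If you would rather produce the plot locally than ask ChatGPT, here is a rough sketch that follows the same requirements as the prompt. It assumes the CSV text has already been base64-decoded and that matplotlib is installed; load_series and plot_results are illustrative names, not part of any library.

```python
import csv
import io


def load_series(csv_text: str, column: str) -> tuple[list[int], list[float]]:
    """Return (steps, values) for one column, skipping rows where it is empty."""
    steps, values = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        cell = (row.get(column) or "").strip()
        if cell:
            steps.append(int(row["step"]))
            values.append(float(cell))
    return steps, values


def plot_results(csv_text: str, out_path: str = "results.png") -> None:
    """Plot train_loss, train_accuracy and valid_loss against step."""
    # matplotlib is an assumption; it is imported here so load_series works without it
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(10.24, 7.68), dpi=100)  # 1024x768 pixels
    for column in ("train_loss", "train_accuracy"):
        steps, values = load_series(csv_text, column)
        ax.plot(steps, values, label=column)
    # valid_loss is sparse, so draw markers to keep its points visible
    steps, values = load_series(csv_text, "valid_loss")
    ax.plot(steps, values, marker="o", linestyle="--", label="valid_loss")
    ax.set_xlabel("step")
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)
```

Call plot_results(decoded_text) with the decoded CSV string; the image is written to results.png unless another path is given.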