Price for fine-tuning with gpt-3.5-turbo and gpt-4

How do I estimate the price of fine-tuning with gpt-3.5-turbo-1106?
What's the price for fine-tuning with gpt-4?

Welcome to the forum.

I believe you meant fine-tuning something (gpt-3.5/4), not with something.

If that's the case, you can see the price for gpt-3.5-turbo on the OpenAI pricing page. GPT-4 cannot be fine-tuned yet.

Model            Training               Input usage            Output usage
gpt-3.5-turbo    $0.0080 / 1K tokens    $0.0030 / 1K tokens    $0.0060 / 1K tokens

Fine-tuning accepts a training file containing many example conversations that show the kind of new response the AI should produce for that style of input.
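
For example, one line of the JSONL training file might look like this (the contents are made up purely for illustration):

{"messages": [{"role": "system", "content": "You answer in pirate speak."}, {"role": "user", "content": "Where is my order?"}, {"role": "assistant", "content": "Arr, yer parcel still be out at sea!"}]}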

Each message within an example conversation, including the final assistant completion (e.g. [system, user, assistant]), contains language the AI model receives, encoded by a BPE tokenizer. An exact token count of the contents of each JSON line can be made with tiktoken, a library that encodes text the same way the billing is calculated.
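
As a quick illustration (a minimal sketch; the string is just a placeholder), counting the billable tokens of a single piece of message content looks like this:

import tiktoken

# cl100k_base is the encoding used by the gpt-3.5-turbo family
encoding = tiktoken.get_encoding("cl100k_base")

tokens = encoding.encode("You are a helpful assistant that answers in pirate speak.")
print(len(tokens))  # number of billable tokens for this text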

Additionally, the total token count of the file is multiplied by the n_epochs parameter, which is the number of training passes performed over the training file. Expect about 8 epochs on a small file if you leave the epochs parameter unspecified and let OpenAI auto-configure it.
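
As a rough worked example: a training file of 50,000 tokens that is auto-configured to 8 epochs would be billed for 50,000 × 8 = 400,000 training tokens, or about $3.20 at the $0.0080 / 1K training rate shown above.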

That is the “training” cost above.

Using the fine-tuned model also costs considerably more than the base model, but for particular applications you can hopefully achieve quality beyond even that offered by more expensive models and heavier prompting.

You can estimate the cost using the following formula:

total cost ≈ (base training cost per 1K tokens ÷ 1,000) × tokens in the training file × number of epochs

Or, using this Python source code (the epoch auto-configuration constants at the top mirror the heuristics in OpenAI's fine-tuning data-preparation cookbook):

import json
import tiktoken

# Epoch auto-configuration heuristics, as used in OpenAI's data-prep cookbook
TARGET_EPOCHS = 3
MIN_TARGET_EXAMPLES = 100
MAX_TARGET_EXAMPLES = 25000
MIN_DEFAULT_EPOCHS = 1
MAX_DEFAULT_EPOCHS = 25

# Training price per 1K tokens for gpt-3.5-turbo (see the pricing table above)
BASE_COST_PER_1K_TOKENS = 0.0080


def cost_estimation(training_file_name: str, model: str = 'gpt-3.5-turbo-0125') -> None:
    def num_tokens_from_messages(messages: list) -> int:
        # Per-message and per-name overhead for cl100k_base chat models
        tokens_per_message = 3
        tokens_per_name = 1

        num_tokens = 0
        for message in messages:
            num_tokens += tokens_per_message
            for key, value in message.items():
                num_tokens += len(encoding.encode(value))
                if key == "name":
                    num_tokens += tokens_per_name
        num_tokens += 3  # every reply is primed with assistant tokens
        return num_tokens

    encoding = tiktoken.encoding_for_model(model)
    convo_lens = []
    dataset = []
    n_messages = []

    # Each line of the JSONL training file is one example conversation
    with open(training_file_name, mode='r', encoding='utf-8') as f:
        for line in f:
            dataset.append(json.loads(line))

    print("Num examples:", len(dataset))

    for ex in dataset:
        messages = ex["messages"]
        n_messages.append(len(messages))
        convo_lens.append(num_tokens_from_messages(messages))

    # Replicate the auto-configuration of n_epochs for small or very large datasets
    n_epochs = TARGET_EPOCHS
    n_train_examples = len(dataset)

    if n_train_examples * TARGET_EPOCHS < MIN_TARGET_EXAMPLES:
        n_epochs = min(MAX_DEFAULT_EPOCHS, MIN_TARGET_EXAMPLES // n_train_examples)
    elif n_train_examples * TARGET_EPOCHS > MAX_TARGET_EXAMPLES:
        n_epochs = max(MIN_DEFAULT_EPOCHS, MAX_TARGET_EXAMPLES // n_train_examples)

    print("Num epochs:", n_epochs)
    n_billing_tokens_in_dataset = sum(convo_lens)
    final_cost = (BASE_COST_PER_1K_TOKENS * n_billing_tokens_in_dataset * n_epochs) / 1000

    print(f"Dataset has ~{n_billing_tokens_in_dataset} tokens that will be charged for during training")
    print(f"By default, you'll train for {n_epochs} epochs on this dataset")
    print(f"By default, you'll be charged for ~{n_epochs * n_billing_tokens_in_dataset} tokens")
    print(f"Final cost will be ~$ {final_cost}")


cost_estimation("/tmp/training.jsonl")
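
A note on the epoch logic in the script: for small datasets (roughly fewer than 34 examples) it raises the epoch count so that about 100 training examples are seen in total, capped at 25 epochs, which is why a small file can default to around 8 epochs; for very large datasets (more than about 8,300 examples) it scales the epochs down, with a floor of 1.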