Finetuning needs to be cheaper

daveshapautomator · November 29, 2022, 8:27pm

This message is primarily for OpenAI folks

Finetuning is absolutely critical. Please reduce the cost of finetuning soon. I know many people haven’t the first clue about how to do finetuning, but mark my words, finetuning will be 100% necessary for every AI company.

I know that INSTRUCT is awesome and you can do many things with it, but those of us at the cutting edge find plenty of limitations.

The fact of the matter is that prompt engineering and finetuning are two financially different approaches. If you need to split them up into two products, fine.

smith1302 · November 30, 2022, 5:50pm

Completely agree! The price difference from the base models to fine-tuned is quite steep (even more so with the recent price drop in base models). Finetuning would greatly improve the outputs at our company but we can’t currently justify the price difference so we are sticking with basic prompt engineering for now. Hopefully its something they are working on improving.

nunodonato · November 30, 2022, 7:29pm

I have to +1 this as well. Have plenty of plans for fine-tunning but so far the cost has kept me away.

daveshapautomator · November 30, 2022, 11:21pm

I thought it would be helpful to add some context about why I personally find finetuning to be exceptionally useful (above and beyond INSTRUCT)

INSTRUCT and FINETUNE are different technologies with different usecases
Finetuning allows for training to handle adversarial attacks (deliberate or accidental user exploits)
Finetuning allows for specific formats and structures to be followed
Finetuning allows you to build in error handling and edge cases without having to use multiple steps and chains of prompts
Finetuning allows you to perform multiple cognitive steps in one go, also saving prompt chaining
There are other unique things that cannot be done in single prompt chains, or cognitive tasks that are too complex to convey in instructions

Finetuning is a critical part of this technology and I’m starting to get the sense that OpenAI is deliberately underinvesting in this domain due to the price drop of foundation models (INSTRUCT) and not dropping the price of finetuning. This is a huge business mistake IMHO, on the same scale as hamstringing DALLE (notice how all the open source models are improving far faster than DALLE?).

Yes, you can rapidly prototype and test ideas with INSTRUCT models, but that’s like using LEGO bricks when you could be using a TIG welder to build things. Finetuning is for us big kids

daveshapautomator · December 1, 2022, 2:03pm

Another reason: explainability. We can’t rely on a total black box trained by another team in secret. For medical, legal, and other high stakes use cases, we will need to be able to point to our own proprietary training data in order to explain why our models made certain decisions.

jonathan.sabbah · December 4, 2022, 5:23pm

IMHO I think that we should be able to fine tune the last models and not only the original davinci, even if the text-davinci-003 is solid on few shots tries, I can’t stop thinking that having this super powerful new model fine-tuned would be extraordinary

dusandev87 · December 4, 2022, 6:00pm

+1 for cheaper fine-tuning

daveshapautomator · December 4, 2022, 7:47pm

If I’m not mistaken you can finetune existing models. I thought they had a press release about that. Or certainly people have talked about it on this forum

tcoffman · December 5, 2022, 2:51pm

Totally agree, Dave. I have a half dozen use cases for finetuning, but it’s cost prohibitive.

jonathan.sabbah · December 5, 2022, 8:04pm

You can fine tune old Davinci but not InstructGPT
There is a confirmation here from openai staff :

gpriday · December 6, 2022, 12:14pm

The 6x lower cost of the base models also makes it justifiable to generate a few completions for each prompt, then using a discriminator to pick the best one.

The way I see it, the main task with base models is prompt engineering and discriminator training. With fine-tuning, it’s gathering training data.

Since the price drop on the base models, I spend most of my time building discriminators. Just worried I’m wasting effort if there’s a price drop coming for fine-tuned models at some point too. If we go back to fine-tuned models only being 2x more than base models (I think that’s what it used to be), I’d move everything back to fine-tuned models - which I prefer for most tasks.

SteveS · December 15, 2022, 6:25pm

Agree that fine-tuning is essential for OpenAI to:

o Build an Ecosystem. Build and retaining an ecosystem of vertical integrators that otherwise will build on competitors or open source

o Deliver a Moat. Give these third-parties the ability to build businesses around solutions that have a significant moat (i.e. that can’t be trivially replicated by endless competitors)

o Meet Expectations. Improve outcomes in verticals to move beyond dancing bearware deliver results that reliably meet a paid customer’s expectation

That said, I can see a number of reasons why fine-tuning would not be OpenAI’s priority just now, because of:

o Bigger Fish. The horizontal applications DALL-E and GPT (e.g. building DALL-E into Microsoft Designer and Bing, maybe building Chat GPT into Word, Codex, etc.) are a faster route to revenue and a quicker return on Microsoft’s investment in OpenAI than third-party verticals right now.

o Immediate Obsolescence. Rumor has it that GPT4 will be trained on 500x more data than GPT3, likely making many fine-tuning efforts on GPT3 obsolete in short order. I’ve found some few shot prompts on text-davinci-003 already outdo a model with hundreds of fine-tuned examples on text-davinci-002 in some respects.

o Bad Outcomes in Key Verticals. Regardless of how well you tune GPT3, you can’t overcome the failing that GPT3 is basically guessing at an answer, it doesn’t know when it’s wrong, and in sophisticated topics, its errors can be hard for even a trained professional to spot. As such, it isn’t likely you can tune GPT3 to produce reliable enough results in high value verticals (e.g. medical advice, legal advice) to reduce human effort in those fields. The technology needs improve before it’s viable (e.g. it needs to be able to express percentage confidence in portions of results, to limit inputs to valid bodies of text, to trace back to likely sources of its output, etc.)

I hope I’m wrong, as being able to fine-tune the latest model would be a fantastic advance.

Rariny · March 1, 2023, 6:10am

I totally agree to this post. After days of experience I have realized that fine-tune price is so high, especially when you need to refine-tune an existing fine-tuned model. You have to pay for everything again.

Topic		Replies	Views
Fine-tuning Announcement & Davinci Beta Announcements	8	1521	January 3, 2024
Fine tuning - how exactly does it work? API	6	1636	December 23, 2023
ChatGPT fine-tuning as a service API	17	12734	December 13, 2023
GPT-3.5 Turbo fine-tuning now available (and new GPT3 models) API announcement , fine-tuning , api	18	10430	December 15, 2023
Fine-tuning only available for 'base models'? API	6	1043	December 23, 2023

Finetuning needs to be cheaper

Related Topics