Agree that fine-tuning is essential for OpenAI to:
o Build an Ecosystem. Build and retain an ecosystem of vertical integrators who would otherwise build on competitors or open source
o Deliver a Moat. Give these third parties the ability to build businesses around solutions with a significant moat (i.e., ones that can’t be trivially replicated by endless competitors)
o Meet Expectations. Improve outcomes in verticals to move beyond dancing bearware and deliver results that reliably meet a paying customer’s expectations
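For context on what those vertical integrators would actually build: GPT-3 fine-tuning at the time works from prompt/completion pairs supplied as JSONL. A minimal sketch of preparing such a file (the field names follow OpenAI’s fine-tuning format; the contract-review examples themselves are hypothetical):

```python
import json

# Hypothetical vertical-specific training examples (contract-review domain).
# GPT-3 fine-tuning expects one JSON object per line with "prompt" and
# "completion" keys; consistent separators and trailing newlines help the
# model learn where answers begin and end.
examples = [
    {"prompt": "Clause: Either party may terminate with 30 days notice.\nRisk:",
     "completion": " Low - standard mutual termination clause.\n"},
    {"prompt": "Clause: Licensee indemnifies Licensor for all claims.\nRisk:",
     "completion": " High - one-sided, uncapped indemnification.\n"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

The resulting file would then be uploaded and used to create a fine-tune job; the moat, such as it is, lives in the curated examples.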
That said, I can see a number of reasons why fine-tuning might not be OpenAI’s priority right now:
o Bigger Fish. The horizontal applications of DALL-E and GPT (e.g., building DALL-E into Microsoft Designer and Bing, perhaps building ChatGPT into Word, Codex, etc.) are a faster route to revenue and a quicker return on Microsoft’s investment in OpenAI than third-party verticals right now.
o Immediate Obsolescence. Rumor has it that GPT-4 will be trained on 500x more data than GPT-3, likely making many GPT-3 fine-tuning efforts obsolete in short order. I’ve found that a few-shot prompt on text-davinci-003 already outdoes a model fine-tuned with hundreds of examples on text-davinci-002 in some respects.
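To illustrate the few-shot alternative I’m describing: instead of fine-tuning, you prepend a handful of worked examples to the prompt and let the base model generalize. A minimal sketch, where the sentiment task and examples are hypothetical and the assembled string would be sent to text-davinci-003 as the prompt of a completions request:

```python
# A few-shot prompt packs worked examples directly into the request,
# so no fine-tuned model is needed. The examples here are made up.
few_shot_examples = [
    ("The refund never arrived and support ignored me.", "negative"),
    ("Setup took five minutes and it just works.", "positive"),
]

def build_prompt(examples, query):
    """Assemble a classification prompt from (text, label) pairs."""
    parts = ["Classify the sentiment of each review as positive or negative.\n"]
    for text, label in examples:
        parts.append(f"Review: {text}\nSentiment: {label}\n")
    # Leave the final label blank for the model to complete.
    parts.append(f"Review: {query}\nSentiment:")
    return "\n".join(parts)

prompt = build_prompt(few_shot_examples, "Great battery life, terrible camera.")
```

The practical upshot: when a new base model ships, you re-point this prompt at it in seconds, whereas a fine-tuned model has to be retrained from scratch.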
o Bad Outcomes in Key Verticals. Regardless of how well you tune GPT-3, you can’t overcome its fundamental failing: it is essentially guessing at an answer, it doesn’t know when it’s wrong, and in sophisticated topics its errors can be hard for even a trained professional to spot. As such, it’s unlikely you can tune GPT-3 to produce results reliable enough in high-value verticals (e.g., medical advice, legal advice) to reduce human effort in those fields. The technology needs to improve before it’s viable (e.g., it needs to be able to express percentage confidence in portions of its results, to limit inputs to valid bodies of text, to trace its output back to likely sources, etc.)
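On the confidence point: the completions API can return per-token log probabilities (its `logprobs` option), which is a crude building block for the percentage confidence I’d want, though nothing more. A hedged sketch of turning token logprobs into a rough score (the logprob values below are made up, and the caveat in the docstring is the whole problem):

```python
import math

def rough_confidence(token_logprobs):
    """Convert per-token log probabilities into a naive 0-100% score.

    Only a sketch: token-level probability measures how fluent or typical
    the text is, not whether it is factually correct, which is exactly
    the gap described above. A model can be confidently wrong.
    """
    if not token_logprobs:
        return 0.0
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    # Geometric mean of per-token probabilities, scaled to a percentage.
    return 100.0 * math.exp(mean_logprob)

# Hypothetical logprobs for a short three-token completion.
score = rough_confidence([-0.05, -0.2, -0.7])
```

Until something like this is calibrated against actual correctness, and paired with source tracing, I don’t see the high-value verticals trusting the output.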
I hope I’m wrong, as being able to fine-tune the latest model would be a fantastic advance.