I’ve just been accepted for access to the GPT-4 API.
Most of my usage of GPT (and, I assume, a lot of people’s) has been through fine-tuning for text generation. I tried to fine-tune on an existing dataset using the model “gpt-4”, but it was unsuccessful. I assume you can only fine-tune GPT-3 models at the moment.
Are there any rumours or estimates of when we can expect GPT-4 fine-tuning?
Thanks for accepting me into this beta. I can’t wait to spend the next few days playing around!
I tried to fine-tune using the latest version of the CLI:
openai api fine_tunes.create -t "prompt_prepared.jsonl" -m gpt-4
The response I got was:
[organization=rapidtags] Error: Invalid base model: gpt-4 (model must be one of ada, babbage, curie, davinci) or a fine-tuned model created by your organization: org-XXXXXXXXXXXX8eQRw (HTTP status code: 400).
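Until gpt-4 shows up in that base-model list, the same job does go through against one of the named models. A minimal sketch in Python, assuming the legacy openai library (pre-1.0) and the same prompt_prepared.jsonl, with davinci standing in for GPT-4:

```python
import openai  # legacy openai-python (<1.0) interface

openai.api_key = "sk-..."  # your API key

# Upload the training file; "fine-tune" is the required purpose.
upload = openai.File.create(
    file=open("prompt_prepared.jsonl", "rb"),
    purpose="fine-tune",
)

# Start the job against a currently supported base model.
job = openai.FineTune.create(
    training_file=upload["id"],
    model="davinci",  # gpt-4 is rejected; only ada/babbage/curie/davinci work
)
print(job["id"], job["status"])
```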
Yeah, I will be surprised if GPT-4 can be fine-tuned any time soon (read: within a few weeks). Even text-davinci-003 couldn’t be fine-tuned until a few weeks ago. It would be super awesome, though, if they skipped the models in between (GPT-3.5 etc.) and allowed fine-tuning on GPT-4 as soon as possible.
We are interested in building a chat application that uses a chatbot to provide law-related information to users. Our plan is to train the chatbot on GPT-4 with our custom data. As our legal data will change continuously, we are seeking advice on the best way to build this chat application.
We have some questions regarding our problem solution:
Would fine-tuning be suitable for our needs?
If we choose fine-tuning, how can we manage our files and models? We are concerned that each time we update our data or model, OpenAI’s API will create new files and models, which could leave us with an unmanageable number of them. Ideally, we would like to use one of OpenAI’s available models (e.g., davinci) and update our existing fine-tuned model in place rather than creating additional ones. However, we have not found any relevant documentation on this. Can you please point us to any relevant documentation or suggestions?
We would appreciate your advice on how to tackle these issues and build our chat application. Thank you for your help.
I think it depends on your needs.
Basically, fine-tuning is for “analogical reasoning”, while embeddings are for “finding a needle in a haystack”.
If you want the chatbot to reply from within your custom law-related information, then embeddings should be a better fit.
But if you want the chatbot to learn from your information and reply “beyond the scope”, then you should use fine-tuning.
Since you’re in the legal industry rather than a creative one, I’d guess embeddings are the right solution.
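To make the embeddings route concrete, here’s a minimal retrieval sketch, assuming the legacy openai library (pre-1.0) and numpy; the law passages are hypothetical placeholders for your own corpus. You embed the passages once, embed each question, and hand the closest passage to the model as context:

```python
import numpy as np
import openai  # legacy openai-python (<1.0) interface

# Hypothetical law passages; in practice these come from your own corpus.
passages = [
    "Tenants must receive 30 days' written notice before eviction.",
    "Small claims court handles disputes up to a statutory limit.",
]

def embed(texts):
    resp = openai.Embedding.create(input=texts, model="text-embedding-ada-002")
    return np.array([d["embedding"] for d in resp["data"]])

passage_vecs = embed(passages)  # embed the corpus once, then reuse

def answer(question):
    q_vec = embed([question])[0]
    # ada-002 vectors are unit-length, so a dot product is cosine similarity.
    best = passages[int(np.argmax(passage_vecs @ q_vec))]
    prompt = f"Answer using only this context:\n{best}\n\nQ: {question}\nA:"
    resp = openai.Completion.create(
        model="text-davinci-003", prompt=prompt, max_tokens=200
    )
    return resp["choices"][0]["text"].strip()

print(answer("How much notice does a tenant get before eviction?"))
```

This also addresses the “continuously changing laws” concern: when the data changes you re-embed only the affected passages, and nothing has to be retrained.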
Every time you fine-tune a model (one provided by OpenAI, or a model you previously fine-tuned), a new model and its associated files are created and become available to you. This makes sense: you might want version control of your models, and what happens if a round of fine-tuning adversely impacts the performance of the model you started from? This way you have the previous models to fall back on. I don’t know of a way to fine-tune an OpenAI model (the ones only available through the API) without creating a new one.
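If the accumulation of models becomes a problem, the API does let you list your fine-tunes and delete the ones you no longer need. A sketch with the legacy openai Python library (pre-1.0); both identifiers near the end are hypothetical:

```python
import openai  # legacy openai-python (<1.0) interface

# List every fine-tune job your organization has created.
for job in openai.FineTune.list()["data"]:
    print(job["id"], job["fine_tuned_model"], job["status"])

# Delete a fine-tuned model (and its training upload) you no longer need.
openai.Model.delete("davinci:ft-your-org-2023-04-01-12-00-00")
openai.File.delete("file-XXXXXXXX")
```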
From the little I understand about your application, I would imagine fine-tuning will help. However, there might be other open-source models better suited to your application. Among other considerations, it depends on how deterministic you want your answers to be. I would encourage you to look at the documentation on how to train GPT-2 and equivalent models available through Hugging Face.
I recommend having a look around Hugging Face. Maybe search for “law” and see what existing models there are. Going down the path of open-source models will involve a significant learning curve.
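If you do explore that route, a minimal causal-LM fine-tuning sketch with the Hugging Face transformers and datasets libraries might look like this; the train.txt corpus and output directory are hypothetical, and GPT-2 merely stands in for whichever model you pick:

```python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical plain-text corpus, one document per line.
dataset = load_dataset("text", data_files={"train": "train.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-law", num_train_epochs=1),
    train_dataset=tokenized,
    # mlm=False gives the causal (GPT-style) language-modeling objective.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```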
Hello everyone, I am working on producing step-wise solutions for high-school mathematics using AI. Can anyone suggest possible methods? I don’t think GPT-4 is allowed for fine-tuning yet. Are there any open-source models or existing works that do the same? Please let me know if you have any ideas.
Hey everyone,
I’m looking to build a fine-tuned model from a .jsonl file using GPT-4. Is that doable? I’ve already built a fine-tuned model using GPT-3, but I’d like to switch to GPT-4. Can I feed context into the fine-tuned model? Looking forward to your input.
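As far as I know, GPT-4 isn’t in the fine-tunable list yet (see the error earlier in the thread), so the job still has to target one of the GPT-3 bases, and the endpoint still expects prompt/completion pairs in the .jsonl. A small preparation sketch in Python, with hypothetical example pairs:

```python
import json

# Hypothetical training pairs; each line of the .jsonl is one example.
examples = [
    {"prompt": "Tag this video title: 'Top 10 goals of 2023' ->",
     "completion": " sports, football"},
    {"prompt": "Tag this video title: 'Sourdough for beginners' ->",
     "completion": " baking, tutorial"},
]

with open("prompt_prepared.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

You can then sanity-check the file with openai tools fine_tunes.prepare_data -f prompt_prepared.jsonl before creating the job.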