Fine-tuning with GPT-3.5 Turbo or GPT-4

I have to develop a chatbot with custom knowledge. I saw that fine-tuning is only possible with the base models. Is there a way to train a model on my data using GPT-4 or GPT-3.5 Turbo?

Not that I know of. With enough data, you should be able to fine-tune curie or davinci to equal or higher quality than 3.5-turbo, though. 3.5-turbo itself feels like a very heavily fine-tuned version of one of the other standard models, but I’m not sure of that.

OK, so if I load a large amount of data, does davinci perform as well as GPT-3.5 Turbo?

Davinci should actually do better than 3.5. It might be expensive and slow to train on a large amount of data, though. Fine-tuned curie should perform similarly to untrained 3.5 for some use cases.
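For reference, the fine-tuning flow on those base models looks roughly like this with the pre-1.0 `openai` Python library (the file names and the example prompt/completion pair are made up placeholders):

```python
import json
import openai

openai.api_key = "sk-..."  # your API key

# Training data is JSONL with prompt/completion pairs (contents here are invented).
examples = [
    {"prompt": "Customer: What are your opening hours?\n\n###\n\n",
     "completion": " We are open 9am-5pm, Monday to Friday. END"},
]
with open("train.jsonl", "w") as f:
    for row in examples:
        f.write(json.dumps(row) + "\n")

# Upload the file, then start a fine-tune on a base model such as curie or davinci.
upload = openai.File.create(file=open("train.jsonl", "rb"), purpose="fine-tune")
job = openai.FineTune.create(training_file=upload["id"], model="curie")
print(job["id"])  # poll the job until the fine-tuned model is ready to use
```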

Actually, the price seems really expensive now that I look at it, as they’ve never lowered the price for fine-tuning since it was released. Fine-tuned curie was once the benchmark for cheap AI, but its usage is roughly 6x the price of 3.5, while davinci is 60x.

I’d normally prefer to “train” it by giving five or so examples in a single prompt, but if you have a very large data set, you should be able to get better quality from fine-tuning.
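The examples-in-the-prompt approach looks roughly like this with the chat endpoint (pre-1.0 `openai` library; the system prompt and example Q&A pairs are invented placeholders):

```python
import openai

openai.api_key = "sk-..."

messages = [
    {"role": "system", "content": "You answer questions about Acme Corp's support policy."},
    # A handful of worked examples stand in for fine-tuning.
    {"role": "user", "content": "How long is the return window?"},
    {"role": "assistant", "content": "Returns are accepted within 30 days of delivery."},
    {"role": "user", "content": "Do you ship internationally?"},
    {"role": "assistant", "content": "Yes, to most countries; delivery takes 7-14 days."},
    # The real question goes last.
    {"role": "user", "content": "Can I exchange an item instead of returning it?"},
]

resp = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(resp["choices"][0]["message"]["content"])
```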


Generally it’s recommended to use embeddings/vector DBs for this rather than fine-tuning, and with that strategy you can use GPT-3.5/4. See the LangChain project for some examples to get started.


I would go with embeddings for custom knowledge, as you can update it more easily and it can work out cheaper overall. Use the cheaper models to pull back good matches and then feed them into GPT-3/4 for a better overall answer.

Sorry guys, I’ve just started studying the documentation and I don’t understand how embeddings work for custom knowledge. Could you give me a simple example?

https://python.langchain.com/en/latest/use_cases/question_answering.html


Look up James Briggs on YouTube; he has a series of videos covering this end to end, with sample code you can download and Colabs you can use.

In simple terms, you break your core text up into chunks and send each one to OpenAI’s embedding model, which gives you a vector representation of its meaning. You can then store these locally or in a cloud service like Pinecone.
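A minimal sketch of that step, assuming naive fixed-size chunking and local storage with numpy instead of a vector DB (file names are placeholders; pre-1.0 `openai` library):

```python
import json
import numpy as np
import openai

openai.api_key = "sk-..."  # your API key

def chunk_text(text, size=500):
    # Naive fixed-size chunking; real pipelines often split on paragraphs or sentences.
    return [text[i:i + size] for i in range(0, len(text), size)]

document = open("my_knowledge_base.txt").read()   # your custom knowledge, any plain text
chunks = chunk_text(document)

# One embedding vector per chunk from OpenAI's embedding model.
resp = openai.Embedding.create(model="text-embedding-ada-002", input=chunks)
vectors = np.array([item["embedding"] for item in resp["data"]])

# Store chunks and vectors locally (a vector DB like Pinecone plays the same role).
json.dump(chunks, open("chunks.json", "w"))
np.save("vectors.npy", vectors)
```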

Then, when you want to ask a question, you run the question text through the same embedding model and compare the resulting vector with the ones you stored. Pinecone and OpenAI each provide functions that do this for you, and you end up with an ordered list of the most relevant matches.
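Continuing the sketch with a plain cosine-similarity lookup in place of Pinecone (file names follow the previous snippet; the question is a placeholder):

```python
import json
import numpy as np
import openai

openai.api_key = "sk-..."

chunks = json.load(open("chunks.json"))   # text chunks from the previous step
vectors = np.load("vectors.npy")          # their embeddings, in the same order

question = "What is the refund policy?"   # placeholder question
q_resp = openai.Embedding.create(model="text-embedding-ada-002", input=[question])
q_vec = np.array(q_resp["data"][0]["embedding"])

# Cosine similarity between the question vector and every stored chunk vector.
scores = vectors @ q_vec / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(q_vec))
ranked = scores.argsort()[::-1]           # chunk indices, most relevant first
top_matches = [chunks[i] for i in ranked[:3]]
```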

Take the top X of these matches, which will contain the blocks of text most likely to hold the answer, and feed them into a prompt saying something like:

Using the following text, answer this question…
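Putting that last step together, roughly (the example chunks and question are invented placeholders; in practice `top_matches` comes from the similarity search above):

```python
import openai

openai.api_key = "sk-..."

question = "What is the refund policy?"
top_matches = [  # placeholder chunks standing in for the best-scoring matches
    "Refunds are issued to the original payment method within 5 business days.",
    "Items must be returned within 30 days in their original packaging.",
]

prompt = (
    "Using the following text, answer this question.\n\n"
    "Text:\n" + "\n\n".join(top_matches) + "\n\n"
    "Question: " + question
)

resp = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(resp["choices"][0]["message"]["content"])
```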

All of that is the technical how-to, but if you use a tool like LangChain, as @novaphil linked to above, it takes out all the hard work and does this for you.
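With LangChain the whole pipeline collapses to a few lines. This sketch follows the 0.0.x-era interface the linked docs describe (newer releases have reorganised these imports) and assumes `faiss-cpu` is installed; the example texts are placeholders:

```python
from langchain.chat_models import ChatOpenAI
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA

# Your chunked knowledge base (placeholder strings here).
texts = [
    "Refunds are issued to the original payment method within 5 business days.",
    "Items must be returned within 30 days in their original packaging.",
]

store = FAISS.from_texts(texts, OpenAIEmbeddings())   # embed and index in one call
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model_name="gpt-3.5-turbo"),
    retriever=store.as_retriever(),
)
print(qa.run("How long do refunds take?"))
```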

Hope this helps and good luck!
