I am looking to fine-tune GPT-3.5 for a specific application involving code generation. However, given the nature of the application, there is a high likelihood of errors in the generated code. To overcome this, I plan on fine-tuning the model using documentation and programming books written in natural language. I am having difficulty getting this data into the required format of “prompt” and “completion” pairs. Can you provide guidance on how to properly fine-tune the model using this type of data?
Yeah, I too would like to know how to just feed it data, instead of the question/optimal-answer format… there is no optimal answer if I just want it to read a Wikipedia article.
Any luck on figuring out how this is done yet?
What about splitting the process into two calls to the API?
First, divide the data into small fragments that fit within the messages parameter, and make a list of titles, one per fragment. Then make a first call to the API asking which of the titles best fits the prompt.
The next step is to extract the number from the response (the reply may not contain only a number, but it should contain one) and send just the fragment corresponding to that number as assistant data, together with the user’s prompt. The answer should then be based on that fragment.
For example:
I have a large set of rules from an online game, and I wanted ChatGPT to answer only questions about those rules. But they were too many tokens, so I divided them into chapters and asked ChatGPT this question:
Given a list of topics:
1. Register and basic rules
2. Team
3. Players
4. Training
5. Junior School
6. League system
7. National Teams
Tell me the number and only the number of the topic that best fits this prompt: “what should I do to score more goals?”
The answer was: 4
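The two-call flow above can be sketched in Python. This is a minimal illustration, not a tested implementation: the chapter texts are placeholders, and `call_chat_api` stands in for whatever OpenAI client call you use (e.g. a `gpt-3.5-turbo` chat completion). Only the prompt-building and number-parsing helpers are concrete.

```python
import re

# Chapter titles from the example above; the chapter bodies are placeholders.
TOPICS = [
    "Register and basic rules",
    "Team",
    "Players",
    "Training",
    "Junior School",
    "League system",
    "National Teams",
]

def classification_prompt(topics, user_question):
    """Build the first-call prompt that asks the model for a topic number only."""
    numbered = "\n".join(f"{i + 1}. {t}" for i, t in enumerate(topics))
    return (
        "Given a list of topics:\n"
        + numbered
        + "\nTell me the number and only the number of the topic that best "
        + f'fits this prompt: "{user_question}"'
    )

def extract_topic_number(reply, n_topics):
    """The reply should be only a number, but tolerate extra words around it."""
    match = re.search(r"\d+", reply)
    if not match:
        return None
    number = int(match.group())
    return number if 1 <= number <= n_topics else None

def answer_with_fragment(chapter_text, user_question, call_chat_api):
    """Second call: send only the selected chapter plus the user's question.
    `call_chat_api` is hypothetical -- substitute your own OpenAI client call."""
    messages = [
        {"role": "system", "content": "Answer only from the rules below."},
        {"role": "assistant", "content": chapter_text},
        {"role": "user", "content": user_question},
    ]
    return call_chat_api(messages)
```

With the prompt in the example, `extract_topic_number("4", len(TOPICS))` picks out chapter 4 (Training), which is then the only fragment sent in the second call.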
What do you think?
One time I sent a message to ChatGPT like “How can I fine-tune GPT-3 using transfer learning?” and it gave me a response. But I’m not sure whether I can use a website article to fine-tune it that way.
I would also like to know what is the engineering behind this problem.
Hello, I managed to use this idea, but I did it with a .csv file (containing data from a small company). The idea is the same: I pass the file contents along with a fixed prompt and it produces a very detailed summary. However, when I try to switch the model to gpt-3.5-turbo, I’m not having any success.
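One likely stumbling block when moving to gpt-3.5-turbo is that it uses the chat endpoint, which takes a `messages` list rather than a single `prompt` string. Below is a hedged sketch of one way to inline a CSV into that format; the system/instruction wording and the file name are made up for illustration, and the actual API call is left as a comment since it needs the `openai` library and a key.

```python
import csv
import io

def csv_to_prompt(csv_text, instruction="Summarize this data in detail:"):
    """Flatten CSV rows into a plain-text block for the chat prompt.
    Large files will exceed the context window; split them or use retrieval."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    table = "\n".join(", ".join(row) for row in rows)
    return f"{instruction}\n\n{table}"

def build_messages(csv_text):
    """Build the `messages` payload expected by the gpt-3.5-turbo chat
    endpoint (unlike the older completion endpoint's single prompt string)."""
    return [
        {"role": "system", "content": "You are a careful data analyst."},
        {"role": "user", "content": csv_to_prompt(csv_text)},
    ]

# Usage sketch (assumes the `openai` library and an API key are configured):
# response = openai.ChatCompletion.create(
#     model="gpt-3.5-turbo",
#     messages=build_messages(open("company.csv").read()),
# )
```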
How did you pass the file to the API?
Hi, you may have already found a solution by the time I’m replying, but I’d like to answer with what I know to help others who have a similar question.
There is a framework called LlamaIndex 🦙 (0.9.13). It is a RAG (retrieval-augmented generation) framework that can integrate real data, such as books and documentation, with a GPT model.
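For anyone curious what that looks like, here is a minimal sketch using the llama_index 0.9-era API. The directory path and question are placeholders, and running it requires `pip install llama-index` plus an OpenAI API key, which is why the imports are kept inside the function.

```python
def build_query_engine(docs_dir):
    """Index a folder of books/documentation and return a query engine.
    Assumes the llama_index 0.9.x API and an OPENAI_API_KEY in the env."""
    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    # Load every readable file in the folder as a document.
    documents = SimpleDirectoryReader(docs_dir).load_data()
    # Embed and index the documents for retrieval.
    index = VectorStoreIndex.from_documents(documents)
    return index.as_query_engine()

# Usage sketch:
# engine = build_query_engine("./game_rules")
# print(engine.query("What should I do to score more goals?"))
```

At query time the engine retrieves only the most relevant chunks and sends them to the model with the question, which is essentially an automated version of the two-call chapter-selection idea discussed earlier in this thread.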