Fine-tuning with code snippets

Please advise how to feed a fine-tuning job with Python and C++ scripts.

On the official site there are samples only for conversational chat and similar uses.

I convert a .cpp file to JSONL in the following way:

{"code" : "#include <array> #include <cmath> #include <functional> #include <iostream>", "explanation" : "dependencies"}

{"code" : "int main(int argc, char** argv) { if (argc != 2) { std::cerr << \"Usage: \" << argv[0] << \" <robot-hostname>\" << std::endl; return -1; }", "explanation" : "Check whether the required arguments were passed"}

but the model doesn’t accept this format.

Please advise a solution.

Your understanding is very far from what is needed to train the AI to code. gpt-3.5-turbo is already as good as it gets unless you invest a hundred dollars in thousands and thousands of high-quality coding examples to then improve inference.

You can’t just make up your own format. The chat container, with its set roles, must be used as demonstrated, the same way you would use it in practice. You are not just training on random code; you are showing the AI the type of output it should generate for a particular user input.
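As a sketch of what that means in practice: each training line must be a JSON object with a `messages` array of role-tagged turns, not an ad-hoc `code`/`explanation` pair. The script below converts pairs like yours into that shape; the system prompt wording and the `pairs` data are illustrative assumptions, not an official sample.

```python
import json

# Hypothetical code/explanation pairs, mirroring the poster's .cpp excerpts.
pairs = [
    {"code": "#include <array>\n#include <cmath>\n#include <functional>\n#include <iostream>",
     "explanation": "dependencies"},
    {"code": "int main(int argc, char** argv) { if (argc != 2) { return -1; } }",
     "explanation": "Check whether the required arguments were passed"},
]

# Each JSONL line is one chat transcript: the user turn carries the input,
# the assistant turn carries the output the model should learn to produce.
lines = []
for pair in pairs:
    record = {
        "messages": [
            {"role": "system", "content": "You explain C++ robot-control code."},
            {"role": "user", "content": pair["code"]},
            {"role": "assistant", "content": pair["explanation"]},
        ]
    }
    lines.append(json.dumps(record))

jsonl = "\n".join(lines)
print(jsonl)
```

Writing `jsonl` to a file (one record per line) gives you a training file in the chat format; whether explain-the-code is the right task to train on depends on what you actually want the model to generate at inference time.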

You will get farther faster just using GPT-4 to code: at its core, the gpt-3.5-turbo model is of a lower architectural specification than the prior gpt-3 models, and it only has skill because of its millions of fine-tune examples, taken from real conversations, evaluated by humans for quality, and used for reinforcement learning.


Thank you for the prompt reply, but GPT-4 generates unworkable code from its general knowledge base for our specific robot-manipulation tasks.

This is why we are trying to train a fine-tuned model on specific, known-good snippets and use that trained model for robot manipulation with a much better understanding of our specific process.