Training GPT to generate code in a private DSL

I have a closed-source project where we use a DSL to describe certain data structures. The DSL is similar in flavor to a Dockerfile or nginx.conf. I’d like GPT to generate code in this DSL when prompted. Since the syntax isn’t publicly available, I can’t just ask it to. When given an example, it does match the pattern and produces output with a similar structure, but it doesn’t generalize to the full syntax. I could supply enough examples to cover the entire syntax, and/or include the documentation, but that doesn’t fit in the model’s token limit. It might fit within GPT-4’s limits, but that’s too expensive and impractical for the scaling needs of the application.

So, can I achieve this with fine-tuning? I can provide thousands of examples and the full syntax documentation, if that helps. But I’m not sure how to structure the prompt/completion pairs in that case.
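To make the question concrete, here’s a rough sketch of how I imagine the training file might look, based on OpenAI’s JSONL fine-tuning format (one JSON object per line with `prompt` and `completion` fields). The DSL snippets, instructions, and separator tokens below are invented placeholders, not my real syntax:

```python
import json

# Hypothetical prompt/completion pairs (placeholder DSL, not the real one).
# The trailing "\n\n###\n\n" separator and the " END" stop token are
# conventions suggested in OpenAI's fine-tuning guide, not requirements.
examples = [
    {
        "prompt": "Describe a queue named 'orders' with a max size of 100.\n\n###\n\n",
        "completion": " queue orders {\n  max_size 100\n}\n END",
    },
    {
        "prompt": "Describe a cache named 'sessions' backed by memory.\n\n###\n\n",
        "completion": " cache sessions {\n  backend memory\n}\n END",
    },
]

# Fine-tuning data is JSONL: one JSON object per line.
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```

Would something along these lines work, where each prompt is a natural-language request and each completion is the corresponding DSL snippet? And is there a sensible place to put the syntax documentation itself, or does fine-tuning only ever learn from the example pairs?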