Teaching GPT a new/niche programming language

kylew · June 2, 2023, 4:42pm

I work for a software company and we have our own application-specific programming language. I want to train a model to answer coding questions in the same way ChatGPT can respond to Python questions: giving one-line responses or even writing entire functions that accomplish a task.

I have a large set of help documentation. Each function or class in our language has its own page, which describes the arguments and provides code examples, e.g.:

# Open a table
tbl = CreateObject("Table", file_name)

I see a lot of conflicting opinions on whether to use embeddings or fine-tuning (I also see that Codex fine-tuning is no longer possible).

I have worked through OpenAI’s Python notebook on Q&A using embeddings. I’ve also played with fine-tuning and have a basic understanding of creating prompt-completion pairs and using the API to do training.
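
For context, the prompt-completion pairs used by the legacy fine-tuning format are JSONL records with `prompt` and `completion` keys. A minimal sketch of turning a help-page example into such a record might look like this; the question text, the `###` separator, and the ` END` stop marker are illustrative assumptions, not requirements of our docs or of any particular model:

```python
import json

# Hypothetical entry derived from one help page; the question wording is made up.
doc_examples = [
    {
        "question": "How do I open a table?",
        "code": 'tbl = CreateObject("Table", file_name)',
    },
]


def to_jsonl(examples):
    """Serialize examples as one JSON object per line with prompt/completion keys."""
    lines = []
    for ex in examples:
        record = {
            # Assumed convention: a fixed separator marks where the prompt ends.
            "prompt": ex["question"] + "\n\n###\n\n",
            # Assumed convention: leading space, plus a stop marker at the end.
            "completion": " " + ex["code"] + " END",
        }
        lines.append(json.dumps(record))
    return "\n".join(lines)


jsonl_text = to_jsonl(doc_examples)
print(jsonl_text)
```

Each help page would likely need several differently worded questions per code example for the pairs to be useful as training data.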

What is the right approach? Should I fine-tune the base davinci model? Should I build an embeddings database and then prepend relevant chunks into my prompt? Will either approach be successful?

Thank you!

PaulBellow · June 2, 2023, 5:20pm

Welcome to the community.

Since you can only fine-tune the original davinci model, I would stick with embeddings… or at least test that method first and then try a small fine-tune to compare results. I really think embeddings should do the trick, though.
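
To make the embeddings approach concrete, here is a rough sketch of the retrieve-then-prepend flow: rank documentation chunks by similarity to the question, then put the best ones at the top of the prompt. It is only an illustration under some assumptions: `toy_embed` is a stand-in word-count vector so the example runs offline (a real setup would call an embeddings model instead), the second documentation chunk is made up, and the prompt wording is arbitrary.

```python
import math
import re
from collections import Counter

# Documentation chunks: the first is the example from the question,
# the second is a made-up placeholder page.
doc_chunks = [
    '# Open a table\ntbl = CreateObject("Table", file_name)',
    "Loops repeat statements until the condition is false.",
]


def toy_embed(text):
    """Stand-in embedding: a bag-of-words count vector (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question, chunks, k=1):
    """Prepend the most relevant chunks to the question."""
    context = "\n\n".join(top_chunks(question, chunks, k))
    return (
        "Answer using only the documentation below.\n\n"
        f"Documentation:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_prompt("How do I open a table?", doc_chunks)
print(prompt)
```

The resulting prompt string is what you would send to the completion model. With one page per function or class, each page (or each section of a long page) is a natural chunk.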
