Training ChatGPT on a custom programming language/ file structure

ahmed.abd.el-ghany · April 16, 2024, 2:40pm

Hi there,
We are a company that basically creates documents. These documents are written/ saved in a custom file structure created by us that is similar to xml/xsl.
And now the aim of my new project is to train ChatGPT (or other LLM in the future) to understand the documents and to be a able to write some (itx, that is how we call the files) itx code itself.
For example after a prompt of provide the itx code necessary to have a table with three columns and two rows.

Within the company we have been creating these documents for more than ten years within a own software. Thus we have millions of documents that can be provided as training data. Furthermore we can provide not only the itx, but also the exported pdf, docx, xsl-fo and html, so lots of other formats that GPT already understands for it to be able to create a connection between the itx and what the equivalent content looks like.

So to summarise my questions would be:

How do I train it best?
How do I prepare the data?
And is something like this even possible ?

FYI: I have more than enough time and all the resources needed

justaneric · April 17, 2024, 2:57am

Hello! I think I may have the answer for 2 questions.

Training

First off, training a ChatGPT model would just be giving it a prompt on what you would like it to do for the end user.

Possibilities

Yes, you can create a prompt and format document information in a way ChatGPT would understand.

This is all I have right now, reporting back soon!

Topic		Replies	Views
Training GPT to learn new scripting language API	1	665	December 15, 2023
Training a Custom GPT Model for Call Transcripts Analysis GPT builders chatgpt	2	1621	December 12, 2023
Fine-tuning GPT to learn a new coding language Prompting codex , chatgpt , plugin-development , fine-tuning , api	3	2554	December 24, 2023
Custom GPT Model Training for Unversity LMS Courses (E-Learning) API chatgpt , api	1	420	February 23, 2024
How to train the API using like 100 documents (docx, xlsx, pptx, pdf) API	3	284	April 7, 2024

Training ChatGPT on a custom programming language/ file structure

Related Topics