Training ChatGPT on a custom programming language/ file structure

Hi there,
We are a company that basically creates documents. These documents are written/ saved in a custom file structure created by us that is similar to xml/xsl.
And now the aim of my new project is to train ChatGPT (or other LLM in the future) to understand the documents and to be a able to write some (itx, that is how we call the files) itx code itself.
For example after a prompt of provide the itx code necessary to have a table with three columns and two rows.

Within the company we have been creating these documents for more than ten years within a own software. Thus we have millions of documents that can be provided as training data. Furthermore we can provide not only the itx, but also the exported pdf, docx, xsl-fo and html, so lots of other formats that GPT already understands for it to be able to create a connection between the itx and what the equivalent content looks like.

So to summarise my questions would be:

  1. How do I train it best?
  2. How do I prepare the data?
  3. And is something like this even possible :smile: ?

FYI: I have more than enough time and all the resources needed :slight_smile:

Hello! I think I may have the answer for 2 questions.

  1. Training
  • First off, training a ChatGPT model would just be giving it a prompt on what you would like it to do for the end user.
  1. Possibilities
  • Yes, you can create a prompt and format document information in a way ChatGPT would understand.

This is all I have right now, reporting back soon!