Can we train with XML or JSON?

pritamrose · August 10, 2023, 4:29am

Hello All,
I am a beginner with OpenAI and ChatGPT. I am trying to train a model with non-text data. Please help me clarify the doubts below.

I have a use case where I want to train a model with XML/JSON. I will have more than 100 XML/JSON files covering various use cases. Is it possible to train a model with XML/JSON? If yes, how do I train it? Do I need to provide a unique title in each file? Does the title play an important role in training on the dataset? Please clarify.

My second use case is:
I want to join two XML/JSON documents, e.g. I have one XML/JSON to retrieve data and another XML/JSON to filter this data. If I ask the model in the prompt to retrieve the data and filter it, will it be able to join these two XML/JSON documents?

One more clarification:
If my file is large, will OpenAI be able to respond with the entire content? As far as I know, the token limit in ChatGPT 3.5 is 4000, which is about 3000 words. If the response is larger than this, what will the response look like?

Foxalabs · August 10, 2023, 8:13am

Hi and welcome to the developer forum!

Can you give some examples of the data you intend to train on, and also of the tasks you need the model to perform?

Also, as of right now, the largest context size of a model that can be trained is 4000 tokens. There are plans to have a trainable 8k model by the end of the year, and possibly 16k and 32k, but I am not sure about those last two.
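
To make the size limit concrete, here is a minimal sketch of how JSON use cases could be turned into text training records and checked against a 4000-token budget. It is not an official recipe. It assumes a JSONL file of prompt/completion records (check the current fine-tuning docs for the exact shape your model expects), and it estimates tokens at roughly 4 characters per token, which is only a rule of thumb and tends to undercount for JSON. The names build_record, to_jsonl and estimate_tokens are made up for illustration.

```python
import json
import math

MAX_TOKENS = 4000       # trainable context size mentioned above
CHARS_PER_TOKEN = 4     # assumption: rough rule of thumb, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def build_record(task: str, payload: dict) -> dict:
    """Turn one use case into a prompt/completion record (assumed format).

    The JSON payload is serialized to a string, because the model only sees text.
    """
    return {"prompt": task, "completion": json.dumps(payload, separators=(",", ":"))}


def to_jsonl(records: list[dict], max_tokens: int = MAX_TOKENS) -> tuple[str, list[int]]:
    """Return JSONL text for records that fit, plus indexes of records that do not."""
    lines, too_long = [], []
    for i, rec in enumerate(records):
        size = estimate_tokens(rec["prompt"]) + estimate_tokens(rec["completion"])
        if size > max_tokens:
            too_long.append(i)   # needs splitting or trimming before training
        else:
            lines.append(json.dumps(rec))
    return "\n".join(lines), too_long
```

Any record reported in the second return value is over the estimated budget and would have to be split or shortened. For real counts, swap estimate_tokens for a proper tokenizer.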
