Fine-tuning a model with a JSON schema

Hey everyone,

I’m planning to extract structured data from a collection of free-text records: nearly 50,000 entries, each averaging around 200 tokens. I’ve created a comprehensive JSON schema (about 1,300 lines) and tested it in Assistant mode in the Playground. However, the base GPT-4o-mini doesn’t give me the accuracy and consistency I need, so fine-tuning seems necessary.

So I prepared 100 examples that follow the JSON schema to fine-tune the model with, but I don’t know how to incorporate the schema into the JSONL structure, both for the training phase and for the actual data extraction calls. My training data currently looks roughly like the sketch below.
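For reference, each line of my training JSONL is currently shaped like this (a simplified sketch; the system prompt, record text, and extracted fields are placeholders for my real data):

```python
import json

# One fine-tuning example in the chat format (simplified sketch).
# The open question: where does the ~1,300-line JSON schema go?
example = {
    "messages": [
        {"role": "system", "content": "Extract the structured fields from the record."},
        {"role": "user", "content": "<free-text record, roughly 200 tokens>"},
        {
            "role": "assistant",
            # The target output: a JSON object that conforms to my schema.
            "content": json.dumps({"field_a": "value", "field_b": 42}),
        },
    ]
}

with open("train.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(example) + "\n")
```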
In the OpenAI Cookbook, in the section “Introduction to Structured Outputs”, the schema is included in every single example. Besides being redundant, this would skyrocket the cost of both training and running the model.
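For context, this is the pattern as I understand it from the cookbook, with the full schema attached to every request via `response_format` (schema heavily abbreviated here):

```python
from openai import OpenAI

client = OpenAI()

completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Extract the structured fields from the record."},
        {"role": "user", "content": "<free-text record>"},
    ],
    # The entire ~1,300-line schema would be sent with every call like this.
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "record_extraction",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {"field_a": {"type": "string"}},
                "required": ["field_a"],
                "additionalProperties": False,
            },
        },
    },
)

print(completion.choices[0].message.content)
```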

Any suggestions to tackle this problem efficiently?

Thanks in advance