GPT-4o mini performing much worse than GPT-3.5-turbo-16k

I developed an in-house proprietary program for an agency that uses ChatGPT API calls to generate XeLaTeX code snippets. Because of the large token counts needed both within single requests and across consecutive ones, we opted for the gpt-3.5-turbo-16k model. Other than a few glitches, largely correctable with procedural post-processing to mitigate syntax errors, this was working for us.
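For context, the pipeline is simple. Here is a minimal sketch of the generation and cleanup step (function names and prompts are illustrative, not our production code; it assumes the openai>=1.0 Python SDK):

```python
import re
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_snippet(prompt: str, model: str = "gpt-3.5-turbo-16k") -> str:
    """Ask the model for a bare XeLaTeX snippet."""
    resp = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": "Return only a XeLaTeX snippet, no prose."},
            {"role": "user", "content": prompt},
        ],
    )
    return resp.choices[0].message.content or ""


def quick_syntax_checks(tex: str) -> list[str]:
    """Procedural checks of the kind that caught the old model's glitches."""
    problems = []
    if tex.count("{") != tex.count("}"):
        problems.append("unbalanced braces")
    begins = re.findall(r"\\begin\{([^}]+)\}", tex)
    ends = re.findall(r"\\end\{([^}]+)\}", tex)
    if sorted(begins) != sorted(ends):
        problems.append("mismatched \\begin/\\end environments")
    return problems
```

Switching models is then just a matter of passing model="gpt-4o-mini" to the same call.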

Now we have been told to switch to the GPT-4o mini model, and we did for several days. But we had to switch back for now, because its output was rife with syntax errors, far more than we ever saw in the XeLaTeX output of the 3.5 16k model.

Why is this the case? I can provide exact examples generated by the model. These syntax errors are more than just typos or the kind of mistake a human would plausibly make; it is as though the model was trained incorrectly.
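For anyone who wants to reproduce the comparison, this is roughly how we quantify it: wrap each generated snippet in a minimal document and count compile failures. (The harness below is a simplified stand-in for our actual tooling; it assumes xelatex is on PATH.)

```python
import subprocess
import tempfile
from pathlib import Path


def wrap(snippet: str) -> str:
    """Embed a snippet in a minimal standalone document."""
    return ("\\documentclass{article}\n\\begin{document}\n"
            + snippet + "\n\\end{document}\n")


def compiles(snippet: str) -> bool:
    """Return True if xelatex compiles the wrapped snippet cleanly."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "snippet.tex").write_text(wrap(snippet))
        result = subprocess.run(
            ["xelatex", "-interaction=nonstopmode", "-halt-on-error",
             "snippet.tex"],
            cwd=tmp, capture_output=True,
        )
        return result.returncode == 0


def failure_rate(snippets: list[str]) -> float:
    """Fraction of snippets that fail to compile."""
    return sum(not compiles(s) for s in snippets) / len(snippets)
```

Running the same batch of prompts through both models and comparing failure_rate makes the regression obvious on our workload.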

We are alarmed. With the deadline for the full shutdown of the gpt-3.5-turbo-16k model looming in September, we are worried that we will no longer be able to offer OpenAI's platform as a solution to our clients beyond that point.

OpenAI, for your reference, our software has earned you over $5,000 in token usage, at least. The fact that there is no channel for technical support baffles me.