Performing a task in bulk: the model goes haywire at times

I have a clear prompt with a given schema, which was generated by ChatGPT. Why can't that idiot AI stick to the prompt? I have a list of 400,000 words. I process them in batches of 30 to stay within the maximum number of tokens, and sometimes the thing just goes crazy.
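A minimal sketch of that kind of batching, assuming the words sit in a plain Python list. The helper name `batched` and the constant `BATCH_SIZE` are illustrative, not taken from the actual script:

```python
from typing import Iterator, List

BATCH_SIZE = 30  # batch size mentioned above; adjust to the token budget


def batched(words: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` words."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(words), size):
        yield words[start:start + size]
```

Each yielded batch would then be sent in its own API call, so a bad answer only affects 30 words and can be retried on its own.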
Isn't there a way to tell OpenAI that this is exactly what I want, and have it stick to the correct case?
The issue is that I want to skip words that are not a base word. For instance, CATS should not be generated, as the base word is CAT; the same goes for verbs. But sometimes it takes a conjugation as a base word and skips the base word itself. It works perfectly well in 70% of the API calls, but for no apparent reason it just goes crazy at times.
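One possible workaround, sketched under assumptions: check every answer locally and retry the batch when it looks wrong. Everything here is hypothetical. `call_model` stands for whatever function sends a batch to the API and returns the words it kept, `vocabulary` is the full word list as an upper-case set, and the suffix test is a crude English-only heuristic that will produce some false positives:

```python
from typing import Callable, List, Set

SUFFIXES = ("ING", "ED", "ES", "S")  # crude heuristic, English only


def suspicious_words(returned: List[str], vocabulary: Set[str]) -> List[str]:
    """Return words that look like inflected forms of another listed word."""
    flagged = []
    for word in returned:
        upper = word.upper()
        for suffix in SUFFIXES:
            stem = upper[: -len(suffix)]
            if upper.endswith(suffix) and stem in vocabulary:
                flagged.append(upper)
                break
    return flagged


def process_batch(
    batch: List[str],
    vocabulary: Set[str],
    call_model: Callable[[List[str]], List[str]],  # hypothetical API wrapper
    max_attempts: int = 3,
) -> List[str]:
    """Call the model, validate its answer, and retry if it fails the checks."""
    allowed = {w.upper() for w in batch}
    for _ in range(max_attempts):
        returned = [w.upper() for w in call_model(batch)]
        # Reject answers containing words that were never in the batch.
        outside = [w for w in returned if w not in allowed]
        if not outside and not suspicious_words(returned, vocabulary):
            return returned
    raise RuntimeError(f"no valid answer after {max_attempts} attempts")
```

With a vocabulary containing both CAT and CATS, an answer that keeps CATS is flagged and the batch is sent again. This does not make the model deterministic; it only catches the bad 30% before it reaches the output.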
I had the same issue with a translation API: 97% of the results are translated into the right language, and for no apparent reason I get some translations coming out in Chinese, Spanish, or Italian.