Why are my output_tokens increasing with the index of the submission, when there is no such pattern in input size?

Hi everyone,

I am encountering a strange issue when using Python for batch submissions (through the Azure API). I've noticed a cyclical pattern: the number of output tokens seems to correlate with the index of the text, rather than with the input size.

In Python, I create a list of dictionaries, which then gets converted to JSONL. I then submit this to the client. My intention is that each note is completely independent of the rest.
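For context, here is a minimal sketch of how I build the batch file. The field names (`custom_id`, `method`, `url`, `body`) follow the general shape of a batch request line; the deployment name and note texts are placeholders, not my actual data:

```python
import json

# Placeholder note texts -- stand-ins for my real, independent notes.
notes = ["first note", "second note", "third note"]

# One request dict per note. Each request is meant to be fully
# self-contained, with no shared state between notes.
requests = [
    {
        "custom_id": f"note-{i}",
        "method": "POST",
        "url": "/chat/completions",
        "body": {
            "model": "my-deployment",  # placeholder deployment name
            "messages": [{"role": "user", "content": note}],
        },
    }
    for i, note in enumerate(notes)
]

# Serialize to JSONL: one JSON object per line.
jsonl_payload = "\n".join(json.dumps(r) for r in requests)
```

Since every line is an independent JSON object, I would expect no request to influence the output length of any other.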

The issue: the number of output tokens steadily increases as the index of the text increases. This continues for around 100 notes, at which point the token count sharply drops and the pattern repeats (when plotted, it looks like a sawtooth).

Troubleshooting: I initially suspected this was due to how my data was sorted, but the pattern persists even after randomly shuffling the list of texts before submission. I have also confirmed that there is no bug in how I build the list of dictionaries, and that the per-text input sizes do not themselves follow a sawtooth pattern.

Has anyone seen this behavior before or have any ideas on what might be causing this “reset” every 100 requests?