Hello all,
I am running into a situation that I would love to troubleshoot. I have found many similar posts about this topic, but I have struggled to apply them to my situation, so thank you in advance for any help you can give me!
Note: I am calling GPT-4.1 mini, and the company I work for has a rate limit of 30,000 tokens per minute (TPM).
I am currently using the OpenAI API in the following way: I have a bunch of .txt files that I read in, insert each file's content into a prompt, and send to the API one at a time to get a structured output. The .txt files range in size but generally are not very big at all. In terms of tokens, the bigger files combined with the prompt contain at most ~6,000 tokens, and some are as small as ~3,000. Most of the time this works fine; however, after running a decent number of files I started to run into the following error somewhat randomly:
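For reference, here is a simplified sketch of the loop I'm running (the prompt template, schema, and folder name are placeholders, not my real code):

```python
from pathlib import Path

# Placeholder template; my real prompt is longer but works the same way.
PROMPT_TEMPLATE = "Extract the key fields from this document:\n\n{doc}"

def build_prompt(text):
    """Insert one .txt file's content into the prompt template."""
    return PROMPT_TEMPLATE.format(doc=text)

def process_file(path):
    """Send one file to the API and return the parsed structured output."""
    # Third-party imports kept inside the function so the pure helper
    # above can be used on its own; the schema below is a placeholder.
    from openai import OpenAI
    from pydantic import BaseModel

    class Extraction(BaseModel):  # placeholder schema, not my real one
        summary: str

    client = OpenAI()
    completion = client.beta.chat.completions.parse(
        model="gpt-4.1-mini",
        messages=[{"role": "user",
                   "content": build_prompt(Path(path).read_text())}],
        response_format=Extraction,
    )
    return completion.choices[0].message.parsed

# One file at a time, e.g.:
# for path in sorted(Path("docs").glob("*.txt")):
#     result = process_file(path)
```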
Could not parse response content as the length limit was reached - CompletionUsage(completion_tokens=32768, prompt_tokens=5141, total_tokens=37909, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=0, audio_tokens=0, reasoning_tokens=0, rejected_prediction_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=2048))
One thing to note is that in my code, if this error pops up, the code sleeps for 30 seconds and then tries that file again, re-running the same file a maximum of 5 times before it moves on. Sometimes re-running the file fixes the issue and the file is processed, but other times all 5 attempts fail to return a valid output. For example, one of the times I tried to recreate this error, the same file that caused the CompletionUsage error above ran successfully after 3 attempts and had the following usage information, but this doesn't happen every time:
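In case it helps, my retry behavior boils down to a small wrapper like this (a sketch; `make_request` stands in for my actual API call, and the 30-second wait and 5 attempts match what I described):

```python
import time

def call_with_retries(make_request, max_attempts=5, wait_seconds=30):
    """Call make_request up to max_attempts times, sleeping between tries.

    make_request is any zero-argument callable that either returns a
    parsed result or raises (e.g. the length-limit error).  Returns the
    result, or None if every attempt failed (caller moves on to the
    next file).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return make_request()
        except Exception as exc:
            print(f"attempt {attempt} failed: {exc}")
            if attempt < max_attempts:
                time.sleep(wait_seconds)
    return None
```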
completion.usage {'completion_tokens': 916, 'prompt_tokens': 5141, 'total_tokens': 6057, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 5120}}
What is also strange about this is that, as you can see, the completion token count is dramatically lower when the same file runs successfully (916 vs. 32,768). I tried setting max_completion_tokens to 23,000, but the same error occurred, just now at the new limit. This error is both consistent and inconsistent: the same files tend to throw the CompletionUsage error, yet sometimes those files cause no issues at all. And even when they do throw the error, re-running sometimes fixes it and other times doesn't. From what I can tell, there is nothing strange or different about the files that sometimes cause these issues versus the ones that don't; like I said, these files are not big, and I have looked at and compared the files that do this against the ones that don't.
What I was really hoping to find is a way to have the API still return an output of some kind when this error occurs, so I can see what the model's output looks like, because clearly there is a problem with it. Without that information I feel lost, and as of now I haven't found a way to do that. If it is not possible, I was wondering if anyone has suggestions as to what could be happening and/or how to troubleshoot.
Edit: I completely forgot to add that I also want to know whether or not we are still being charged for the tokens when the CompletionUsage error occurs!