Hello, to monitor my rate limits and control the speed at which I send requests, I use the headers found in the responses. When I send requests to non-fine-tuned models such as GPT-4-Turbo-Preview, I receive the headers. However, this is no longer the case with fine-tuned models. Does anyone know the cause of the problem and what I could do? I use the client.chat.completions.with_raw_response function.
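For reference, here is a minimal sketch of how I read those headers. The `extract_rate_limits` helper and its None fallback for missing headers are my own, not part of the OpenAI SDK; the commented-out call shows where `with_raw_response` fits in.

```python
# Sketch: pull the x-ratelimit-* headers out of a raw response.
# extract_rate_limits is a hypothetical helper, not an SDK function.

RATE_LIMIT_KEYS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-reset-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
    "x-ratelimit-reset-tokens",
)

def extract_rate_limits(headers):
    """Return the rate-limit headers that are present; absent ones map to None."""
    return {key: headers.get(key) for key in RATE_LIMIT_KEYS}

# Typical usage with the OpenAI Python SDK (requires a live API key):
# raw = client.chat.completions.with_raw_response.create(
#     model="gpt-4-turbo-preview",
#     messages=[{"role": "user", "content": "ping"}],
# )
# limits = extract_rate_limits(raw.headers)   # dict of the headers above
# completion = raw.parse()                    # the usual ChatCompletion object
```

With fine-tuned models, the dict comes back with every value set to None, which is how I noticed the headers are missing.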
They used to include the x-ratelimit-* headers in responses from fine-tuned models until Thursday, May 2nd, when this feature seems to have broken. I suspect this is a bug that OpenAI introduced with a release last week.