What are the default rate limits for the File API?

I’ve found a lot of information about Usage Tiers and rate limits for all the different models, but what are the rate limits for the File API? Strictly speaking, I’m not asking about vectorizing files, just the upload and delete operations. I can’t seem to find any information on that.

https://platform.openai.com/docs/api-reference/files

There is none listed. The practical limitation will more likely be the upload bandwidth a single client gets to the server IP than any imposed rate limit on calls. Once you’re already at maximum throughput, there is no point in opening more parallel connections.

The API can accept hundreds of AI API calls per second from a client, and those require more computation and longer-lived connections than a POST of multipart/form-data.

Calls made to Assistants endpoint methods have far lower API call limits, though.

Getting more technical, if the files are small:

(after 3,000 tokens of what I say to o1-preview being rephrased and spat back at me)…

Maximum upload rate:

Without optimization: limited to tens of thousands of uploads per minute due to port exhaustion.

With persistent connections and HTTP/2: potentially millions of uploads per minute, by minimizing new connections and maximizing the use of each one (see the sketch below). At that point, socket and port limits become negligible compared to bandwidth and processing capability.
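For illustration, a rough sketch (not an official example) of reusing one connection pool for many small uploads with the Python SDK; the file names are made up, and http2=True needs the httpx[http2] extra installed:

```python
# Sketch: share one HTTP/2-capable connection pool across many small uploads,
# so each upload avoids a fresh TCP/TLS handshake. File names are hypothetical.
import httpx
from openai import OpenAI

http_client = httpx.Client(http2=True, timeout=60.0)
client = OpenAI(http_client=http_client)  # API key read from OPENAI_API_KEY

paths = ["doc_001.txt", "doc_002.txt", "doc_003.txt"]  # made-up files
for path in paths:
    with open(path, "rb") as f:
        uploaded = client.files.create(file=f, purpose="assistants")
    print(uploaded.id)

http_client.close()
```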

Okay, I thought something similar; however, I’m still seeing requests fail with a 503 error when I hit it with ~100 requests in parallel. It looks like the File APIs are not scaled that well, which makes sense since File Search is still in beta. I wish the SDKs would handle failures more gracefully… And without any known numbers, it’s hard to implement any batching or rate-limit logic…
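For now I’m just guessing at the numbers myself, something along these lines (a sketch with made-up concurrency and retry values, capping parallelism and backing off on 5xx responses):

```python
# Sketch: bounded concurrency plus exponential backoff on 5xx errors.
# CONCURRENCY and MAX_ATTEMPTS are guesses to tune, not documented limits.
import time
import random
from concurrent.futures import ThreadPoolExecutor
from openai import OpenAI, APIStatusError

client = OpenAI()
CONCURRENCY = 8        # well below the ~100 parallel requests that hit 503s
MAX_ATTEMPTS = 5

def upload_with_backoff(path: str):
    for attempt in range(MAX_ATTEMPTS):
        try:
            with open(path, "rb") as f:
                return client.files.create(file=f, purpose="assistants")
        except APIStatusError as e:
            if e.status_code >= 500 and attempt < MAX_ATTEMPTS - 1:
                # Sleep 1s, 2s, 4s, ... plus jitter, then retry.
                time.sleep(2 ** attempt + random.random())
            else:
                raise

paths = [f"chunk_{i}.txt" for i in range(100)]  # hypothetical files
with ThreadPoolExecutor(max_workers=CONCURRENCY) as pool:
    results = list(pool.map(upload_with_backoff, paths))
```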

The 503 comes from Cloudflare. Firewall policies set on the service may be limiting the rate you can achieve, since Cloudflare uses connection trust and is meant to prevent DDoS.

The files endpoint has been in use since fine-tuning was first available for GPT-3.

The Python SDK client does retry 2 times by default; that’s an instantiation parameter you can pass, for example if you want to disable the retries and manage your own queue in code.
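For example (assuming the current openai-python v1 client, where the parameter is max_retries):

```python
# Sketch: turn off the SDK's built-in retries so failures surface immediately
# and your own queue/backoff logic decides what to re-attempt.
from openai import OpenAI

client = OpenAI(max_retries=0)  # default is 2 retries on retryable errors

# It can also be overridden per call site:
no_retry_client = client.with_options(max_retries=0)
```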