Using the vector store will quickly exhaust the rate limits for any model.
I just ran a couple of requests in the playground and hit the rate limits several times. This is clearly caused by the 20 relevant chunks (of about 800 chars each) that get injected into the prompt for every search against the vector store. To be fair, that is (at minimum) what you want from a search over a large text corpus.
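To get a feel for why this blows through the limits, here is a rough back-of-the-envelope calculation of the token overhead per search. The chunk count and size are the ones from above; the ~4 characters per token is just a common rule of thumb for English text, not an exact figure:

```python
# Rough estimate of the prompt tokens that retrieval injects per request.
CHUNKS_PER_SEARCH = 20   # default number of retrieved chunks (from the post above)
CHARS_PER_CHUNK = 800    # observed chunk size
CHARS_PER_TOKEN = 4      # rough heuristic, not an exact tokenizer count

def retrieval_tokens(chunks: int = CHUNKS_PER_SEARCH,
                     chars_per_chunk: int = CHARS_PER_CHUNK) -> int:
    """Approximate tokens added to the prompt by the retrieved chunks alone."""
    return chunks * chars_per_chunk // CHARS_PER_TOKEN

print(retrieval_tokens())  # ~4000 tokens per search, before the query itself
```

So every search adds on the order of 4,000 prompt tokens before you even count the user's question or the system prompt, and a handful of those per minute is enough to trip a tokens-per-minute limit on lower tiers. If the API exposes a way to cap the number of returned chunks (for example a `max_num_results` option on the search tool), lowering it would cut this overhead proportionally, at the cost of recall.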
So the vector store seemed like a nice addition, since you don't have to set up a database and compute the embeddings yourself. But it is simply not usable if every other request trips a rate limit.