If you hit a rate limit on a single request, you are sending far too much data. Had you only exceeded the model's context limit of 4k tokens (or 16k for gpt-3.5-turbo-16k) without also exceeding the per-minute rate limit of around 100,000 tokens, you would have gotten a context-length error instead — so the data you sent must have been huge.
The model can only accept and understand a limited amount of data at once — about 2,000 words, measured in tokens. You must find a way to send your data in small batches.
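As a rough sketch of batching, here is one way to split a long document into pieces that fit under the model limit. This uses a simple whitespace word count as a stand-in for real tokenization (an assumption — for exact counts you would use a tokenizer such as tiktoken, and the `max_words` value of 1500 is just a conservative margin below the limit):

```python
def chunk_words(text, max_words=1500):
    """Split text into chunks of at most max_words whitespace-separated words.

    A word count only approximates the token count the API actually
    measures, so leave plenty of headroom below the model's limit.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

# Example: a ~4000-word document becomes three small batches.
doc = "lorem " * 4000
batches = chunk_words(doc)
print(len(batches))  # -> 3
```

Each chunk can then be sent as its own request, with any per-chunk results combined afterward.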
If you are using an account with monthly billing but have never actually been billed because you barely use the service, that explains why you don't have access to gpt-4 yet: you must be a "paying customer".
Here's a dangerous script to quickly spend $1 of API usage on your account before the end of the month. Then, when the charge goes to your payment method in a few weeks, you should be granted access shortly after.
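A minimal sketch of what such a script could look like, using the openai Python package's pre-1.0 `ChatCompletion` endpoint. The $0.0015/$0.002 per 1K token prices for gpt-3.5-turbo are an assumption based on the published price list and may change, so the running total is only an estimate — actual billing is what your usage page shows:

```python
# Assumed gpt-3.5-turbo prices per 1K input/output tokens (verify against
# the current price list before running — this WILL incur real charges).
PRICE_IN, PRICE_OUT = 0.0015, 0.002
TARGET_USD = 1.00

def cost_usd(prompt_tokens, completion_tokens):
    """Estimated cost of one request at the assumed prices."""
    return (prompt_tokens * PRICE_IN + completion_tokens * PRICE_OUT) / 1000.0

def burn():
    # Imported here so the cost math above can be used without the package.
    import openai  # requires `pip install "openai<1.0"`
    openai.api_key = "sk-..."  # your real API key goes here

    spent = 0.0
    while spent < TARGET_USD:
        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user",
                       "content": "Write 500 words about clouds."}],
            max_tokens=1000,
        )
        usage = resp["usage"]
        spent += cost_usd(usage["prompt_tokens"], usage["completion_tokens"])
        print(f"estimated spend so far: ${spent:.4f}")

if __name__ == "__main__":
    burn()
```

At those rates, reaching $1 takes on the order of hundreds of requests, so expect the loop to run for a while and watch your usage dashboard rather than trusting the estimate alone.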