Different prompt token counts between the OpenAI tokenizer, Azure OpenAI, and the OpenAI API via the Python library

I’m in the same exact boat, trying to figure out the root cause of this.

I’ve tried various parameters. Using semantic search with only the top 3 or top 5 documents, a basic 5-10 token question, and a ~100 token system prompt, the responses from Azure OpenAI report that I am using roughly 6,000 prompt tokens on average (GPT-4, 8K).
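For anyone comparing numbers: a quick way to sanity-check the service-reported usage against a local count is tiktoken. This is just a minimal sketch; the placeholder strings stand in for your actual prompt text, and it only counts the text you send, not anything the retrieval layer injects:

```python
import tiktoken

# GPT-4 uses the cl100k_base encoding
enc = tiktoken.encoding_for_model("gpt-4")

system_prompt = "..."  # placeholder for your ~100-token system prompt
question = "..."       # placeholder for your 5-10 token question

# Count only the tokens in the text we explicitly send
local_count = len(enc.encode(system_prompt)) + len(enc.encode(question))
print(f"Local count (prompt text only): {local_count} tokens")
```

The gap between this local count and the usage the API reports should be what the retrieved documents (plus any hidden orchestration prompts) are adding.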

After chunking my data into 200-token chunks, I was able to reduce prompt usage to about 4,000 tokens.
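In case it helps anyone reproduce this, fixed-size token chunking can look something like the sketch below (the `chunk_text` helper and the 200-token limit are illustrative, not the exact code I ran):

```python
import tiktoken


def chunk_text(text: str, max_tokens: int = 200) -> list[str]:
    """Split text into chunks of at most max_tokens tokens each."""
    enc = tiktoken.encoding_for_model("gpt-4")
    token_ids = enc.encode(text)
    # Slice the token ID list into fixed-size windows, then decode
    # each window back into a text chunk
    return [
        enc.decode(token_ids[i : i + max_tokens])
        for i in range(0, len(token_ids), max_tokens)
    ]
```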

Still, this seems extremely high, and I cannot pinpoint what I am doing wrong.

Is this what you are also seeing?