Different prompt token counts between the OpenAI Tokenizer / Azure OpenAI and the OpenAI API via the Python library

Hi!

I’m testing the “bring your own data” option with ChatGPT, and I noticed that the number of prompt tokens differs between the OpenAI Tokenizer (or the Azure OpenAI portal) and the OpenAI Python library (openai==1.7.0 or openai==0.27.7): via the API, the usage field reports 4x to 5x more prompt tokens.
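A quick way to see the gap is to count the tokens of just your own messages locally and compare that with the `prompt_tokens` the API reports. A minimal sketch, using a rough ~4-characters-per-token heuristic (the user question below is a made-up stand-in; an exact count would need a tokenizer library such as tiktoken):

```python
# Rough local estimate of prompt tokens (~4 chars/token for English text),
# to compare against usage.prompt_tokens returned by the API.
def estimate_tokens(messages):
    text = "".join(m["role"] + m["content"] for m in messages)
    return len(text) // 4  # heuristic, not an exact tokenizer

messages = [
    {"role": "system", "content": "You are an AI assistant that helps people find information."},
    {"role": "user", "content": "Where can I find the HR handbook?"},  # stand-in question
]
print(estimate_tokens(messages))
```

If this local estimate is in the tens of tokens while the API bills thousands, the difference is coming from content added server-side, not from your own messages.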

I defined the AzureCognitiveSearch connector to search my documents. Here is my chat-completion example:

message_text = [
    {"role": "system", "content": "You are an AI assistant that helps people find information."},
    {"role": "user", "content": myQuestion},
]

completion = openai.ChatCompletion.create(
    engine="gpt-35-turbo",
    messages=message_text,
    temperature=0.7,
    max_tokens=800,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    extra_body={
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": "<my Azure AI Search endpoint>",
                    "key": "XXXXXXXXXXX",
                    "indexName": "XXXXXXXX-vector-index",
                    "embeddingDeploymentName": "text-embedding-ada-002",
                    "embeddingEndpoint": "<my Azure OpenAI endpoint>",
                    "embeddingKey": "XXXXXXXXXX",
                    "fieldsMapping": {
                        "contentFields": ["Content"],
                        "titleField": "title",
                        "urlField": "HTML_URL",
                        "vectorFields": ["ContentVector"],
                    },
                    "queryType": "vectorSemanticHybrid",
                    "semanticConfiguration": "XXXXXXXX-semantic-config",
                    "inScope": False,
                    "topNDocuments": 5,
                    "strictness": 3,
                }
            }
        ]
    },
)
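For what it’s worth, when dataSources is set, the service first retrieves chunks from Azure AI Search and injects them (plus its own grounding instructions) into the prompt ahead of your messages, and all of that injected text is billed as prompt tokens. The sketch below mimics the response shape documented for “on your data”; treat the exact field names (`context`, the `tool` message) as an assumption, not a guarantee:

```python
# Sketch: the retrieved material comes back in a tool message under the
# assistant message's `context` field (assumed shape). Its size explains
# why prompt_tokens far exceeds the tokens in your own messages.
sample_response = {
    "choices": [{
        "message": {
            "content": "The handbook is on the intranet [doc1].",
            "context": {
                "messages": [
                    {"role": "tool",
                     "content": '{"citations": [{"content": "...retrieved chunk text..."}]}'}
                ]
            },
        }
    }],
    "usage": {"prompt_tokens": 6000, "completion_tokens": 40, "total_tokens": 6040},
}

# Everything in the tool message(s) was counted as part of your prompt:
tool_messages = sample_response["choices"][0]["message"]["context"]["messages"]
injected = sum(len(m["content"]) for m in tool_messages)
print(sample_response["usage"]["prompt_tokens"], "prompt tokens;",
      injected, "chars of injected context")
```

Printing the tool message contents from a real response would show exactly which chunks the connector passed to the model.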

Does anyone have the same problem, or any clue as to why this happens?
I tried to check what the AzureCognitiveSearch connector passes to the model, but I couldn’t find any relevant information.

Thanks!

I’m in exactly the same boat, trying to figure out the root cause of this.

I’ve tried various parameters: semantic search with only the top 5 or top 3 documents, asking a basic question (5–10 tokens, plus a ~100-token system prompt). The responses from Azure OpenAI tell me I’m using 6,000 prompt tokens on average (GPT-4, 8K).
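A back-of-envelope budget (all per-item figures below are my assumptions, not measured values) shows how retrieval can plausibly dominate the prompt even with a tiny question:

```python
# Rough budget for an "on your data" prompt; every figure here is an
# assumption for illustration, not a measured value.
system_prompt = 100        # my ~100-token system message
question = 10              # the 5-10 token user question
connector_overhead = 500   # grounding instructions the connector adds (assumed)
docs = 5                   # topNDocuments
tokens_per_chunk = 1000    # unchunked source documents can be this large (assumed)

prompt_budget = system_prompt + question + connector_overhead + docs * tokens_per_chunk
print(prompt_budget)  # → 5610
```

With unchunked documents, five retrieved passages alone can account for thousands of tokens, which lines up with the ~6,000 I’m seeing.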

After chunking my data (200-token chunks), I was able to reduce prompt tokens to about 4,000.
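The chunking itself is straightforward; here is a minimal sketch of splitting documents into ~200-token pieces before indexing, using a rough 0.75-words-per-token heuristic (a tokenizer library would give exact counts):

```python
# Split a document into word-bounded chunks of at most max_words words.
# 150 words ≈ 200 tokens under the ~0.75 words/token heuristic (assumed).
def chunk_text(text: str, max_words: int = 150) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

doc = "lorem ipsum dolor " * 300   # a 900-word stand-in document
chunks = chunk_text(doc)
print(len(chunks), max(len(c.split()) for c in chunks))  # → 6 150
```

Smaller chunks mean each of the topNDocuments hits contributes ~200 tokens instead of a whole document, which is why chunking brought my prompt size down.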

Still, this seems extremely high, and I can’t pinpoint what I’m doing wrong.

Is this what you are also seeing?