Different prompt token counts between the OpenAI Tokenizer / Azure OpenAI and the OpenAI API via the Python library

Hi!

I’m testing the “bring your own data” option with ChatGPT, and I noticed that the number of prompt tokens differs between the OpenAI Tokenizer (or the Azure OpenAI portal) and the OpenAI Python library (openai==1.7.0 or openai==0.27.7): via the API, the usage field reports 4x to 5x more prompt tokens.
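A quick way to see the gap is to count the tokens of just your own messages locally and compare that with the `prompt_tokens` the API reports. A minimal sketch, using a rough ~4-characters-per-token heuristic (the user question below is a made-up stand-in; an exact count would need a tokenizer library such as tiktoken):

```python
# Rough local estimate of prompt tokens (~4 chars/token for English text),
# to compare against usage.prompt_tokens returned by the API.
def estimate_tokens(messages):
    text = "".join(m["role"] + m["content"] for m in messages)
    return len(text) // 4  # heuristic, not an exact tokenizer

messages = [
    {"role": "system", "content": "You are an AI assistant that helps people find information."},
    {"role": "user", "content": "Where can I find the HR handbook?"},  # stand-in question
]
print(estimate_tokens(messages))
```

If this local estimate is in the tens of tokens while the API bills thousands, the difference is coming from content added server-side, not from your own messages.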

I defined the AzureCognitiveSearch connector to search my documents. Here is my chat-completion example:

message_text = [
    {"role": "system", "content": "You are an AI assistant that helps people find information."},
    {"role": "user", "content": myQuestion},
]

completion = openai.ChatCompletion.create(
    engine="gpt-35-turbo",
    messages=message_text,
    temperature=0.7,
    max_tokens=800,
    top_p=0.95,
    frequency_penalty=0,
    presence_penalty=0,
    stop=None,
    extra_body={
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": "<my Azure AI Search endpoint>",
                    "key": "XXXXXXXXXXX",
                    "indexName": "XXXXXXXX-vector-index",
                    "embeddingDeploymentName": "text-embedding-ada-002",
                    "embeddingEndpoint": "<my Azure OpenAI endpoint>",
                    "embeddingKey": "XXXXXXXXXX",
                    "fieldsMapping": {
                        "contentFields": ["Content"],
                        "titleField": "title",
                        "urlField": "HTML_URL",
                        "vectorFields": ["ContentVector"],
                    },
                    "queryType": "vectorSemanticHybrid",
                    "semanticConfiguration": "XXXXXXXX-semantic-config",
                    "inScope": False,
                    "topNDocuments": 5,
                    "strictness": 3,
                }
            }
        ]
    },
)
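For what it’s worth, when dataSources is set, the service first retrieves chunks from Azure AI Search and injects them (plus its own grounding instructions) into the prompt ahead of your messages, and all of that injected text is billed as prompt tokens. The sketch below mimics the response shape documented for “on your data”; treat the exact field names (`context`, the `tool` message) as an assumption, not a guarantee:

```python
# Sketch: the retrieved material comes back in a tool message under the
# assistant message's `context` field (assumed shape). Its size explains
# why prompt_tokens far exceeds the tokens in your own messages.
sample_response = {
    "choices": [{
        "message": {
            "content": "The handbook is on the intranet [doc1].",
            "context": {
                "messages": [
                    {"role": "tool",
                     "content": '{"citations": [{"content": "...retrieved chunk text..."}]}'}
                ]
            },
        }
    }],
    "usage": {"prompt_tokens": 6000, "completion_tokens": 40, "total_tokens": 6040},
}

# Everything in the tool message(s) was counted as part of your prompt:
tool_messages = sample_response["choices"][0]["message"]["context"]["messages"]
injected = sum(len(m["content"]) for m in tool_messages)
print(sample_response["usage"]["prompt_tokens"], "prompt tokens;",
      injected, "chars of injected context")
```

Printing the tool message contents from a real response would show exactly which chunks the connector passed to the model.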

Does anyone have the same problem, or any clue as to why this happens?
I tried to check what the AzureCognitiveSearch connector passes to the model, but I couldn’t find any relevant information.

Thanks!

I’m in exactly the same boat, trying to figure out the root cause of this.

I’ve tried various parameters: semantic search with only the top 5 or top 3 documents, asking a basic question (5–10 tokens, plus a ~100-token system prompt). The responses from Azure OpenAI tell me I’m using 6,000 prompt tokens on average (GPT-4, 8K).
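A back-of-envelope budget (all per-item figures below are my assumptions, not measured values) shows how retrieval can plausibly dominate the prompt even with a tiny question:

```python
# Rough budget for an "on your data" prompt; every figure here is an
# assumption for illustration, not a measured value.
system_prompt = 100        # my ~100-token system message
question = 10              # the 5-10 token user question
connector_overhead = 500   # grounding instructions the connector adds (assumed)
docs = 5                   # topNDocuments
tokens_per_chunk = 1000    # unchunked source documents can be this large (assumed)

prompt_budget = system_prompt + question + connector_overhead + docs * tokens_per_chunk
print(prompt_budget)  # → 5610
```

With unchunked documents, five retrieved passages alone can account for thousands of tokens, which lines up with the ~6,000 I’m seeing.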

After chunking my data (200-token chunks), I was able to reduce prompt tokens to about 4,000.
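The chunking itself is straightforward; here is a minimal sketch of splitting documents into ~200-token pieces before indexing, using a rough 0.75-words-per-token heuristic (a tokenizer library would give exact counts):

```python
# Split a document into word-bounded chunks of at most max_words words.
# 150 words ≈ 200 tokens under the ~0.75 words/token heuristic (assumed).
def chunk_text(text: str, max_words: int = 150) -> list[str]:
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

doc = "lorem ipsum dolor " * 300   # a 900-word stand-in document
chunks = chunk_text(doc)
print(len(chunks), max(len(c.split()) for c in chunks))  # → 6 150
```

Smaller chunks mean each of the topNDocuments hits contributes ~200 tokens instead of a whole document, which is why chunking brought my prompt size down.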

Still, this seems extremely high, and I can’t pinpoint what I’m doing wrong.

Is this what you are also seeing?