Prompt tokens seem way over counted for embeddings completions

ewarren · May 8, 2023, 8:52pm

I followed the embeddings tutorial, which uses text-davinci-003 for completions and ada-002 for embeddings.

When I increase the “max_tokens” I get this error message:

This model’s maximum context length is 4097 tokens, however you requested 4605 tokens (1605 in your prompt; 3000 for the completion). Please reduce your prompt; or completion length.

When I put my prompt into OpenAI’s tokenizer it tells me my prompt is 15 tokens (66 characters). Is it counting something else in the prompt token count?

Thank you!

PaulBellow · May 8, 2023, 10:19pm

Try turning the max_tokens down … It takes it literally and won’t work unless max_tokens + prompt tokens fits within the limit…
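As a rough sketch of that constraint: the prompt tokens plus max_tokens must not exceed the model's context window (4097 for this model, going by the error message quoted above). The helper name `fit_max_tokens` and the constant `CONTEXT_LIMIT` are made up for illustration; they are not part of the openai library.

```python
CONTEXT_LIMIT = 4097  # taken from the error message for this model


def fit_max_tokens(prompt_tokens: int, requested: int, limit: int = CONTEXT_LIMIT) -> int:
    """Return the largest completion budget that still fits in the context window."""
    available = limit - prompt_tokens
    if available <= 0:
        raise ValueError("the prompt alone fills or exceeds the context window")
    # Never hand back more than was asked for, and never more than what is left.
    return min(requested, available)


# Numbers from the error message: 1605 prompt tokens, 3000 requested.
print(fit_max_tokens(1605, 3000))  # 2492
```

With the reported 1605 prompt tokens, anything above 2492 for max_tokens would trigger the same error.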

ewarren · May 8, 2023, 11:49pm

Yes, I can get it to work if I turn the max tokens down, I’m just perplexed at how it’s arriving at its prompt token value. It seems ~10x what it should be.

PaulBellow · May 9, 2023, 1:05am

It doesn’t do the math on its end, so you have to send it the right numbers.

What’s your prompt look like? Have you counted it in the playground or using another tokenizer?
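One way to count tokens locally is the tiktoken package (a separate install, `pip install tiktoken`). This is a minimal sketch, assuming the string you count is the full text sent to the completions endpoint rather than only the question. `count_tokens` and `davinci_encoder` are illustrative helper names, and the encoder is passed in so the counting function can also be exercised without tiktoken installed.

```python
from typing import Callable, Sequence


def count_tokens(text: str, encode: Callable[[str], Sequence[int]]) -> int:
    """Count tokens in `text` using the supplied encode function."""
    return len(encode(text))


def davinci_encoder() -> Callable[[str], Sequence[int]]:
    """Return tiktoken's encode function for text-davinci-003 (needs tiktoken)."""
    import tiktoken  # third-party; imported lazily so the module loads without it

    return tiktoken.encoding_for_model("text-davinci-003").encode
```

Usage would look like `count_tokens(full_prompt, davinci_encoder())`, where `full_prompt` is whatever string your code actually passes as the prompt.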

ETA: Ah, I see what you’re saying… Are you appending old messages to a chat endpoint maybe?

ewarren · May 9, 2023, 5:02pm

I don’t think I am… maybe the context used to answer the question goes toward the token count?

"""
Create a context for a question by finding the most similar context from the dataframe
"""

# Get the embeddings for the question
q_embeddings = openai.Embedding.create(input=question, engine='text-embedding-ada-002')['data'][0]['embedding']
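If the tutorial's pattern is in play, the text sent to the completions endpoint is not just the question: the retrieved context passages are pasted into the prompt, and they count as prompt tokens too. Below is a minimal sketch of that effect. The prompt template, `build_prompt`, `rough_size`, and the sample passages are invented for illustration and are not the tutorial's exact code, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
def build_prompt(context_passages, question):
    """Join retrieved passages and the question into one completion prompt."""
    context = "\n\n###\n\n".join(context_passages)
    return (
        "Answer the question based on the context below.\n\n"
        f"Context: {context}\n\n---\n\n"
        f"Question: {question}\nAnswer:"
    )


def rough_size(text):
    """Crude size proxy: whitespace-separated words, not real tokens."""
    return len(text.split())


question = "What is the newest embeddings model?"
# Stand-in for passages retrieved via embedding similarity (hypothetical data).
passages = ["lorem ipsum dolor sit amet " * 50] * 3

full_prompt = build_prompt(passages, question)
# The question is tiny; the prompt that is actually sent is far larger.
print(rough_size(question), rough_size(full_prompt))
```

Counting tokens on `full_prompt` instead of `question` is the comparison that would show whether the context accounts for the gap between 15 and 1605.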
