I followed the embeddings tutorial, which uses text-davinci-003 for completions and ada-002 for embeddings.
When I increase the “max_tokens” I get this error message:
This model’s maximum context length is 4097 tokens, however you requested 4605 tokens (1605 in your prompt; 3000 for the completion). Please reduce your prompt; or completion length.
When I put my prompt into OpenAI’s tokenizer it tells me my prompt is 15 tokens (66 characters). Is it counting something else in the prompt token count?
Try turning max_tokens down… The API takes it literally, and the request won’t work unless prompt tokens + max_tokens comes in at or under the model’s limit.
Yes, I can get it to work if I turn the max tokens down, I’m just perplexed at how it’s arriving at its prompt token value. It seems ~10x what it should be.
It doesn’t do the math on its end, so you have to send it the right numbers.
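To make that arithmetic concrete, here is a minimal sketch using the numbers from the error message above. The helper names (`fits`, `max_completion_tokens`) are made up for illustration and are not part of the OpenAI library; the only real values are the 4097 limit and the 1605/3000 split quoted in the error.

```python
# Context limit quoted in the error message for this model.
CONTEXT_LIMIT = 4097


def fits(prompt_tokens: int, max_tokens: int, context_limit: int = CONTEXT_LIMIT) -> bool:
    """Return True when prompt + requested completion stays within the context window."""
    return prompt_tokens + max_tokens <= context_limit


def max_completion_tokens(prompt_tokens: int, context_limit: int = CONTEXT_LIMIT) -> int:
    """Return the largest max_tokens that still fits alongside the prompt."""
    return max(context_limit - prompt_tokens, 0)


if __name__ == "__main__":
    # Numbers from the error: 1605 prompt tokens, 3000 requested for the completion.
    print(fits(1605, 3000))             # False, because 1605 + 3000 = 4605 > 4097
    print(max_completion_tokens(1605))  # 2492
```

So with a 1605-token prompt, any max_tokens up to 2492 should go through, and 3000 will not.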
What’s your prompt look like? Have you counted it in the playground or using another tokenizer?
ETA: Ah, I see what you’re saying… Are you appending old messages to a chat endpoint maybe?
I don’t think I am… maybe the context used to answer the question goes toward the token count?
# Create a context for a question by finding the most similar context from the dataframe
# Get the embeddings for the question
q_embeddings = openai.Embedding.create(input=question, engine='text-embedding-ada-002')['data'][0]['embedding']
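If the retrieved context is pasted into the prompt before it goes to the completion model, that would explain the gap between 15 and 1605 tokens. Below is a rough sketch of how to check, not the tutorial's actual code: `build_prompt` uses a hypothetical template, and `rough_token_count` is a crude stand-in based on the rule of thumb of about 4 characters per token, so treat the numbers as estimates only.

```python
from typing import Callable, Dict


def rough_token_count(text: str) -> int:
    """Crude estimate (assumption): about 4 characters per token for English text."""
    return max(1, len(text) // 4)


def build_prompt(context: str, question: str) -> str:
    """Hypothetical template: the retrieved context is sent along with the question."""
    return (
        "Answer the question based on the context below.\n\n"
        f"Context: {context}\n\n---\n\n"
        f"Question: {question}\nAnswer:"
    )


def prompt_token_breakdown(
    context: str,
    question: str,
    count: Callable[[str], int] = rough_token_count,
) -> Dict[str, int]:
    """Compare the size of the question alone with the full prompt that is actually sent."""
    return {
        "question": count(question),
        "context": count(context),
        "full_prompt": count(build_prompt(context, question)),
    }


if __name__ == "__main__":
    question = "q" * 66      # same length as the 66-character question in this thread
    context = "c" * 6000     # made-up size for the retrieved context
    print(prompt_token_breakdown(context, question))
```

For exact counts, `count` can be swapped for a real tokenizer; the point is only that the API counts the whole string you send, not just the question.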