I’m trying to send an array of texts to the OpenAI Embeddings API using the text-embedding-ada-002 model, which should have a token limit of 8191, but the API sometimes tells me I’ve gone over the limit even though, by my count, I haven’t.
I’m working in Ruby, so I’m using the tiktoken_ruby gem to count tokens before sending the batched request.
I even tried capping each batch well below the limit, at around 5,000 tokens by my count, but the API still tells me I sent 8,500+ tokens.
Am I correct in assuming that the token limit for a batched request applies to the cumulative total of all elements in the array, rather than to each element individually?
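For context, this is roughly how I pack the strings into batches under a token budget (a simplified sketch; inputs stands in for my full list of strings, and TikToken is the counting module shown further down):

TOKEN_BUDGET = 5_000  # well under the documented 8191 limit

batches = []
current = []
current_tokens = 0

inputs.each do |text|
  tokens = TikToken.count(text)
  # start a new batch if adding this text would exceed the budget
  if current_tokens + tokens > TOKEN_BUDGET && current.any?
    batches << current
    current = []
    current_tokens = 0
  end
  current << text
  current_tokens += tokens
end
batches << current if current.any?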
Here is the Ruby code I am using to make the request:
vector_response = openai_client.embeddings(
  parameters: {
    model: "text-embedding-ada-002",
    input: batched_inputs
  }
)
batched_inputs is an array of code snippets in string format.
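For example, a batch looks something like this (shortened):

batched_inputs = [
  "def add(a, b)\n  a + b\nend",
  "class User < ApplicationRecord\nend"
]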
And this is the response I get:
{"error"=>{"message"=>"This model's maximum context length is 8191 tokens, however you requested 8589 tokens (8589 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.", "type"=>"invalid_request_error", "param"=>nil, "code"=>nil}}
My guess is that the token counting in tiktoken_ruby is off? I am using the encoding for text-embedding-ada-002 to count, like so:
require "tiktoken_ruby"

module TikToken
  extend self

  DEFAULT_MODEL = "text-embedding-ada-002"

  # Number of entries in the token => text hash built below.
  def count(string, model: DEFAULT_MODEL)
    get_tokens(string, model: model).length
  end

  # Maps each token id to the text fragment it decodes to.
  def get_tokens(string, model: DEFAULT_MODEL)
    encoding = Tiktoken.encoding_for_model(model)
    tokens = encoding.encode(string)
    tokens.map do |token|
      [token, encoding.decode([token])]
    end.to_h
  end
end
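And this is how I total the tokens for a batch before sending it (sketch):

total_tokens = batched_inputs.sum { |text| TikToken.count(text) }
puts total_tokens  # by this count the batch stays under 5000, yet the API reports 8589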
Any help to point me in the right direction would be appreciated.