Consumption of text-embedding-ada-002-v2 (embeddings)

I’m using LangChain and the OpenAI API to build a virtual assistant that answers questions about a PDF, which I converted to Markdown to make it easier for the model to read. I split it into 512-token chunks using LangChain’s RecursiveCharacterTextSplitter. It’s a small, tightly formatted text of approximately 400 lines, and even so text-embedding-ada-002-v2 shows very high consumption: approximately 12,000 tokens per request. I’ve been trying to lower this consumption for some time, but I can’t find much information about it. It’s my first time working with AI, and I believe the consumption shouldn’t be this excessive.
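For anyone sanity-checking the numbers: with a 512-token chunk size and a 24-token overlap, embedding every chunk of the document once should cost only slightly more than the document’s own token count. A rough estimate (a sketch -- the real chunk count depends on where RecursiveCharacterTextSplitter finds split points):

```python
import math

def embedded_tokens(doc_tokens: int, chunk_size: int = 512, overlap: int = 24) -> int:
    """Approximate total tokens billed when embedding every chunk once.

    Each chunk after the first re-embeds `overlap` tokens, so the total
    is the document length plus one overlap per extra chunk.
    """
    if doc_tokens <= chunk_size:
        return doc_tokens
    step = chunk_size - overlap
    n_chunks = math.ceil((doc_tokens - overlap) / step)
    return doc_tokens + (n_chunks - 1) * overlap

# Two 512-token chunks sharing 24 tokens: 1000 + 24 = 1024 billed tokens.
print(embedded_tokens(1000))
```

A ~400-line Markdown file is usually only a few thousand tokens, so overlap alone cannot explain 12,000 tokens *per request* -- that figure is more consistent with the whole document being re-embedded on every request.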

I am tokenizing the text with GPT2TokenizerFast and splitting it with RecursiveCharacterTextSplitter:

from transformers import GPT2TokenizerFast
from langchain.text_splitter import RecursiveCharacterTextSplitter

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2", clean_up_tokenization_spaces=True)

def tokens(text: str) -> int:
    return len(tokenizer.encode(text))

splitter = RecursiveCharacterTextSplitter(
    chunk_size=512,
    chunk_overlap=24,
    length_function=tokens,
)
Then I embed the chunks with OpenAIEmbeddings and index the vectors in FAISS so they can be searched quickly, and run a similarity search to find relevant context for the answer.

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import FAISS

embeddings = OpenAIEmbeddings()
db = FAISS.from_documents(chunks, embeddings)
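If the FAISS index is being rebuilt on every request, each rebuild re-embeds (and re-bills) every chunk. Two cheap mitigations are to build the index once and reuse it, and to cache embeddings by content hash so identical chunks are never billed twice. A minimal in-memory cache sketch, where `embed_fn` is a stand-in for a real embeddings call (an assumption here, not an API from the original post):

```python
import hashlib

class CachedEmbedder:
    """Memoizes an embedding function by the SHA-256 of the chunk text,
    so repeated or duplicated chunks trigger only one billed call."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn  # hypothetical: e.g. a real embeddings call
        self._cache = {}
        self.api_calls = 0  # how many chunks were actually sent out

    def embed(self, text: str):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.api_calls += 1
            self._cache[key] = self._embed_fn(text)
        return self._cache[key]
```

If memory serves, LangChain also ships a CacheBackedEmbeddings wrapper along these lines, and its FAISS wrapper can persist the built index to disk so the document is only embedded once across runs.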

Is it possible to reduce this expense?

Sounds like a LangChain problem to me.

At first I thought the problem was the API spending an excessive number of tokens. But after researching a little more, I found that there appears to be a bug affecting the token limit for embeddings, which should only apply to free accounts.

Even so, I was able to study the problem and work out ways to reduce the number of tokens generated.
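One straightforward reduction worth trying (a sketch, not necessarily what the author did): drop exact-duplicate chunks before embedding. PDF-to-Markdown conversions often repeat headers and footers on every page, and every duplicate chunk is billed at full price.

```python
def dedupe_chunks(chunks):
    """Remove exact-duplicate chunk texts while preserving order.

    PDF-to-Markdown output often repeats page headers/footers; each
    duplicate chunk would otherwise be embedded (and billed) again.
    """
    seen = set()
    unique = []
    for text in chunks:
        if text not in seen:
            seen.add(text)
            unique.append(text)
    return unique

print(dedupe_chunks(["a", "b", "a"]))  # ['a', 'b']
```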