Best Practice to save money on Calling Assistant API

LangChain would be a good way to go about this. If the files keep changing and you might have to create embeddings frequently, LangChain is a good solution.

However, if the files are static, you could use a vector database like Pinecone to store the embeddings long term.

For similarity search once you have the embeddings, cosine similarity is the way to go.
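
As a rough illustration of the cosine similarity step, here is a minimal sketch in plain Python. The function names `cosine_similarity` and `top_k` and the tiny 3-dimensional vectors are made up for the example; real embeddings would come from your embedding model and have many more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=3):
    """Rank a {name: vector} dict by similarity to the query, highest first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings, only to show the call pattern.
docs = {
    "pricing.txt": [0.9, 0.1, 0.0],
    "setup.txt": [0.1, 0.9, 0.2],
    "faq.txt": [0.6, 0.4, 0.1],
}
query_embedding = [1.0, 0.0, 0.0]
best_matches = top_k(query_embedding, docs, k=2)
```

The idea is to send only the best-matching chunks to the API rather than every file, which is where the savings come from. If you store the vectors in a vector database, it can usually run this ranking for you.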