Welcome to the community, @brandojazz!
I agree with @anon22939549.
Fine-tuning is recommended to save on prompt tokens for a high volume of calls for specific use case(s) or to set model behavior.
For knowledge augmentation and retrieval, embeddings are the go-to approach.
Here’s the documentation on embeddings.
Currently, GPT models have a finite context length, which limits the number of tokens (prompt + completion) they can handle.
To overcome this, you can pass an outline of your repository to the model and give the model access to “see” the code, similar to what Advanced Data Analysis does on ChatGPT or what open-interpreter does locally.
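As a rough illustration of the "outline" idea, here is a minimal sketch that builds an indented text listing of a repository which could be pasted into a prompt. The function name `repo_outline` and the `skip_dirs` defaults are my own assumptions for illustration, not part of any OpenAI tooling.

```python
import os


def repo_outline(root, skip_dirs=(".git", "__pycache__", "node_modules")):
    """Return an indented text outline of the directories and files under root.

    Hypothetical helper: the name and the skip list are illustrative only.
    """
    root = os.path.abspath(root)
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into skipped directories.
        dirnames[:] = sorted(d for d in dirnames if d not in skip_dirs)
        rel = os.path.relpath(dirpath, root)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        lines.append("  " * depth + os.path.basename(dirpath) + "/")
        for filename in sorted(filenames):
            lines.append("  " * (depth + 1) + filename)
    return "\n".join(lines)


if __name__ == "__main__":
    # Print an outline of the current directory.
    print(repo_outline("."))
```

The outline is usually far smaller than the code itself, so it leaves room in the context window for the model to ask for specific files.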
Once the model selects the file(s) to be used with RAG, you can obtain the embeddings and look for semantically similar chunks to be passed as context.
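To make the retrieval step concrete, here is a minimal sketch of chunking a file and ranking chunks by cosine similarity to a query. Note that `toy_embed` is a hypothetical stand-in (a simple bag-of-words count) so the example runs offline; in practice you would swap in vectors from a real embeddings model. `chunk_text` and `top_chunks` are likewise illustrative names, not a library API.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embeddings call: bag-of-words counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def chunk_text(text, max_lines=20):
    """Split text into consecutive chunks of at most max_lines lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + max_lines]) for i in range(0, len(lines), max_lines)]


def top_chunks(query, chunks, k=3, embed=toy_embed):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine_similarity(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:k]]


if __name__ == "__main__":
    chunks = [
        "def load_config(path): return json.load(open(path))",
        "def train_model(data): pass",
        "class Tokenizer: pass",
    ]
    for chunk in top_chunks("how is the config loaded from a path", chunks, k=1):
        print(chunk)
```

The selected chunks would then be concatenated into the prompt as context, keeping the total under the model's token limit.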