Hi Everyone,
I’m an AI enthusiast and developer, still learning by building. I have built a few RAG pipelines using the Assistants API, plus some sample ones with LangChain and Langflow.
I have become too comfortable with the Assistants API because of how easy it is to integrate, but I need more control over my database and the rest of the pipeline. So I have started building a more advanced RAG pipeline using LangChain, the OpenAI API, and Chroma DB. I also want to implement function calling, file search, and vision. I’m mostly using the GPT-4o and o1 models.
Separately, I’m building a prototype for a web scraping agent: I provide a page’s HTML content and a full-page screenshot, and the agent analyzes which components are present on that page. I have already built a POC that works for a single URL or a batch of URLs. It currently uses the Assistants API, but I’m now integrating it with LangChain. This will be a separate agent using the GPT-4o model. The problem right now is that it’s difficult to feed a large amount of extracted HTML to the LLM. That’s why I’m splitting it into chunks and generating a report, which is working fine. But is there another way I could implement this?
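To make the chunking step concrete, here is a minimal sketch of one way it could look. It is illustrative only, not the actual POC code. It uses only Python's standard library. The names `Compactor`, `chunk_html`, `SKIP_TAGS`, `VOID_TAGS`, and `KEEP_ATTRS` are hypothetical, and the choices of which tags to drop, which attributes to keep, and the 6000-character budget (a rough stand-in for a token limit) are assumptions.

```python
from html.parser import HTMLParser

# Tags whose contents rarely help describe page components (assumption).
SKIP_TAGS = {"script", "style", "noscript", "svg"}
# Void elements have no closing tag, so none is emitted for them.
VOID_TAGS = {"img", "input", "br", "hr", "meta", "link"}
# Attributes kept because they hint at a component's role (assumption).
KEEP_ATTRS = {"id", "class", "role", "href", "type", "alt"}


class Compactor(HTMLParser):
    """Collects a stripped-down version of the page as a list of small pieces."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []
        self.skip_depth = 0  # > 0 while inside a skipped tag

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1
            return
        if self.skip_depth:
            return
        kept = "".join(f' {k}="{v}"' for k, v in attrs if k in KEEP_ATTRS and v)
        self.pieces.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag not in VOID_TAGS:
            self.pieces.append(f"</{tag}>")

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text and not self.skip_depth:
            self.pieces.append(text)


def chunk_html(html, max_chars=6000):
    """Compact the HTML, then pack pieces into chunks of at most max_chars."""
    parser = Compactor()
    parser.feed(html)
    parser.close()
    chunks, current = [], ""
    for piece in parser.pieces:
        # Hard-split any single piece that is longer than the budget.
        while len(piece) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(piece[:max_chars])
            piece = piece[max_chars:]
        candidate = f"{current}\n{piece}" if current else piece
        if len(candidate) > max_chars:
            chunks.append(current)  # current is non-empty here
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be sent to the model together with the screenshot, and the per-chunk findings merged into the final report. Dropping scripts, styles, and most attributes before chunking usually shrinks the input a lot, though how much depends on the page.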
I’m looking for suggestions, reference resources, or codebases I can refer to for building a more advanced RAG pipeline with LangChain.
Your help will be really appreciated.