Chatbot creation on private data

I want to build a chatbot that answers questions based on the private data I feed it and returns the relevant documents when searched. Can anyone suggest open-source LLMs that can be integrated into private-data applications? I have checked out LangChain, Llama 2, LlamaIndex, and privateGPT.

You would need two types of AI, each either under your own control or covered by a data security policy you find acceptable. (OpenAI can accommodate several such requirements.)

  • the language AI that you converse with, which is given the supplementary data with which it can answer
  • the AI embeddings engine that can enable semantic search on a database of your private documents
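As a rough illustration of how the two pieces fit together, here is a minimal Python sketch of the search step and of handing its results to the language AI. The `embed` function is only a toy stand-in (word counts) for a real embeddings engine, and every name here (`embed`, `cosine`, `search`, `build_prompt`, `DOCS`) is made up for the example; a real setup would swap in an actual embedding model and a vector database.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embeddings engine: a bag-of-words count vector.
    A real engine returns a dense vector that captures meaning, not just words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query, documents, top_k=2):
    """Return up to top_k documents most similar to the query, best first.
    Documents with no overlap at all (score 0) are left out."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]

def build_prompt(question, documents, top_k=2):
    """Assemble the text sent to the language AI: retrieved context, then the question."""
    context = "\n".join(f"- {doc}" for doc in search(question, documents, top_k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Hypothetical private documents, invented for this example.
DOCS = [
    "Invoices are archived in the finance share after 90 days.",
    "The VPN requires a hardware token for remote access.",
    "Vacation requests must be approved by a manager.",
]

if __name__ == "__main__":
    print(build_prompt("How do I get remote access to the VPN?", DOCS))
```

The list returned by `search` is also what you would show the user as the "relevant documents"; the prompt from `build_prompt` is what the language AI sees, so it never needs access to the whole document store.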

The first of these, the language AI, requires a significant outlay on GPU server hardware if you want performance anywhere near that of OpenAI models, since that means running one of the large, capable alternative models yourself.
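To get a feel for the hardware involved, a back-of-the-envelope estimate is parameter count times bytes per parameter. The sketch below assumes the published Llama 2 sizes (7B, 13B, 70B parameters) and counts only the model weights; the function name `weight_memory_gb` is made up for the example, and real usage is higher once the context cache and runtime overhead are added.

```python
def weight_memory_gb(params_billions, bytes_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).
    Billions of parameters times bytes per parameter gives GB directly."""
    return params_billions * bytes_per_param

if __name__ == "__main__":
    # 2 bytes per parameter is 16-bit precision; 0.5 bytes is 4-bit quantization.
    for size in (7, 13, 70):
        fp16 = weight_memory_gb(size, 2)
        int4 = weight_memory_gb(size, 0.5)
        print(f"{size}B parameters: ~{fp16} GB at 16-bit, ~{int4} GB at 4-bit")
```

By this estimate the largest model needs well over 100 GB of GPU memory at 16-bit precision, which is multi-GPU server territory, while the smallest fits on a single consumer card once quantized.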