Just to make sure I understand your example in 4):
- you built an embedding DB from your "domain-specific knowledge"
- you take the user input, embed it, and run it through the vector DB
- you collect the results of that query and use them as context for the prompt
- you send the prompt, enriched with the vector DB results, to the LLM
Is this what you're doing? (I sketched my understanding in code below.)
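Here's a minimal sketch of the flow as I understand it. I'm assuming sentence-transformers for the embeddings and plain cosine similarity in place of a real vector DB; the documents, query, and model name are made up for illustration, and the final LLM call is left abstract.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Step 1: embed the domain-specific documents once, offline.
docs = [
    "Our API rate limit is 100 requests per minute per key.",
    "Refunds are processed within 5 business days.",
]
doc_vecs = model.encode(docs, normalize_embeddings=True)

# Step 2: embed the user input with the same model.
query = "How fast are refunds handled?"
query_vec = model.encode([query], normalize_embeddings=True)[0]

# Step 3: nearest-neighbour search; with normalized vectors,
# cosine similarity reduces to a dot product.
scores = doc_vecs @ query_vec
top_k = np.argsort(scores)[::-1][:2]
context = "\n".join(docs[i] for i in top_k)

# Step 4: enrich the prompt with the retrieved context and
# send it to the LLM (client call omitted here).
prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)  # pass `prompt` to your LLM client of choice
```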
Additional question: is this "persistent", i.e. the more queries you run, the better the LLM performs on your queries?
Thanks in advance