The most straightforward way is to use vector stores directly, without even having a search tool: retrieve based on the context of what is being asked (plus some query rewriting) and inject the results into the input context, RAG-style.
A new guide just for you:
Of course, using embeddings yourself is better, but it is also a higher hurdle than a service offered by the same provider as your language model.
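
To give a rough idea of what "doing it yourself" involves, here is a minimal sketch of retrieval plus context injection. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and `CHUNKS`, `retrieve`, and `build_prompt` are hypothetical names, not part of any provider's API. In practice you would swap `embed` for calls to an actual embedding model and store the vectors rather than recomputing them.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank the stored chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    # Inject the retrieved chunks into the input context ahead of the question.
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical knowledge base, already split into chunks.
CHUNKS = [
    "Refunds are processed within 14 days of the return request.",
    "Our office is closed on public holidays.",
    "Shipping to Europe takes 3 to 5 business days.",
]

if __name__ == "__main__":
    print(build_prompt("How long do refunds take?", CHUNKS, k=1))
```

The resulting string is what you would send as the model input; the model never calls a search tool, it just sees the retrieved text alongside the question.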