Hello,
I am planning the architecture for a semantic search feature using an LLM over my custom data through RAG. I need some help choosing the stack that would work best for this use case, also keeping costs in mind. I am a bit lost seeing so many options: models like Cohere and Azure apart from OpenAI, or vector stores like Pinecone or MongoDB for the embeddings.
I will try my best to explain my use case below:
I have a fixed set of SKUs and a set of keywords around these SKUs that highlight their usage / the industries they are suited for.
A typical flow would be:
The user searches on my website for what they are looking for (perhaps with keywords that signal the industry or usage).
The LLM suggests which SKUs may suit that industry or intended application and generates multiple options from the available SKUs.
I would want the above options generated in JSON format so that I can call external APIs to check availability and pricing of those options.
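Roughly the shape I am picturing for that JSON step (just a sketch, assuming OpenAI's Python SDK with JSON mode; the model choice, SKU fields, and prompt wording are placeholders):

```python
# Sketch of the JSON-output step, assuming the OpenAI Python SDK (openai>=1.0) and JSON mode.
# The field names and prompt are placeholders for my actual SKU data.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def suggest_skus(user_query: str, retrieved_context: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # forces a valid JSON reply
        messages=[
            {
                "role": "system",
                "content": (
                    "You match search queries to SKUs. Use only the SKUs in the context. "
                    'Reply as JSON: {"options": [{"sku_id": str, "reason": str}]}'
                ),
            },
            {
                "role": "user",
                "content": f"Context:\n{retrieved_context}\n\nSearch query: {user_query}",
            },
        ],
    )
    return response.choices[0].message.content  # JSON string to parse

# I would then parse this JSON and pass the sku_ids to my availability and pricing APIs.
```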
I would also like to ensure that the LLM gets the regional context of certain words based on the location of the user (this won't be multilingual, but certain English words may phonetically sound similar to a word in a regional language). If I pass this as a data point in RAG, would it be able to handle it?
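What I am picturing for that regional piece is something like prepending a small per-region glossary to whatever the retrieval step returns (just a sketch; the region codes and glossary entries are placeholders I would fill from my own data):

```python
# Sketch of passing regional context alongside the retrieved SKU data.
# Region codes and glossary entries are invented placeholders.
REGIONAL_GLOSSARY = {
    "region-A": {"<local term>": "<what it actually means in my catalogue>"},
    "region-B": {"<local term>": "<what it actually means in my catalogue>"},
}

def build_context(retrieved_chunks: list[str], user_region: str) -> str:
    glossary = REGIONAL_GLOSSARY.get(user_region, {})
    notes = "\n".join(
        f"- '{term}' in this region usually means '{meaning}'"
        for term, meaning in glossary.items()
    )
    header = f"Regional notes for {user_region}:\n{notes}\n\n" if notes else ""
    return header + "\n".join(retrieved_chunks)
```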
Thank you for getting back to me.
It's not a keyword search exactly. Those keywords are just there to help ensure that the specific SKU is captured when something similar is searched.
Your question does make me wonder, though, whether adding those keywords is necessary in the first place if the LLM can do the semantic search on its own.
For example, a search term that contains the keyword "handmade" may indicate a boutique.
In other words, I am trying to infer the type of industry or business from a search.
I have data on where a particular product would be more suitable (i.e. the keywords).
Hence the need for an AI / LLM based approach here. Now I wonder whether the LLM needs this data in the first place; maybe that is something I will have to figure out by trial and error.
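To make the "handmade → boutique" idea concrete, the trial I am picturing is roughly: embed the SKU keyword descriptions once, embed the incoming search, and rank by cosine similarity (a sketch assuming the OpenAI embeddings endpoint; the SKU texts are placeholders from my keyword data):

```python
# Sketch of embedding-based matching, assuming the OpenAI embeddings API.
# SKU descriptions below are placeholder examples of my keyword data.
import numpy as np
from openai import OpenAI

client = OpenAI()

SKU_TEXTS = {
    "SKU-001": "gift wrap, handmade goods, boutique retail, craft stores",
    "SKU-002": "bulk packaging, warehouse, industrial shipping",
}

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

sku_ids = list(SKU_TEXTS)
sku_vecs = embed(list(SKU_TEXTS.values()))  # computed once, refreshed whenever the data changes

def rank_skus(query: str) -> list[tuple[str, float]]:
    q = embed([query])[0]
    sims = sku_vecs @ q / (np.linalg.norm(sku_vecs, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)
    return [(sku_ids[i], float(sims[i])) for i in order]

# e.g. rank_skus("handmade soaps for my small shop") should surface SKU-001 first
```

If that already ranks well on its own, maybe the keyword data is only needed as the SKU descriptions themselves rather than as a separate layer.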
Another thing: this data will not require constant updating. It may be changed once a month initially, and even less frequently later, mainly to optimize the output.
What I have done so far is upload that data as a PDF to my custom GPT, and the output I received was very good.