Understand user question and intent to query data

torman119 · May 11, 2023, 1:39pm

Hi, we want to let users ask questions about the large volume of data in our backend, for instance customers, orders, products, etc. Our data is sensitive and cannot leave the company.

Users want to ask questions like
How many customers ordered product “XYZ” in the last month?

Ideally we would get back from OpenAI something like
Intention: COUNTING
Entities: CUSTOMER_ORDERS
Filter: PRODUCT=XYZ, TIMEFRAME=lastmonth

We would use the result to query the backend and show it to the user.
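As a minimal sketch of that last step, assume the model is prompted to reply in exactly the three-line format above. This is the desired format, not something OpenAI returns by default. The reply could then be parsed into a structure that backend code can act on. The names `ParsedQuestion` and `parse_model_reply` are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ParsedQuestion:
    # Hypothetical container for the three fields in the desired reply format.
    intention: str
    entities: List[str]
    filters: Dict[str, str] = field(default_factory=dict)


def parse_model_reply(reply: str) -> ParsedQuestion:
    """Parse a reply of the form 'Intention: ... / Entities: ... / Filter: ...'."""
    fields: Dict[str, str] = {}
    for line in reply.splitlines():
        if ":" not in line:
            continue  # skip blank lines or stray text around the answer
        key, value = line.split(":", 1)
        fields[key.strip().lower()] = value.strip()

    if "intention" not in fields or "entities" not in fields:
        raise ValueError("reply is missing the Intention or Entities line")

    # Filters arrive as comma-separated NAME=value pairs.
    filters: Dict[str, str] = {}
    for pair in fields.get("filter", "").split(","):
        if "=" in pair:
            name, value = pair.split("=", 1)
            filters[name.strip()] = value.strip()

    entities = [e.strip() for e in fields["entities"].split(",") if e.strip()]
    return ParsedQuestion(fields["intention"], entities, filters)


reply = """Intention: COUNTING
Entities: CUSTOMER_ORDERS
Filter: PRODUCT=XYZ, TIMEFRAME=lastmonth"""

parsed = parse_model_reply(reply)
print(parsed)
```

Raising on a missing line, instead of guessing, keeps a malformed model reply from turning into a wrong backend query.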

Question: How can we instruct OpenAI to “understand” such questions and know how our data is structured?

P.S.: We cannot simply convert this into a SQL query because of how the data is stored.

bill.french · May 11, 2023, 1:55pm

I would investigate the embeddings path.

Imagine vectorizing every element of your data.
Vectorize each user’s query as it occurs.
Perform a semantic similarity match and aggregate the most closely related data.
Build a prompt that includes the highly relevant data as a learner shot (in-prompt context, in the style of a few-shot example) and use GPT completions to generate the answer for the user.
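The four steps above can be sketched roughly as follows. This is only an illustration under loud assumptions. `embed` here is a toy bag-of-words stand-in for a real embedding model, chosen so the example runs locally without any API. The sample records and the helpers `top_k` and `build_prompt` are hypothetical. In practice you would swap `embed` for whatever embedding model your data policy allows and send the finished prompt to a completion model.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # ASSUMPTION: toy stand-in for an embedding model (word-count vector).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, records: List[str], k: int = 2) -> List[str]:
    # Vectorize the query and rank every record by similarity to it.
    q = embed(query)
    ranked = sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: List[str]) -> str:
    # Put the retrieved records in front of the question as context.
    lines = ["Answer using only the records below.", "", "Records:"]
    lines += [f"- {c}" for c in context]
    lines += ["", f"Question: {query}"]
    return "\n".join(lines)


# Hypothetical sample data, for illustration only.
records = [
    "Order 1001: customer Alice ordered product XYZ on 2023-04-20",
    "Order 1002: customer Bob ordered product ABC on 2023-04-22",
    "Product XYZ: wireless keyboard, category accessories",
    "Customer Carol: account created 2022-11-02, no orders",
]
query = "How many customers ordered product XYZ in the last month?"

hits = top_k(query, records, k=2)
prompt = build_prompt(query, hits)
print(prompt)
```

In a real system the record vectors would be computed once and stored in a vector index instead of being re-embedded on every query as this sketch does.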

This is a very pixelated view of the approach. A lot of software engineering goes along with this brief outline. However, it works, and it is cost-effective at scale. This approach also creates opportunities for more innovation and defenses against hallucinations.
