Convert natural language to an SQL query using an LLM (GPT-3.5-Turbo)

When converting natural language to an SQL query, I need to determine whether the user's question itself contains an SQL query. How can I detect whether a piece of text contains SQL?
P.S.: I tried using regex, but did not get satisfactory results.


Could you explain your use case? If the conversion is natural language to SQL, why would there be SQL in the natural-language input?

Consider doing zero-shot classification with another API call (e.g. with GPT-3.5).

For example:

“Your task is to determine if the given text contains a full SQL query or parts of it. You will reply “1” if yes, and “0” otherwise”

You can also constrain the output to a simple binary 1/0 by setting max_tokens to 1 and setting logit_bias to 100 for the token IDs of both 0 and 1.


When I send an SQL query to the LLM, it should not accept it. The system should only take user questions and convert those questions into SQL. However, if I provide an SQL query as a question, my model is currently still executing it.

Suppose the user sends the LLM a query such as “select customer_name from customer_table”. I must block this input.

Which prompt and model params are you using?

Yes, you could do something like this.

from openai import OpenAI
client = OpenAI()

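def check_input(text):
    # Returns 1 if the text contains SQL (full or partial), else 0.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Your task is to determine if the given text contains a full SQL query or even part of it. If so, return `1`. Else, return `0`"},
            {"role": "user", "content": text},
        ],
        max_tokens=1,  # a single token is enough for `0` or `1`
        temperature=0,
        logit_bias={"15": 100,   # token ID for `0`
                    "16": 100},  # token ID for `1`
    )
    return int(response.choices[0].message.content)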
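To actually block such input, the classifier result has to gate the conversion step. Below is a minimal sketch of that wiring, not a definitive implementation. The names parse_binary_reply and guarded_convert are hypothetical, and the classifier and converter are passed in as plain callables so the gating logic can be tried without any API call. It also assumes you want to fail closed: any reply other than a clean "0" is treated as "contains SQL", which avoids the ValueError that int() would raise on an unexpected reply.

```python
from typing import Callable, Optional


def parse_binary_reply(content: Optional[str]) -> bool:
    """Return True if the classifier reply means "contains SQL".

    Assumption: fail closed. Only a clean "0" counts as SQL-free;
    an empty, missing, or unexpected reply is treated as SQL.
    """
    return (content or "").strip() != "0"


def guarded_convert(
    question: str,
    contains_sql: Callable[[str], bool],
    to_sql: Callable[[str], str],
) -> Optional[str]:
    """Run the converter only when the classifier reports no SQL.

    Returns None when the input is blocked, so the caller can decide
    what message to show the user.
    """
    if contains_sql(question):
        return None
    return to_sql(question)


if __name__ == "__main__":
    # Stub classifier and converter, for illustration only.
    def stub_classifier(text: str) -> bool:
        return text.lower().startswith("select")

    def stub_converter(text: str) -> str:
        return "SELECT customer_name FROM customer_table"

    blocked = guarded_convert(
        "select customer_name from customer_table",
        stub_classifier,
        stub_converter,
    )
    print(blocked)
```

In a real pipeline, contains_sql would wrap the classification call shown above and to_sql would wrap your existing natural-language-to-SQL prompt.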